From binjiang at alcatel-lucent.com Wed Oct 10 00:58:28 2007
From: binjiang at alcatel-lucent.com (Jiang, Bin (Bin))
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] What's the difference between XSD/e and XSD, when I using XSD with expat as the underlying parser
Message-ID:

Hi Boris & all,

What's the difference between XSD/e and XSD if I configure XSD with Expat
as the underlying parser?
More specifically, are they the same in schema validation capability,
performance (memory/CPU usage), etc.?

Thank you!

Thanks,
Jiang Bin (Bin)
GLMS Developer
Alcatel-Lucent

From boris at codesynthesis.com Wed Oct 10 02:19:54 2007
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] What's the difference between XSD/e and XSD, when I using XSD with expat as the underlying parser
In-Reply-To:
References:
Message-ID: <20071010061954.GA28418@karelia>

Hi,

Jiang, Bin (Bin) writes:

> What's the difference between XSD/e and XSD, if I config XSD with expat
> as the underlying parser?
> More specifically, are they same in schema validation capability,
> performance (memory/CPU usage), etc?

The main difference is the ability of XSD/e to work without many C++
features, such as exceptions, STL, RTTI, iostream, and templates. As
a result, XSD/e-generated code is smaller and can be compiled with
older, legacy compilers, especially if some or all of the above C++
features are disabled.

On the other hand, XSD provides some extra features, such as different
underlying parsers and a configurable character type (char or wchar_t).

XSD/e and XSD are the same in schema validation capabilities and should
have roughly equivalent performance when configured similarly (that is,
XSD/e is configured with exceptions, STL, etc.).

To put this in more general terms, in XSD/e, portability, low footprint,
and performance are prioritized, while in XSD, convenience, ease of use,
and performance are prioritized.
Boris

From binjiang at alcatel-lucent.com Sun Oct 21 22:21:35 2007
From: binjiang at alcatel-lucent.com (Jiang, Bin (Bin))
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] How to config xsde to parse without namespace check / How to gain a better error message
Message-ID:

Hi Boris & all,

Sometimes I need to parse an XML fragment without checking the namespace.
Is there any convenient way to do this?
I've used XSD 2.1.1, 2.3.0, and now xsde 1.1.0. My way is to modify the
xsd library files and the generated parser files.

Taking xsde 1.1.0 as an example, I modify

  xsde/cxx/parser/elements.hxx

to add a flag in class parser_base that controls whether namespace
checking is done during parsing:

  struct parser_base
  {
    // ....
    bool getNamespaceCheck() { return bNamespaceCheck; }
    void setNamespaceCheck(bool b) { bNamespaceCheck = b; }
  private:
    bool bNamespaceCheck;
    // ....
  };

And then I modify the generated parser files as below:

Before:
  if (n == "display-name" && ns == "urn:oma:xml:poc:list-service")
After:
  if (n == "display-name" &&
      ( !getNamespaceCheck() || ns == "urn:oma:xml:poc:list-service" ) )

By doing this, I can configure whether namespace checking is done when
constructing the parsers.

But is there a better way to do this? I can't modify the schema
definitions, because I still need namespace checking when parsing a
WHOLE XML document. Only with XML fragments do I not want namespace
checking.

Another question: when there is an exception, such as an unexpected
element or attribute, how do I get the namespace and name of the
encountered element or attribute? I can only find line/column and text().

Thank you!
Thanks,
Jiang Bin (Bin)
GLMS Developer
Alcatel-Lucent

From boris at codesynthesis.com Mon Oct 22 12:04:31 2007
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] How to config xsde to parse without namespace check / How to gain a better error message
In-Reply-To:
References:
Message-ID: <20071022160431.GB8418@karelia>

Hi Bin,

Jiang, Bin (Bin) writes:

> Sometimes I need parsing a xml fragment without checking the namespace,
> it there any convenient way to do this?
> I've used XSD 2.1.1 2.3.0, and now xsde 1.1.0, my way is modifiy the xsd
> library files and the generated parser files.

I think there is a more elegant way to accomplish this. The idea is
to override the _start_element (and _start_attribute if necessary)
low-level hook on the root parser and call the original version with
a proper namespace when necessary:

  virtual void
  _start_element (const xml_schema::ro_string& ns,
                  const xml_schema::ro_string& name)
  {
    if (need_to_add_namespace)
    {
      // This local ns shadows the parameter and supplies the
      // namespace that the fragment is missing.
      xml_schema::ro_string ns ("urn:oma:xml:poc:list-service");
      base::_start_element (ns, name);
    }
    else
    {
      base::_start_element (ns, name);
    }
  }

This will work because all events go through the root element parser.
This can get a bit more complicated if your vocabulary mixes qualified
and unqualified elements. But normally either all elements are
qualified or only the root element is qualified, and both of these
cases are easy to handle with this method.

> Another question is when there is an exception, like unexpected element
> or attribute, how to get the namespace and name of the encountered
> element or attribute, I only find there is line/column and text().

The current error propagation architecture makes it costly to pass
this information up to the caller. Because of that we decided not
to provide it.
However, we are planning to change the inner workings
of C++/Parser which will also make it fairly cheap to pass extra
error information around and we will add support for names and
namespaces then. Unfortunately, this is not planned for XSD/e 2.0.0
(due in a couple of weeks) and will be implemented in XSD/e 2.1.0
which is scheduled for the end of 2007 - beginning of 2008. How
urgent is this feature for your project?

Boris

From boris at codesynthesis.com Wed Oct 24 12:45:09 2007
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] How to config xsde to parse without namespace check / How to gain a better error message
In-Reply-To:
References: <20071022160431.GB8418@karelia>
Message-ID: <20071024164509.GC861@karelia>

Hi Bin,

Jiang, Bin (Bin) writes:

> [Jiang, Bin (Bin)] This may not work if there is more than one
> namespace definitions in a instance document, right?

It can still work but it gets harder. You will need to know which
element is in which namespace (e.g., have a map of element names
to namespaces). Also, since there are several namespaces involved,
all but one must have some prefix assigned to them. You can configure
Expat to ignore namespaces in which case it will pass names with
namespace prefixes (e.g., "nsp:name"). You can use the prefix to
figure out the corresponding namespace.

> [Jiang, Bin (Bin)] It's not so urgent now, since there is
> line/column information and we can always locate the error
> position. But if I can provide name or namespace, the error
> message would be more user-friendly and complete. Is there
> any work-around I can use?

I can't think of an easy way to make it work. The schema validation
code is generated by the compiler so you will have to modify that.
I guess the easiest way is to use XSD in the meantime (which uses
exceptions to propagate errors and includes name/namespace
information).
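
The prefix-based lookup described above can be sketched roughly as
follows. This is only an illustration under stated assumptions:
split_qname, resolve_namespace, prefix_map, usage_example, and the "ls"
prefix are hypothetical names that come from neither XSD/e nor Expat.
Only the namespace URI is taken from this thread.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (not an XSD/e or Expat API): split a prefixed
// name such as "nsp:name" into its prefix and local part. A name
// without a colon yields an empty prefix.
std::pair<std::string, std::string>
split_qname (const std::string& qname)
{
  const std::string::size_type pos = qname.find (':');

  if (pos == std::string::npos)
    return std::make_pair (std::string (), qname);

  return std::make_pair (qname.substr (0, pos), qname.substr (pos + 1));
}

// Hypothetical table that maps a prefix to its namespace URI.
typedef std::map<std::string, std::string> prefix_map;

// Return the namespace for a prefixed name, or an empty string if the
// prefix is not in the table.
std::string
resolve_namespace (const prefix_map& m, const std::string& qname)
{
  const prefix_map::const_iterator i = m.find (split_qname (qname).first);
  return i != m.end () ? i->second : std::string ();
}

// Small usage illustration. The "ls" prefix is an assumption.
void
usage_example ()
{
  prefix_map m;
  m["ls"] = "urn:oma:xml:poc:list-service";

  assert (split_qname ("ls:display-name").second == "display-name");
  assert (resolve_namespace (m, "ls:display-name") ==
          "urn:oma:xml:poc:list-service");
  assert (resolve_namespace (m, "other:name").empty ());
}
```

In an overridden _start_element, the resolved namespace and the local
part would presumably be what gets passed on to the base implementation,
in the same way as in the earlier single-namespace suggestion.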
Boris

From binjiang at alcatel-lucent.com Wed Oct 24 12:40:50 2007
From: binjiang at alcatel-lucent.com (Jiang, Bin (Bin))
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] How to config xsde to parse without namespace check / How to gain a better error message
In-Reply-To: <20071022160431.GB8418@karelia>
References: <20071022160431.GB8418@karelia>
Message-ID:

Hi Boris,

Really appreciate your timely and detailed response, thank you!
Please see my more questions below.

Thanks,
Jiang Bin (Bin)
GLMS Developer
Alcatel-Lucent

> -----Original Message-----
> From: Boris Kolpackov [mailto:boris@codesynthesis.com]
> Sent: 2007-10-23 0:05
> To: Jiang, Bin (Bin)
> Cc: xsde-users@codesynthesis.com
> Subject: Re: [xsde-users] How to config xsde to parse without namespace
> check / How to gain a better error message
>
> Hi Bin,
>
> Jiang, Bin (Bin) writes:
>
> > Sometimes I need parsing a xml fragment without checking the namespace,
> > it there any convenient way to do this?
> > I've used XSD 2.1.1 2.3.0, and now xsde 1.1.0, my way is modifiy the xsd
> > library files and the generated parser files.
>
> I think there is a more elegant way to accomplish this. The idea is
> to override the _start_element (and _start_attribute if necessary)
> low-level hook on the root parser and call the original version with
> a proper namespace when necessary:
>
>   virtual void
>   _start_element (const xml_schema::ro_string& ns,
>                   const xml_schema::ro_string& name)
>   {
>     if (need_to_add_namespace)
>     {
>       xml_schema::ro_string ns ("urn:oma:xml:poc:list-service");
>       base::_start_element (ns, name);
>     }
>     else
>     {
>       base::_start_element (ns, name);
>     }
>   }
>
> This will work because all events are going through the root
> element parser. This can get a bit more complicated if your
> vocabulary mixes qualified and unqualified elements. But
> normally it is either all qualified or only root element that
> is qualified and both of these cases are easy to handle with
> this method.
>
[Jiang, Bin (Bin)] This may not work if there is more than one namespace
definition in an instance document, right?

> > Another question is when there is an exception, like unexpected element
> > or attribute, how to get the namespace and name of the encountered
> > element or attribute, I only find there is line/column and text().
>
> The current error propagation architecture makes it costly to pass
> this information up to the caller. Because of that we decided not
> to provide it. However, we are planning to change the inner workings
> of C++/Parser which will also make it fairly cheap to pass extra
> error information around and we will add support for names and
> namespaces then. Unfortunately, this is not planned for XSD/e 2.0.0
> (due in a couple of weeks) and will be implemented in XSD/e 2.1.0
> which is scheduled for the end of 2007 - beginning of 2008. How
> urgent is this feature for your project?

[Jiang, Bin (Bin)] It's not so urgent now, since there is line/column
information and we can always locate the error position. But if I could
provide the name or namespace, the error message would be more
user-friendly and complete. Is there any workaround I can use?
Performance is not a big issue for me at this moment, since according
to our performance testing results, xsde 1.1.0 (with --no-iostream)
performs better than xsd 2.1.1, which we used before.

>
> Boris

From binjiang at alcatel-lucent.com Mon Oct 29 05:52:32 2007
From: binjiang at alcatel-lucent.com (Jiang, Bin (Bin))
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] Why xml_schema::schema exception is thrown when the xml document is not well formed
References: <20071022160431.GB8418@karelia>
Message-ID:

Hi, Boris and all,

I found that in some cases xml_schema::schema is thrown when the XML
document is not well-formed. Take the library example in the xsde 1.1.0
release:

  book id="MM" available="false" >
  0679760806
  The Master and Margarita
  fiction1
  ...
If I remove the "<" of book, the parser would say:

  schema error: line[15] column[34] unexpected characters encountered

while I think this should be an xml_schema::xml exception.

Thanks,
Jiang Bin (Bin)
GLMS Developer
Alcatel-Lucent

From boris at codesynthesis.com Mon Oct 29 06:06:25 2007
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] Re: Why xml_schema::schema exception is thrown when the xml document is not well formed
In-Reply-To:
References: <20071022160431.GB8418@karelia>
Message-ID: <20071029100625.GE5582@karelia>

Hi Bin,

Jiang, Bin (Bin) writes:

> I found in some cases, xml_schema::schema will be thrown when the
> xml document is not well-formed. Take the example library in the
> xsde release 1.1.0 as an example:
>
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>   xsi:schemaLocation="http://www.codesynthesis.com/library library.xsd">
>
> book id="MM" available="false" >
> 0679760806
> The Master and Margarita
> fiction1
> ...
>

Actually, this XML is perfectly well-formed. By removing '<' from
the book tag, you made it appear as just a text fragment. Note
that you don't have to escape '>' in the element content.

> If I remove the "<" of book, the parser would say:
> schema error: line[15] column[34] unexpected characters encountered

Which is correct. The catalog type specifies that its content should
be a sequence of book elements. As a result, when the parser encounters
text, it reports it as a validation error.

I guess this example shows how far a well-formed XML document can be
from what an application expects and why it is generally a good idea to
validate the documents against the vocabulary schema :-).

Boris

From binjiang at alcatel-lucent.com Mon Oct 29 12:59:19 2007
From: binjiang at alcatel-lucent.com (Jiang, Bin (Bin))
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] RE: Why xml_schema::schema exception is thrown when the xml document is not well formed
In-Reply-To: <20071029100625.GE5582@karelia>
References: <20071022160431.GB8418@karelia> <20071029100625.GE5582@karelia>
Message-ID:

Hi Boris,

Thank you for the explanation! Please see my other two questions below.

Thanks,
Jiang Bin (Bin)
GLMS Developer
Alcatel-Lucent

> -----Original Message-----
> From: Boris Kolpackov [mailto:boris@codesynthesis.com]
> Sent: 2007-10-29 18:06
> To: Jiang, Bin (Bin)
> Cc: xsde-users@codesynthesis.com
> Subject: Re: Why xml_schema::schema exception is thrown when the xml
> document is not well formed
>
> Hi Bin,
>
> Jiang, Bin (Bin) writes:
>
> > I found in some cases, xml_schema::schema will be thrown when the
> > xml document is not well-formed. Take the example library in the
> > xsde release 1.1.0 as an example:
> >
> >   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >   xsi:schemaLocation="http://www.codesynthesis.com/library library.xsd">
> >
> > book id="MM" available="false" >
> > 0679760806
> > The Master and Margarita
> > fiction1
> > ...
> >
>
> Actually, this XML is perfectly well-formed. By removing '<' from
> the book tag, you made it to appear as just a text fragment. Note
> that you don't have to escape '>' in the element content.
>
> > If I remove the "<" of book, the parser would say:
> > schema error: line[15] column[34] unexpected characters encountered
>
> Which is correct. The catalog type specifies that its content should
> be a sequence of book elements. As a result, when parser encounters
> text, it reports it as a validation error.

[Jiang, Bin (Bin)] In this case, why is the column number not the
beginning of the line but 34, the end of the line?

When would the characters be ignored by the parser? I added a type like
the one below to the xsd:

And found that characters inside the "identityType" element are ignored.
Is this because the "identityType" type is derived from "xsd:anyType"?

>
> I guess this example shows how far a well-formed XML can be from what
> an application expects and why it is generally a good idea to validate
> the documents against the vocabulary schema :-).
>
> Boris

From boris at codesynthesis.com Mon Oct 29 13:59:54 2007
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] Re: Why xml_schema::schema exception is thrown when the xml document is not well formed
In-Reply-To:
References: <20071022160431.GB8418@karelia> <20071029100625.GE5582@karelia>
Message-ID: <20071029175954.GB9333@karelia>

Hi Bin,

Jiang, Bin (Bin) writes:

> In this case, why the column number is not the beginning of the
> line, but is 34, the end of the line?

That's how the underlying XML parser (Expat) works. If you query the
column after the event has been triggered, you get the value that
points at the end of content that triggered the event.

> When the characters would be ignored by the parser?

One way to achieve this would be to declare your type as having
mixed content (add mixed="true" attribute). The text content will
be delivered to the _any_characters() hook which by default does
nothing.

>
> And find characters inside "identityType" element would be ignored, is
> this because the "identityType" type is derived from "xsd:anyType"?

Hm, the characters should still be flagged as error. I guess you just
found a bug! We will try to fix it for the next release.

Boris
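
As a rough illustration of the rule discussed in this thread (text inside
element-only content is a validation error, while whitespace between
child elements is not), here is a minimal sketch. It is not the code
that XSD/e generates: is_xml_space, valid_in_element_only_content, and
usage_example are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <string>

// XML whitespace characters: space, tab, carriage return, line feed.
inline bool
is_xml_space (char c)
{
  return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

// Hypothetical check (not XSD/e's generated code): in element-only
// content, character data between child elements is acceptable only
// if it consists entirely of whitespace.
bool
valid_in_element_only_content (const std::string& text)
{
  for (std::string::size_type i = 0; i < text.size (); ++i)
  {
    if (!is_xml_space (text[i]))
      return false;
  }

  return true;
}

// Small usage illustration based on the example from this thread.
void
usage_example ()
{
  // Indentation between elements is fine.
  assert (valid_in_element_only_content ("\n  "));

  // A tag that lost its '<' is plain text and would be rejected.
  assert (!valid_in_element_only_content ("book id=\"MM\" available=\"false\" >"));
}
```

A type declared with mixed="true" would skip a check of this kind
altogether, which matches the suggestion above for having the text
ignored.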