From chgans at googlemail.com Wed Nov 2 02:09:06 2016
From: chgans at googlemail.com (Chris Gagneraud)
Date: Wed Nov 2 09:21:39 2016
Subject: [xsd-users] Enumerated strings parser generation
Message-ID:

Hi there,

My XML schema has a lot (and a lot) of enumerated string types. When
using the cxx-tree generator everything is easy: the generated type
parsers include the enum declaration and ad-hoc XML parsers. But when
using cxx-parser, nothing is done at all, and I end up having to write
the boilerplate.

I understand the difference b/w cxx-tree and cxx-parser, but in this
case, is it possible to hit a middle ground?

For example, it would be nice to have all the enum literal arrays
generated (and why not the enum definition too, which I could use as-is
or modify, a bit like the 'dummy' implementation generated with, e.g.,
--generate-print-impl).

Given the number of enums I have to deal with, I will have to write my
own XSD-enum-to-C++-enum helper generator, which sounds completely weird
given the purpose of xsd-cxx!

I haven't tried hacking the xsd-cxx source code, and I would prefer to
avoid having to deal with a customised tool.

Would anyone know a way to achieve this? Any pointers appreciated.

Thanks,
Chris

From boris at codesynthesis.com Wed Nov 2 11:38:56 2016
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Wed Nov 2 11:39:06 2016
Subject: [xsd-users] Enumerated strings parser generation
In-Reply-To:
References:
Message-ID:

Hi Chris,

Chris Gagneraud writes:

> I understand the difference b/w cxx-tree and cxx-parser, but in this
> case, is it possible to hit a middle ground?

The intended use of C++/Parser is (a) simple XML vocabularies and (b)
existing object models to parse them into. Your case doesn't fit either
of the two. Why not use C++/Tree?
Boris

From chgans at googlemail.com Wed Nov 2 17:52:55 2016
From: chgans at googlemail.com (Chris Gagneraud)
Date: Thu Nov 3 11:19:09 2016
Subject: [xsd-users] Enumerated strings parser generation
In-Reply-To:
References:
Message-ID:

On 3 November 2016 at 04:38, Boris Kolpackov wrote:
> Hi Chris,
>
> Chris Gagneraud writes:
>
>> I understand the difference b/w cxx-tree and cxx-parser, but in this
>> case, is it possible to hit a middle ground?
>

Hi Boris,

> The intended use of C++/Parser is (a) simple XML vocabularies and (b)
> existing object models to parse them into. Your case doesn't fit either
> of the two. Why not use C++/Tree?

C++/Tree does in-memory parsing of the whole document, which is a
no-go for me as some XML files can be really huge (got some segfaults on
Windows due to that), plus I liked the idea of having my own object
model. Although right now my object model mimics the XML structure very
closely, I'm planning to change that partially once I get accustomed to
the domain model.

To get started, I might switch back to C++/Tree and limit myself to
"not-too-big" files, and then implement a second stage "translator"
that would convert the XSD/CXX structures to a more domain-friendly
model...

After re-reading the introductory paragraph of
http://www.codesynthesis.com/products/xsd/c++/parser/, it's not obvious
at all that the intended use is for simple vocabularies; the text
emphasizes memory usage and general performance. Well, a definite hint
is the type-map, which in my case would get quite big (haven't started
yet, but just looking at my object model, it's clear that I will have to
provide a lot of hints in that file).

To give you a bit more context, another problem I have to face is that
existing software that generates these XML files are not always XSD
compliant, not by much but still.
My work around so far was to modify the XSD to relax it a bit (based on
parsing failures as they come), and then implement a second pass to
detect minor compliance errors and issue warnings/errors.

So to summarise my requirements:
- 1. Big XSD (100+ enums, 100's of complex types, XSD/Parser generates
  80k SLOCs of skeleton code)
- 2. Can handle big/huge XML files (250+ MB)
- 3. Can cope with minor XSD compliance issues
- 4. Ideally Qt friendly (QString, QList, QVariant, QByteArray, ...)
- 5. Support for partial parsing (I will want at some point to do
  partial parsing, as the XML document contains a huge amount of data
  for the full spectrum of use cases, and the end user might want to
  use the XML file for just a sub-domain.)

'1' is the reason I turned to XSD/CXX instead of my own parser (I was
using Qt/XML: too much work, and so too much risk of a buggy
implementation).
'2' is a definite blocker. As of 2016 a 250MB XML file is a "big guy",
but the trend is for even bigger files as technology keeps evolving.
'3' as already stated could be implemented by careful XSD
modification. Maybe I could catch exceptions too, but I don't really
like it; I have to think again about that one.
'4' is not a big deal really, just a nice-to-have.
'5' My understanding is that it is easily doable with XSD/CXX.

Maybe I should give a go at XSD/e?

Thanks for writing this great piece of SW and for making it available
to the open-source community.

Chris

>
> Boris

From chgans at googlemail.com Thu Nov 3 00:40:14 2016
From: chgans at googlemail.com (Chris Gagneraud)
Date: Thu Nov 3 11:19:09 2016
Subject: [xsd-users] A9 regression
Message-ID:

Hi there,

Using xsd-cxx version A9 on Windows, I have some issues with the error
handlers, exceptions and schema validation.

In a nutshell, here is my main:
-----------------------------------------------------------------
class ErrorHandler: public xml_schema::ErrorHandler
{
public:
    virtual bool handle (...)
    {
        cout << flush;
        cerr << endl << "Line " << line << ", Column " << column << ": "
             << message << endl << flush;
        return false;
    }
};

int main(int argc, char *argv[])
{
    try {
        xml_schema::Properties props;
        // FIXME: Use absolute path, otherwise the file path have to be
        // relative to the path of the file we are parsing, not the
        // current dir.
        props.schema_location ("http://webstds.ipc.org/2581",
                               "IPC-2581B_V3.0.xsd");
        props.schema_location ("http://www.w3.org/XML/1998/namespace",
                               "xml.xsd");
        cout << "Parsing " << argv[1] << "...";
        ErrorHandler errorHandler;
        unique_ptr ipc(iPC_2581(argv[1], errorHandler, 0, props));
        cout << " Done!" << endl;
    }
    catch (const xsd::cxx::tree::parsing &e)
    {
        cout << flush;
        cerr << endl << "XSD Parsing error:\n" << e << endl << flush;
        return 1;
    }
    catch (const xsd::cxx::exception &e)
    {
        cout << flush;
        cerr << endl << "XSD Exception:\n" << e.what() << endl << flush;
    }
    return 0;
}
-----------------------------------------------------------------

Case 1: The schemas and the xml file are all in the current folder,
when i execute my program on a non-compliant XML file I get:
-----------------------------------------------------------------
Parsing samples\test-3_r2.xml...
XSD Exception:
expected element not encountered
-----------------------------------------------------------------

Case 2: Same XML file, but schema not found, I get
-----------------------------------------------------------------
Parsing samples\test-3_r2.xml...
Line 0, Column 0: unable to open primary document entity
'E:\projects\ipc-parser/samples\nowhere\IPC-2581B_V3.0.xsd'
[!crash!]
The program has unexpectedly finished.
-----------------------------------------------------------------

Case 3: Now, still with the wrong schema path, but this time returning
true in my handler:
-----------------------------------------------------------------
Parsing samples\test-3_r2.xml...
Line 0, Column 0: unable to open primary document entity
'E:\projects\ipc-parser/samples\nowhere\IPC-2581B_V3.0.xsd'
Line 2, Column 159: no declaration found for element 'IPC-2581'
Line 2, Column 159: attribute 'revision' is not declared for element
'IPC-2581'
XSD Parsing error:
instance document parsing failed
-----------------------------------------------------------------

Case 4: The schemas and the xml file are all in the current folder,
non compliant XML file, and the handler returns true
-----------------------------------------------------------------
Parsing samples\test-3_r2.xml...
XSD Exception:
expected element not encountered
-----------------------------------------------------------------

Case 5: as Case 1 and 4 (schema found) but without error handler at all
-----------------------------------------------------------------
Parsing samples\test-3_r2.xml...
XSD Exception:
expected element not encountered
-----------------------------------------------------------------

Case 6: as Case 2 and 3 (schema not found) but without error handler at all
Parsing samples\test-3_r2.xml...
-----------------------------------------------------------------
XSD Parsing error:
:0:0 warning: unable to open primary document entity
'E:\projects\ipc-parser/samples\/nowhere/IPC-2581B_V3.0.xsd'
E:\projects\ipc-parser/samples\test-3_r2.xml:2:159 error: no
declaration found for element 'IPC-2581'
E:\projects\ipc-parser/samples\test-3_r2.xml:2:159 error: attribute
'revision' is not declared for element 'IPC-2581'
-----------------------------------------------------------------

So my question is: how am I supposed to catch non-compliance errors
together with the error context (parsing context)? Shouldn't my error
handler be called for any kind of error?

As far as I remember my code worked correctly with v4.0.0 (I had to use
a stripped-down version of the schema due to a bug in xsd-cxx). My point
is that I was able to report non-compliance errors with context
information (line/column numbers).
Is this a regression or am I missing something?

Thanks,
Chris

From boris at codesynthesis.com Fri Nov 4 09:39:36 2016
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Fri Nov 4 09:39:45 2016
Subject: [xsd-users] Enumerated strings parser generation
In-Reply-To:
References:
Message-ID:

Hi Chris,

Chris Gagneraud writes:

> C++/Tree does in-memory parsing of the whole document, which is a
> no-go for me as some XML files can be really huge (got some segfaults on
> Windows due to that),

You can do partially in-memory/partially streaming processing with
C++/Tree. See the 'streaming' example.

> ... plus I liked the idea of having my own object [...] implement a
> second stage "translator" that would convert the XSD/CXX structures
> to a more domain-friendly model...

Probably the most reasonable approach for non-trivial schemas.

> To give you a bit more context, another problem I have to face is that
> existing software that generates these XML files are not always XSD
> compliant, not by much but still. My work around so far was to modify
> the XSD to relax it a bit [...]

Either that or disable validation and then "fix-up" the XML at the DOM
level before handing it off to C++/Tree. There is no magic solution
for this problem.

> Maybe I should give a go at XSD/e?

You can try, though again for complex schemas it will be harder to
use than C++/Tree.

> Thanks for writing this great piece of SW and for making it available
> to the open-source community.

You are welcome. Hope you can make it work for your case.

Boris

From boris at codesynthesis.com Fri Nov 4 09:52:51 2016
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Fri Nov 4 09:53:00 2016
Subject: [xsd-users] A9 regression
In-Reply-To:
References:
Message-ID: <20161104135251.GA2949@codesynthesis.com>

Hi Chris,

Chris Gagneraud writes:

> Using xsd-cxx version A9 on Windows, I have some issues with the error
> handlers, exceptions and schema validation.
While there is a lot of information in your email, I am having a hard
time following what you have done and what exactly does not work.
Can you therefore do the following:

1. Read through Section 3.3, "Error Handling":

   http://codesynthesis.com/projects/xsd/documentation/cxx/tree/manual/#3.3

2. Adjust the hello example to match your application.

3. Reproduce the problem using this modified hello example.

4. Send the modified hello example along with the steps to reproduce the
   problem and what the problem is (e.g., expected the callback to be
   called but it was not, etc).

Boris

From chgans at googlemail.com Sun Nov 6 19:37:48 2016
From: chgans at googlemail.com (Chris Gagneraud)
Date: Mon Nov 7 10:54:39 2016
Subject: [xsd-users] A9 regression
In-Reply-To: <20161104135251.GA2949@codesynthesis.com>
References: <20161104135251.GA2949@codesynthesis.com>
Message-ID:

On 5 November 2016 at 02:52, Boris Kolpackov wrote:
> Hi Chris,
>
> Chris Gagneraud writes:
>
>> Using xsd-cxx version A9 on Windows, I have some issues with the error
>> handlers, exceptions and schema validation.
>
> While there is a lot of information in your email, I am having a hard
> time following what you have done and what exactly does not work.
> Can you therefore do the following:
>
> 1. Read through Section 3.3, "Error Handling":
>
> http://codesynthesis.com/projects/xsd/documentation/cxx/tree/manual/#3.3
>
> 2. Adjust the hello example to match your application.

Unfortunately I cannot reproduce! :(

So let me reformulate my question:

How is it possible that with this simple code:
try
{
    xml_schema::Properties props;
    props.schema_location ("http://webstds.ipc.org/2581",
                           "IPC-2581B_V3.0.xsd");
    unique_ptr ipc(iPC_2581(argv[1], 0, props));
}
catch (const xml_schema::Exception& e)
{
    cerr << e << endl;
}

While parsing a possibly non-compliant XML file, I have this error message:
expected element 'http://webstds.ipc.org/2581#StandardPrimitive'

Instead of the usual one with line/column numbers, eg:
test-3_r2.xml:1234:12: expected element
'http://webstds.ipc.org/2581#StandardPrimitive'

Without knowing the line number, it's nearly impossible for me to
track down what the problem is and how to fix it.

Chris

>
> 3. Reproduce the problem using this modified hello example.
>
> 4. Send the modified hello example along with the steps to reproduce the
> problem and what the problem is (e.g., expected the callback to be
> called but it was not, etc).
>
> Boris

From chgans at googlemail.com Sun Nov 6 21:02:18 2016
From: chgans at googlemail.com (Chris Gagneraud)
Date: Mon Nov 7 10:54:39 2016
Subject: [xsd-users] A9 regression
In-Reply-To:
References: <20161104135251.GA2949@codesynthesis.com>
Message-ID:

On 7 November 2016 at 13:37, Chris Gagneraud wrote:
> On 5 November 2016 at 02:52, Boris Kolpackov wrote:
>> Hi Chris,
>>
>> Chris Gagneraud writes:
>>
>>> Using xsd-cxx version A9 on Windows, I have some issues with the error
>>> handlers, exceptions and schema validation.
>>
>> While there is a lot of information in your email, I am having a hard
>> time following what you have done and what exactly does not work.
>> Can you therefore do the following:
>>
>> 1. Read through Section 3.3, "Error Handling":
>>
>> http://codesynthesis.com/projects/xsd/documentation/cxx/tree/manual/#3.3
>>
>> 2. Adjust the hello example to match your application.
>
> Unfortunately I cannot reproduce! :(
>
> So let me reformulate my question:
>
> How is it possible that with this simple code:
> try
> {
>     xml_schema::Properties props;
>     props.schema_location ("http://webstds.ipc.org/2581",
>                            "IPC-2581B_V3.0.xsd");
>     unique_ptr ipc(iPC_2581(argv[1], 0, props));
> }
> catch (const xml_schema::Exception& e)
> {
>     cerr << e << endl;
> }
>
> While parsing a possibly non-compliant XML file, I have this error message:
> expected element 'http://webstds.ipc.org/2581#StandardPrimitive'
>
> Instead of the usual one with line/column numbers, eg:
> test-3_r2.xml:1234:12: expected element
> 'http://webstds.ipc.org/2581#StandardPrimitive'
>
> Without knowing the line number, it's nearly impossible for me to
> track down what the problem is and how to fix it.

OK, answering myself, there's no line number, because this is thrown
by the generated code, * after the DOM parsing *, at which point the
input context is gone...

Now, the error i get is not due to the input file being
non-conformant, but by the generated code that doesn't seem to cope
with substitution groups.

My XSD makes heavy use of substitution groups, with abstract elements, eg:

+ |-+ | |-- | |-+ | |-- | |-- More concrete UserPrimitive sub-types | | |-+ |-- |-+ |-- |-- More concrete StandardPrimitive sub-types |

For some reason the generated code expect an element named
"StandardPrimitive" instead of checking for all the allowed substitute
names...

The code was generated using
xsd cxx-tree \
    --root-element IPC-2581 \
    --generate-polymorphic \
    --type-naming ucc \
    --function-naming lcc \
    --namespace-map http://webstds.ipc.org/2581=ipc2581 \
    --std c++11 \
    --file-per-type --output-dir xsdtree \
    $XSD.xsd

I have tried adding several '--polymorphic-type ' but it didn't make
any difference.

It seems to me that xsdcxx doesn't support "pure abstract" element
(element with "abstract='true'" but with no type),

Chris

>
> Chris
>
>
>>
>> 3. Reproduce the problem using this modified hello example.
>>
>> 4. Send the modified hello example along with the steps to reproduce the
>> problem and what the problem is (e.g., expected the callback to be
>> called but it was not, etc).
>>
>> Boris

From boris at codesynthesis.com Tue Nov 8 10:12:47 2016
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Nov 8 10:12:58 2016
Subject: [xsd-users] A9 regression
In-Reply-To:
References: <20161104135251.GA2949@codesynthesis.com>
Message-ID:

Hi Chris,

Chris Gagneraud writes:

> It seems to me that xsdcxx doesn't support "pure abstract" element
> (element with "abstract='true'" but with no type),

An element without the 'type' attribute defaults to xsd:anyType. Try
adding:

--polymorphic-type anyType

If that doesn't help, you can try:

--polymorphic-type-all

If that works then it is a good idea to figure out which additional
types you need to list with --polymorphic-type (*-all is an overkill).
XSD should warn about all such cases.

Boris
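
For readers following the substitution-group part of this thread: the
rule Chris expected the generated code to apply ("checking for all the
allowed substitute names") can be sketched in a few lines of
self-contained C++. This is only an illustration of the XML Schema
matching rule, not how XSD implements it, and it uses no XSD or
Xerces-C++ API. SubstitutionTable, accepts, example_table and
self_check are hypothetical names, and the member elements Circle and
Oval are made up for the example; only StandardPrimitive and
UserPrimitive come from the thread. The sketch assumes the chain of
group heads contains no cycles.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of substitution groups; not part of XSD or Xerces-C++.
struct SubstitutionTable
{
  std::map<std::string, std::string> head_of; // member element -> its group head
  std::set<std::string> abstract_elements;    // elements declared abstract="true"

  // True if an element named 'actual' may appear where 'expected' is declared.
  // Assumes head_of contains no cycles.
  bool accepts (const std::string& expected, const std::string& actual) const
  {
    // An abstract element can never appear in an instance document itself.
    if (abstract_elements.count (actual) != 0)
      return false;

    if (actual == expected)
      return true;

    // Walk up the chain of heads: group membership is transitive.
    for (auto i = head_of.find (actual);
         i != head_of.end ();
         i = head_of.find (i->second))
    {
      if (i->second == expected)
        return true;
    }

    return false;
  }
};

// Example table. Circle and Oval are made-up member names for illustration.
inline SubstitutionTable example_table ()
{
  SubstitutionTable t;
  t.abstract_elements.insert ("StandardPrimitive");
  t.abstract_elements.insert ("UserPrimitive");
  t.head_of["Circle"] = "StandardPrimitive";
  t.head_of["Oval"] = "StandardPrimitive";
  return t;
}

// Small self-check of the rule on the example table.
inline void self_check ()
{
  const SubstitutionTable t (example_table ());
  assert (t.accepts ("StandardPrimitive", "Circle"));             // member stands in for head
  assert (!t.accepts ("StandardPrimitive", "StandardPrimitive")); // abstract head itself
  assert (!t.accepts ("UserPrimitive", "Circle"));                // member of another group
}
```

In these terms, the error reported above corresponds to a check that
only compares the element name against the head (StandardPrimitive)
and never walks the table. Boris's suggestion is to try the
--polymorphic-type options so that the generated code handles such
elements.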