CodeSynthesis XSD 4.0.0 Released

CodeSynthesis XSD 4.0.0 was released today.

In case you are not familiar with XSD, it is an open-source, cross-platform XML Schema to C++ data binding compiler. Provided with an XML instance specification (XML Schema), it generates C++ classes that represent the given vocabulary as well as XML parsing and serialization code. You can then access the data stored in XML using types and functions that semantically correspond to your application domain rather than dealing with the intricacies of reading and writing raw XML.

For an exhaustive list of new features see the official announcement. Below I am going to cover the notable new features in more detail and include some insight into what motivated their addition.

Ok, that was a major new release. So what are the major changes and new features? Well, firstly, we removed quite a bit of “outdated backwards-compatibility baggage”, such as support for Xerces-C++ 2-series (2.8.0) or Visual Studio 2003 (7.1). At the same time, the good news is there aren’t any changes that will break existing code. What has changed a lot are the compiler internals, and, especially, dependencies which will make building XSD from source much easier.

While removing old stuff we also added support for new C++ compilers that popped up since the last release. XSD now supports Clang as well as Visual Studio 2012 (11.0) and 2013 (12.0).

Ok, let’s now examine the major new features. The biggest is support for C++11 (the --std c++11 option). While there are many little changes in the generated code when this mode is enabled, the major two are the reliance on the move semantics and the use of std::unique_ptr instead of deprecated std::auto_ptr.

Another big feature in this release is support for ordered types. XSD flattens nested XML Schema compositors to give us a nice and simple C++ API. This works very well in most cases, especially for more complex schemas. Sometimes, however, this can lead to the loss of relative element ordering that can be semantically important to the application (the “unordered choice” XML Schema idiom). Now you can mark such types as ordered which makes XSD generate an additional order tracking API. So now you can have the best of both worlds: nice and simple API in most cases and additional order information in a few places where the simple API is not enough.

Once we had this implemented, another stubbornly annoying feature, mixed content, got sorted out. The problem with mixed content is that the text fragments can appear interleaved with elements in pretty much any order. Extracting the text is easy, it is preserving the order information relative to the elements, that’s the tricky part. But now we have the perfect mechanism for that. One user who was beta-testing this feature said: “I read the new documentation and I’m impressed.”

You can read more on ordered types in Section 2.8.4, “Element Order” and on mixed content in Section 2.13, “Mapping for Mixed Content Models” in the C++/Tree Mapping User Manual.

Another problem that is somewhat similar to mixed content is access to data represented by xs:anyType and xs:anySimpleType XML Schema types. anyType allows any content in any order. You can think of its definition as a complex type with mixed content that has an element wildcard that allows any elements and an attribute wildcard that allows any attributes. In other words, anything goes. XSD already can represent wildcard content as raw DOM fragments so it was only natural to extend this support to anyType content. Similar to anyType, anySimpleType allows any simple content, that is, any text (pretty similar to xs:string in that sense). Now it is possible to get anySimpleType content as a text string.

For more information on this new feature see Section 2.5.2, “Mapping for anyType” and Section 2.5.3, “Mapping for anySimpleType” in the C++/Tree Mapping User Manual.

Another cool feature in XSD is the stream-oriented, partially in-memory XML processing that allows parsing and serialization of XML documents in chunks. This allows us to process parts of the document as they become available as well as handle documents that are too large to fit into memory. XSD comes with an example, called streaming, that shows how to set all this up. In this release this example has been significantly improved. It now has much better XML namespace handling and allows streaming at multiple document levels. This turned out to be really useful for handling large and complex documents such as GML/CityGML.

Last but not least, those of us who still prefer to write our own makefiles will be happy to know XSD now supports automatic make-style dependency generation, similar to the GCC’s -M* functionality but just with sane option names. See the XSD Compiler Command Line Manual (man pages) for details.

Comments are closed.