XSD: XML Data Binding for C++

CodeSynthesis XSD is an open-source, cross-platform W3C XML Schema to C++ data binding compiler. Provided with an XML document specification (XML Schema), it generates C++ classes that represent the given vocabulary as well as XML parsing and serialization code. You can then access the data stored in XML using types and functions that semantically correspond to your application domain rather than dealing with generic elements/attributes and raw strings:

unique_ptr<Contact> c = contact ("c.xml");
cout << c->name () << ", "
     << c->email () << ", "
     << c->phone () << endl;

<contact>
  <name>John Doe</name>
  <email>j@doe.com</email>
  <phone>555 12345</phone>
</contact>

The process of extracting the data from a direct representation of XML (such as DOM or SAX) and presenting it as a hierarchy of objects or events that correspond to a document vocabulary is called XML Data Binding. An XML Data Binding compiler accomplishes this by establishing a mapping, or binding, between XML Schema and a target programming language. For more information on why to use XML Data Binding and CodeSynthesis XSD, see Reasons to Use.

XSD supports two XML Schema to C++ mappings: in-memory C++/Tree and stream-oriented C++/Parser. The C++/Tree mapping represents the information stored in XML documents as a tree-like, in-memory object model. C++/Parser is a new, SAX-like mapping which represents the data stored in XML as a hierarchy of vocabulary-specific parsing events. The following table summarizes key advantages of the two C++ bindings:

| C++/Tree | C++/Parser |
|----------|------------|
| Ready to use type system for in-memory representation | Fit into existing type system by constructing your own in-memory representation |
| Complete XML document view and referential integrity | Perform immediate processing as parts of the document become available (streaming) |
| Optional association with underlying DOM nodes | Handle XML documents that are too large to fit into memory |
| Additional features: serialization back to DOM or XML, support for ID/IDREF cross-referencing, etc. | Small footprint, including code size and runtime memory consumption |
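The difference between the two styles can be sketched with hand-written stand-ins. Nothing below is XSD-generated code or part of the XSD runtime: Contact, ContactEvents, dispatch, Printer, and Builder are hypothetical names, and the "document" is simulated as a list of element/text pairs rather than parsed from XML. The sketch only illustrates that an event-style interface can process data as it arrives, and that an in-memory object can be constructed on top of the same events.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory model: the whole contact is available at once.
struct Contact
{
  std::string name;
  std::string email;
  std::string phone;
};

// Hypothetical event interface: one callback per vocabulary element.
struct ContactEvents
{
  virtual ~ContactEvents () = default;
  virtual void name (const std::string&) = 0;
  virtual void email (const std::string&) = 0;
  virtual void phone (const std::string&) = 0;
};

// Simulated document: (element name, text) pairs in document order.
using Item = std::pair<std::string, std::string>;

// Deliver each item to the handler as soon as it is encountered.
void
dispatch (const std::vector<Item>& items, ContactEvents& h)
{
  for (const Item& i: items)
  {
    if (i.first == "name")
      h.name (i.second);
    else if (i.first == "email")
      h.email (i.second);
    else if (i.first == "phone")
      h.phone (i.second);
    else
      assert (false && "element not in the contact vocabulary");
  }
}

// Streaming use: format each value immediately, keep nothing else.
struct Printer: ContactEvents
{
  std::ostringstream out;

  void name (const std::string& n) override {out << n << ", ";}
  void email (const std::string& e) override {out << e << ", ";}
  void phone (const std::string& p) override {out << p << '\n';}
};

// In-memory use: collect the events into a complete object.
struct Builder: ContactEvents
{
  Contact result;

  void name (const std::string& n) override {result.name = n;}
  void email (const std::string& e) override {result.email = e;}
  void phone (const std::string& p) override {result.phone = p;}
};
```

Passing the same item list to dispatch with a Printer produces the formatted line without storing a Contact, while a Builder ends up holding a complete Contact that can be inspected afterwards.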

Compared to APIs such as DOM and SAX, XML data binding allows you to access the data in XML documents using your domain vocabulary instead of generic elements, attributes, and text. Static typing helps catch errors at compile-time rather than at run-time. Automatic code generation saves time and minimizes the effort needed to adapt your applications to changes in the document structure.
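As a rough illustration of the static typing point, the sketch below contrasts string-keyed lookup with a typed accessor. It is a hand-written approximation, not XSD output: GenericElement, get, compare_access, and this Contact class are hypothetical, and a std::map stands in for a generic XML API.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in for generic access: child values are looked up by string key.
using GenericElement = std::map<std::string, std::string>;

std::string
get (const GenericElement& e, const std::string& key)
{
  auto i = e.find (key);

  if (i == e.end ())
    throw std::out_of_range ("no such element: " + key);

  return i->second;
}

// Stand-in for a typed class: each child has its own accessor.
class Contact
{
public:
  Contact (std::string name, std::string email)
    : name_ (std::move (name)), email_ (std::move (email)) {}

  const std::string& name () const {return name_;}
  const std::string& email () const {return email_;}

private:
  std::string name_;
  std::string email_;
};

void
compare_access ()
{
  GenericElement e {{"name", "John Doe"}, {"email", "j@doe.com"}};

  // A misspelled key compiles fine and only fails when this line runs.
  bool failed = false;
  try
  {
    get (e, "emial");
  }
  catch (const std::out_of_range&)
  {
    failed = true;
  }
  assert (failed);

  Contact c ("John Doe", "j@doe.com");
  assert (c.email () == "j@doe.com");

  // The same typo on the typed object is rejected by the compiler:
  // c.emial ();  // error: Contact has no member named 'emial'
}
```

In the string-keyed case the mistake surfaces only on the code path that performs the lookup; with the typed accessor it cannot get past compilation.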

The following two examples show the amount and complexity of code needed to access the information in the above XML using generic C++ APIs compared to C++ bindings:

// DOM

DOMElement* c = ...
DOMNodeList* l;

l = c->getElementsByTagName ("name");
DOMNode* name = l->item (0);

l = c->getElementsByTagName ("email");
DOMNode* email = l->item (0);

l = c->getElementsByTagName ("phone");
DOMNode* phone = l->item (0);

cout << name->getTextContent () << ", "
     << email->getTextContent () << ", "
     << phone->getTextContent () << endl;

// XML Binding: C++/Tree

Contact c = ...

cout << c.name () << ", "
     << c.email () << ", "
     << c.phone () << endl;


// SAX

class ContactParser: ...
{
  virtual void
  endElement (const string& name)
  {
    // Text has already been printed by characters (); add the separator.
    if (name == "name")
      cout << ", ";
    else if (name == "email")
      cout << ", ";
    else if (name == "phone")
      cout << endl;
  }

  virtual void
  characters (const string& s)
  {
    cout << s;
  }
};

// XML Binding: C++/Parser

class ContactParser: ...
{
  virtual void
  name (const string& n)
  {
    cout << n << ", ";
  }

  virtual void
  email (const string& e)
  {
    cout << e << ", ";
  }

  virtual void
  phone (const string& p)
  {
    cout << p << endl;
  }
};

Features