From mkelkar at rocketmail.com Wed Jan 2 15:32:51 2008
From: mkelkar at rocketmail.com (Mahesh Kelkar)
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] Question on the XSD/e performance
Message-ID: <889271.50664.qm@web30711.mail.mud.yahoo.com>

Is the performance of XSD/e limited by the underlying parser performance? Or, as a part of C++ code generation, do you modify any parser code to streamline the code execution?

Do you have any plans to improve the performance of the overall package by adopting the recommendations of XML Screamer?

Thanks
Mahesh

From boris at codesynthesis.com Thu Jan 3 08:00:32 2008
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Jul 1 03:37:21 2008
Subject: [xsde-users] Question on the XSD/e performance
In-Reply-To: <889271.50664.qm@web30711.mail.mud.yahoo.com>
References: <889271.50664.qm@web30711.mail.mud.yahoo.com>
Message-ID: <20080103130032.GB10674@karelia>

Hi Mahesh,

Mahesh Kelkar writes:

> Is the performance of XSD/e limited by the underlying parser performance?

Yes, it is currently limited by the performance of the underlying (non-validating) XML parser.

> Or, as a part of C++ code generation, do you modify any parser code to
> streamline the code execution?

The data extraction, XML Schema validation, and dispatching code is generated based on a particular schema, which results in better performance compared to general-purpose validating XML parsers. There is not really much to be gained from generating a custom, low-level XML parser (also see the notes below).

> Do you have any plans to improve the performance of the overall
> package by adopting the recommendations of XML Screamer?

We do not have any immediate plans to do so.
As mentioned in the paper based on this research project, many of the optimization techniques employed by XML Screamer are only applicable when certain assumptions are made about the XML documents being processed. In other words, these techniques only work on a subset of XML and are not possible (or are hard to implement) in the general case.

Having said that, if your XML documents do not use certain features that are expensive from the parsing performance point of view (e.g., entity references), then it may be possible to come up with a custom underlying XML parser for this XML subset with much better performance than general-purpose XML parsers, which must be able to parse all conforming XML documents.

Another approach that has proved to work very well for some of our clients is to use a binary instead of a textual representation of XML. For example, you could use a binary format for internal interchange/storage in your application and normal text XML for interoperability with the outside world. XSD can produce code from XML Schema for parsing/serializing in several binary formats, with typical parsing speeds being about 10 times faster than text XML parsing.

Boris
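
To illustrate the binary-versus-text point, here is a minimal C++ sketch of reading one unsigned 32-bit value both ways. It is not code generated by XSD or XSD/e. The function names `parse_text_u32` and `parse_binary_u32` are hypothetical, and the binary layout (four bytes, big-endian, as in XDR) is an assumption chosen for the example. The text path has to inspect, validate, and accumulate every character, while the binary path is a fixed-size read.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Text path: parse an unsigned decimal, e.g. the character content of
// <id>1234</id>. Each character is checked and accumulated in turn.
// (Hypothetical helper for illustration only.)
std::optional<std::uint32_t> parse_text_u32 (std::string_view s)
{
  if (s.empty ())
    return std::nullopt;

  std::uint64_t v = 0;

  for (char c : s)
  {
    if (c < '0' || c > '9')
      return std::nullopt; // Not a decimal digit.

    v = v * 10 + static_cast<std::uint64_t> (c - '0');

    if (v > 0xFFFFFFFFu)
      return std::nullopt; // Does not fit into 32 bits.
  }

  return static_cast<std::uint32_t> (v);
}

// Binary path: read four bytes in big-endian order (assumed XDR-style
// layout). No per-character validation or decimal conversion is needed.
// (Hypothetical helper for illustration only.)
std::optional<std::uint32_t> parse_binary_u32 (const unsigned char* p,
                                               std::size_t n)
{
  if (n < 4)
    return std::nullopt; // Truncated input.

  return (static_cast<std::uint32_t> (p[0]) << 24) |
         (static_cast<std::uint32_t> (p[1]) << 16) |
         (static_cast<std::uint32_t> (p[2]) << 8)  |
          static_cast<std::uint32_t> (p[3]);
}
```

Both functions return an empty `std::optional` on malformed input, so a caller can compare the two paths on equal terms. The sketch only covers value decoding and says nothing about the additional cost of scanning markup in text XML.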