From mkelkar at rocketmail.com  Wed Jan  2 15:32:51 2008
From: mkelkar at rocketmail.com (Mahesh Kelkar)
Date: Tue Jul  1 03:37:21 2008
Subject: [xsde-users] Question on the XSD/e performance
Message-ID: <889271.50664.qm@web30711.mail.mud.yahoo.com>

Is the performance of XSD/e limited by the performance of the underlying parser? Or, as part of the C++ code generation, do you modify any parser code to streamline execution?

Do you have any plans to improve the performance of the overall package by adopting the recommendations of XML Screamer?

Thanks
Mahesh


From boris at codesynthesis.com  Thu Jan  3 08:00:32 2008
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Jul  1 03:37:21 2008
Subject: [xsde-users] Question on the XSD/e performance
In-Reply-To: <889271.50664.qm@web30711.mail.mud.yahoo.com>
References: <889271.50664.qm@web30711.mail.mud.yahoo.com>
Message-ID: <20080103130032.GB10674@karelia>

Hi Mahesh,

Mahesh Kelkar <mkelkar@rocketmail.com> writes:

> Is performance of the XSD/e limited by the underlying parser performance?

Yes, it is currently limited by the underlying (non-validating) XML parser
performance.


> Or as a part of C++ code generation do you modify any parser code to
> streamline the code execution?

The data extraction, XML Schema validation, and dispatching code
is generated for a particular schema, which results in better
performance compared to general-purpose validating XML parsers.
There is not really much to be gained from also generating a
custom, low-level XML parser (see the notes below).
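
To illustrate why schema-specific code can be cheaper than a
general-purpose validating parser, here is a minimal sketch. It
assumes a hypothetical schema with a 'person' element containing
'name' followed by 'age'. The person and person_handler names are
made up for this example and are not the interface that XSD/e
generates. The point is only that the expected content model is
compiled into a small state machine instead of being looked up in
a schema model at run time.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical target type for <person><name/><age/></person>.
struct person
{
  std::string name;
  int age;
};

// Hypothetical schema-specific handler driven by parser events.
class person_handler
{
public:
  person_handler () : state_ (expect_name) { result_.age = 0; }

  void start_element (const std::string& n)
  {
    // Validation is a comparison against the one element that the
    // content model allows in the current state.
    if (state_ == expect_name && n == "name")
      state_ = in_name;
    else if (state_ == expect_age && n == "age")
      state_ = in_age;
    else
      throw std::runtime_error ("unexpected element '" + n + "'");
  }

  void characters (const std::string& s)
  {
    if (state_ == in_name)
      result_.name += s;
    else if (state_ == in_age)
      age_text_ += s;
    // Text in other states (e.g., whitespace) is ignored here.
  }

  void end_element ()
  {
    if (state_ == in_name)
      state_ = expect_age;
    else if (state_ == in_age)
    {
      // Convert the text directly to the target C++ type. The
      // length limit keeps the value within the range of int.
      if (age_text_.empty () || age_text_.size () > 9)
        throw std::runtime_error ("invalid 'age' value");

      int v = 0;
      for (std::size_t i = 0; i < age_text_.size (); ++i)
      {
        char c = age_text_[i];
        if (c < '0' || c > '9')
          throw std::runtime_error ("invalid 'age' value");
        v = v * 10 + (c - '0');
      }

      result_.age = v;
      state_ = done;
    }
    else
      throw std::runtime_error ("unexpected end of element");
  }

  const person& result () const
  {
    if (state_ != done)
      throw std::runtime_error ("incomplete 'person'");
    return result_;
  }

private:
  enum state { expect_name, in_name, expect_age, in_age, done };

  state state_;
  person result_;
  std::string age_text_;
};
```

The underlying XML parser would call start_element(), characters(),
and end_element() for the children of 'person'. An element that is
out of order is rejected as soon as its start tag is seen.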


> Do you have any plans to improve the performance of the overall
> package by adopting recommendations of the XML Screamer?

We do not have any immediate plans to do so. As mentioned in
the paper that came out of that research project, many of the
optimization techniques employed by XML Screamer are only
applicable when certain assumptions are made about the XML
documents being processed. In other words, these techniques only
work on a subset of XML and are impossible (or hard to implement)
in the general case.

Having said that, if your XML documents do not use certain
features that are expensive from the parsing performance point
of view (e.g., entity references), then it may be possible to
come up with a custom underlying XML parser for this XML subset
that performs much better than general-purpose XML parsers, which
must be able to parse all conforming XML documents.
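
As a rough sketch of what restricting the input to a subset buys,
the function below returns the text of a simple element as a range
into the original document. Without entity references there is
nothing to decode, so no copy is needed. The text_range and
simple_content names are hypothetical, and the function handles
only the first attribute-free element with the given name. It
illustrates the idea and is not a replacement for an XML parser.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Half-open range [begin, end) of offsets into the document.
struct text_range
{
  std::size_t begin;
  std::size_t end;
};

// Locate the text content of the first <name>...</name> element.
// Input outside the assumed subset (entity references or nested
// markup in the content) is rejected rather than processed.
text_range simple_content (const std::string& doc,
                           const std::string& name)
{
  const std::string open ("<" + name + ">");
  const std::string close ("</" + name + ">");

  std::size_t b = doc.find (open);
  if (b == std::string::npos)
    throw std::runtime_error ("element not found");
  b += open.size ();

  std::size_t e = doc.find (close, b);
  if (e == std::string::npos)
    throw std::runtime_error ("end tag not found");

  for (std::size_t i = b; i != e; ++i)
  {
    if (doc[i] == '&' || doc[i] == '<')
      throw std::runtime_error ("content outside supported subset");
  }

  text_range r = {b, e};
  return r;
}
```

A general-purpose parser cannot take this shortcut because it has
to expand entity references, which may change the text and so
forces a copy.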

Another approach that has worked very well for some of our
clients is to use a binary instead of a textual representation
of XML. For example, you could use a binary format for internal
interchange/storage in your application and normal text XML for
interoperability with the outside world. XSD can produce code
from XML Schema for parsing/serializing in several binary formats,
with typical parsing speeds being about 10 times faster than
text XML parsing.
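
To give an idea of where the speedup comes from, here is a minimal
sketch of a binary encoding with fixed-width big-endian integers
and length-prefixed strings. The buffer type and the read/write
functions are hypothetical. This is neither the code that XSD
generates nor any particular standard binary format. It only shows
that reading binary data involves no tag matching and no
text-to-number conversion.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<unsigned char> buffer;

// Append a 32-bit unsigned value as 4 bytes, most significant first.
void write_u32 (buffer& b, unsigned long v)
{
  b.push_back (static_cast<unsigned char> ((v >> 24) & 0xFF));
  b.push_back (static_cast<unsigned char> ((v >> 16) & 0xFF));
  b.push_back (static_cast<unsigned char> ((v >> 8) & 0xFF));
  b.push_back (static_cast<unsigned char> (v & 0xFF));
}

// Read 4 bytes at pos as a 32-bit unsigned value and advance pos.
unsigned long read_u32 (const buffer& b, std::size_t& pos)
{
  if (pos > b.size () || b.size () - pos < 4)
    throw std::runtime_error ("truncated integer");

  unsigned long v = (static_cast<unsigned long> (b[pos]) << 24) |
                    (static_cast<unsigned long> (b[pos + 1]) << 16) |
                    (static_cast<unsigned long> (b[pos + 2]) << 8) |
                    static_cast<unsigned long> (b[pos + 3]);
  pos += 4;
  return v;
}

// Append a string as a 32-bit length followed by the raw bytes.
// Lengths are assumed to fit into 32 bits.
void write_string (buffer& b, const std::string& s)
{
  write_u32 (b, static_cast<unsigned long> (s.size ()));
  b.insert (b.end (), s.begin (), s.end ());
}

// Read a length-prefixed string at pos and advance pos.
std::string read_string (const buffer& b, std::size_t& pos)
{
  std::size_t n = static_cast<std::size_t> (read_u32 (b, pos));
  if (b.size () - pos < n)
    throw std::runtime_error ("truncated string");

  std::string s;
  for (std::size_t i = 0; i != n; ++i)
    s += static_cast<char> (b[pos + i]);
  pos += n;
  return s;
}
```

With this encoding a record holding the name "Ann" and the age 42
takes 11 bytes (4 for the length, 3 for the characters, and 4 for
the integer) and is read back with two calls.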

Boris