[xsd-users] Re: In-memory validation

david.r.moss at selex-comms.com david.r.moss at selex-comms.com
Wed Jan 23 03:46:03 EST 2008


Boris,

Thanks for that, it reinforces some of the issues we've thought of here!
I need to give it some thought - I will get back to you!

Cheers,
Dave.

Dave Moss 
SELEX Communications 
Grange Road 
Christchurch 
Dorset  BH23 4JE 
United Kingdom 
Tel: + 44 (0) 1202 404841 
Email: david.r.moss at selex-comms.com 




From: Boris Kolpackov <boris at codesynthesis.com>
Sent by: xsd-users-bounces at codesynthesis.com
Date: 01/21/08 07:39 AM
To: david.r.moss at selex-comms.com
cc: xsd-users at codesynthesis.com
Subject: [xsd-users] Re: In-memory validation

Hi David,

After some more thinking and experimentation on the subject of
in-memory validation I would like to get your thoughts on our
current view of how it can be implemented and whether it will
still be useful for your project.

Your initial use-case (quoted below) calls for what I would call
"immediate detection" of validation errors. Based on this notion
of immediate detection, we can place every XML Schema validation
construct (e.g., maxOccurs/minOccurs, ordering of elements, facets,
uniqueness, etc.) into one of the three categories:

1. Desirable and possible/efficient to implement

2. Desirable and impossible/inefficient to implement

3. Undesirable

Into the "desirable and possible/efficient" category fall, for
example, enforcement of the xsd:ID uniqueness constraint, as well
as maxOccurs and the maxLength list facet.

The "desirable and impossible/inefficient" category contains the
bulk of the XML Schema validation constructs. These include most
of the facets and key/keyref/unique constructs. Let's consider
the minInclusive and maxInclusive facets from your example below.
The range checking code will have to be called after every
modification to the underlying int value. In the current
architecture you can do, for example, the following:

int& i = rt->bounded_int(); // Get a reference to the "base" type (int)
i = 100; // Impossible to detect.

This is an example of a check that is impossible to implement in the
current C++/Tree architecture. A check that is possible but inefficient
to implement is, for example, the pattern facet on xsd:string. The
pattern checking code (most likely a virtual function with quite an
expensive body) will have to be called every time a modification is
made to any single character in the underlying string.
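
The reference problem can be sketched in isolation. This is only an
illustration under stated assumptions: memory_test is a hand-written,
hypothetical stand-in for a generated class (not actual XSD output), and
the 20-80 range is taken from the example quoted below.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a generated type whose bounded_int element
// is restricted to the range 20-80 (minInclusive/maxInclusive).
class memory_test
{
public:
  explicit memory_test (int v): value_ (20) { bounded_int (v); }

  // Checked modifier: the only place an immediate range check can live.
  void bounded_int (int v)
  {
    if (v < 20 || v > 80)
      throw std::out_of_range ("bounded_int must be in [20, 80]");
    value_ = v;
  }

  // Mutable accessor: hands out a reference to the underlying int, so
  // later writes through it never pass through the check above.
  int& bounded_int () { return value_; }

private:
  int value_;
};

// Returns true if the checked modifier rejected the value.
bool modifier_rejects (memory_test& m, int v)
{
  try { m.bounded_int (v); }
  catch (const std::out_of_range&) { return true; }
  return false;
}

// Writes through the reference and returns the value now stored.
int write_through_reference (memory_test& m, int v)
{
  int& i = m.bounded_int ();
  i = v; // No check runs here.
  return m.bounded_int ();
}

// Usage: the modifier catches 100, the reference does not.
void example ()
{
  memory_test m (50);
  assert (modifier_rejects (m, 100));
  assert (write_through_reference (m, 100) == 100);
}
```

The only way to close this hole would be to stop returning a mutable
reference to the base type, which changes how the object model is used.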

Then there are a number of undesirable checks that, if enforced
immediately, would make the object model very awkward to use.
These are minOccurs, the length and minLength list facets, ordering
of elements, as well as compound keys in key/unique. The problem
with all these constraints is that you may need to perform several
operations (e.g., several push_back's for minOccurs and element
ordering or modification of several elements/attributes for
compound key/unique) before the resulting object model becomes
valid.
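
To make the minOccurs case concrete, here is a small sketch. It assumes
a hypothetical element declared with minOccurs="2" and uses a plain
std::vector in place of a generated sequence; the names are illustrative
only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical constraint: an element declared with minOccurs="2".
const std::size_t min_occurs = 2;

// True once the sequence holds at least min_occurs items.
bool satisfies_min_occurs (const std::vector<std::string>& items)
{
  return items.size () >= min_occurs;
}

// Builds the sequence one push_back at a time and records, after each
// step, whether an immediate minOccurs check would have passed.
std::vector<bool> build_and_check ()
{
  std::vector<std::string> items;
  std::vector<bool> valid_after_step;

  valid_after_step.push_back (satisfies_min_occurs (items)); // Empty.
  items.push_back ("first");
  valid_after_step.push_back (satisfies_min_occurs (items)); // One item.
  items.push_back ("second");
  valid_after_step.push_back (satisfies_min_occurs (items)); // Two items.

  return valid_after_step;
}

// Usage: only the final state satisfies the constraint.
void example ()
{
  std::vector<bool> r = build_and_check ();
  assert (!r[0] && !r[1] && r[2]);
}
```

An immediate check would have to reject the first push_back, even though
it is a necessary step towards a valid sequence.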

Based on these considerations, it appears that while the "immediate
detection" model may look appealing on the surface, in reality it is
either not practical or undesirable for the majority of XML Schema
constructs.

This brings us to the next option: "on-demand detection". The idea
is that each generated class will have the validate() function which
can be called to detect validation errors on a fragment of object
model:

rt->bounded_int( 50 );
rt->restricted_string( "abc" );
rt->validate (); // Exception is thrown if invalid.
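
A rough sketch of what such a class could look like, written by hand.
Everything here is an assumption made for illustration: memory_test,
validation_error, and the maxLength of 8 on restricted_string are
hypothetical, and none of this is existing generated code or runtime API.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type; not part of the XSD runtime.
struct validation_error: std::runtime_error
{
  explicit validation_error (const std::string& what)
    : std::runtime_error (what) {}
};

// Hand-written stand-in for a generated class.
class memory_test
{
public:
  memory_test (int i, const std::string& s)
    : bounded_int_ (i), restricted_string_ (s) {}

  // Modifiers store values without checking them.
  void bounded_int (int v) { bounded_int_ = v; }
  void restricted_string (const std::string& v) { restricted_string_ = v; }

  // On-demand check of this fragment; throws on the first violation.
  void validate () const
  {
    if (bounded_int_ < 20 || bounded_int_ > 80)
      throw validation_error ("bounded_int: value out of range [20, 80]");

    if (restricted_string_.size () > 8) // Assumed maxLength facet.
      throw validation_error ("restricted_string: longer than 8 characters");
  }

private:
  int bounded_int_;
  std::string restricted_string_;
};

// Convenience wrapper that turns the exception into a bool.
bool is_valid (const memory_test& m)
{
  try { m.validate (); }
  catch (const validation_error&) { return false; }
  return true;
}

// Usage: intermediate invalid states are allowed until validate() runs.
void example ()
{
  memory_test m (50, "abc");
  m.bounded_int (100); // Accepted silently.
  assert (!is_valid (m));
  m.bounded_int (80);
  assert (is_valid (m));
}
```

Because the modifiers never throw, the object model can pass through
invalid intermediate states, and the cost of checking is paid only when
validate() is called.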

There are, however, a number of questions about the practical
usefulness and implementation of this model:

1. How to point to the error location? Possible options: (1) a
   reference to the invalid node passed as xml_schema::type&
   (drawback: hard to know the actual type and thus to do
   anything about the error), (2) XPath identifying the error
   location (drawback: impossible to use to correct the error).

2. Some errors may be impossible for the application to correct.
   For example, if an error indicates that a string does not match
   a pattern, what is the application going to do?

3. If error correction by the application is hard/impossible then
   what is the use of in-memory validation other than to know
   whether the object model is valid/invalid?


I would therefore appreciate any feedback on these concerns as
well as on what people expect from the in-memory validation
facility in their applications.

Thanks,
Boris


david.r.moss at selex-comms.com <david.r.moss at selex-comms.com> writes:

> A while ago there was talk of in-memory validation as outlined in an
> example below.
>
> Example code:
>
> // Load will fail when content is invalid.
> auto_ptr<memory_test_t> rt( root( "in-memory-validation-test.xml" ) );
>
> // Modify with a valid value (range is 20-80 inclusive).
> rt->bounded_int( 50 );
> cout << *rt << endl;
>
> rt->bounded_int( 100 ); // Ideally, this should fail (throw exception,
>                         // as initial file load would).
> cout << *rt << endl;
>
>
> I believe this kind of capability was on your 'to do' list at one point;
> is this still the case and, if so, do you have an idea of time-scales?!
>
> Cheers,
> Dave.



