[xsd-users] Using XSD to validate and process documents without
namespaces specified in top level elements
Karl Mutch
karlmutchlists at gmail.com
Thu Aug 21 00:58:53 EDT 2008
Hi,
I have an issue where I am trying to parse documents that come from a
third party. These have none of the normal namespace attributes, and I would
like to parse the input sources in such a way that they can be validated using
my own xsd files.
I have tried a large number of approaches and am now running, or rather
failing, with something similar to the following :
namespace
{
std::string xml_get_next_lane_response_schema_data("<?xml version=\"1.0\" encoding=\"utf-8\"?>\n\
<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" \
xmlns:pp=\"http://www.enterprise.com/GetNextLaneResponse\" \
xmlns=\"http://www.enterprise.com/GetNextLaneResponse\" \
targetNamespace=\"http://www.enterprise.com/GetNextLaneResponse\" \
elementFormDefault=\"unqualified\" attributeFormDefault=\"unqualified\">\n\
<xs:element name=\"request\">\n\
<xs:complexType>\n\
<xs:sequence maxOccurs=\"1\">\n\
<xs:element name=\"message\" minOccurs=\"1\" maxOccurs=\"1\">\n\
<xs:complexType>\n\
<xs:simpleContent>\n\
<xs:extension base=\"xs:string\">\n\
<xs:attribute name=\"status\" type=\"xs:string\" use=\"required\" />\n\
</xs:extension>\n\
</xs:simpleContent>\n\
</xs:complexType>\n\
</xs:element>\n\
<xs:element name=\"data\" minOccurs=\"0\" maxOccurs=\"1\">\n\
<xs:complexType>\n\
<xs:sequence>\n\
<xs:element name=\"Lane\" type=\"xs:unsignedInt\" minOccurs=\"1\" maxOccurs=\"1\" />\n\
</xs:sequence>\n\
</xs:complexType>\n\
</xs:element>\n\
</xs:sequence>\n\
<xs:attribute name=\"task\" type=\"xs:string\" use=\"required\" />\n\
</xs:complexType>\n\
</xs:element>\n\
</xs:schema>\n");
xsd::cxx::xml::string
inputGrammar(xml_get_next_lane_response_schema_data);
bool
Initialize()
{
// For performance reasons, we would like to initialize/terminate
// Xerces-C++ ourselves once instead of letting API functions do
// it potentially continuously during processing.
//
xercesc::XMLPlatformUtils::Initialize ();
return(true);
}
/* USED */
bool fInitialized = Initialize();
using namespace enterprise;
class LocalResolver : public xercesc::DOMEntityResolver
{
public:
xercesc::DOMInputSource *resolveEntity(const XMLCh* const
publicId,
const XMLCh* const systemId,
const XMLCh* const baseURI)
{
return(new xercesc::Wrapper4InputSource (new
xercesc::MemBufInputSource(reinterpret_cast<unsigned char const
*>(xml_get_next_lane_response_schema_data.c_str ()),
xml_get_next_lane_response_schema_data.size (),
"GetNextLaneResponse.xsd")));
}
};
// Throws exceptions that are expected to be handled by callers !
std::auto_ptr<enterprise::subsystem::GetNextLaneResponse::request>
ParseDocument(std::istream &inputStream)
{
using namespace xercesc;
namespace xml = xsd::cxx::xml;
namespace tree = xsd::cxx::tree;
const XMLCh ls_id [] = {chLatin_L, chLatin_S, chNull};
// Get an implementation of the Load-Store (LS) interface.
//
DOMImplementation* impl (
DOMImplementationRegistry::getDOMImplementation (ls_id));
// Create a DOMBuilder.
//
xml::dom::auto_ptr<DOMBuilder> parser (
impl->createDOMBuilder(DOMImplementationLS::MODE_SYNCHRONOUS, 0));
// Discard comment nodes in the document.
//
parser->setFeature (XMLUni::fgDOMComments, false);
// Enable datatype normalization.
//
parser->setFeature (XMLUni::fgDOMDatatypeNormalization, true);
// Do not create EntityReference nodes in the DOM tree. No
// EntityReference nodes will be created, only the nodes
// corresponding to their fully expanded substitution text
// will be created.
//
parser->setFeature (XMLUni::fgDOMEntities, false);
// Perform namespace processing.
//
parser->setFeature (XMLUni::fgDOMNamespaces, true);
// Do not include ignorable whitespace in the DOM tree.
//
parser->setFeature (XMLUni::fgDOMWhitespaceInElementContent, false);
// Enable validation.
//
parser->setFeature (XMLUni::fgDOMValidation, true);
parser->setFeature (XMLUni::fgXercesSchema, true);
parser->setFeature (XMLUni::fgXercesSchemaFullChecking, false);
parser->setProperty
(XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation,
const_cast<void*> (
static_cast<const void*> (
xml::string ("GetNextLaneResponse.xsd").c_str ())));
parser->setProperty (XMLUni::fgXercesSchemaExternalSchemaLocation,
const_cast<void*> (
static_cast<const void*> (
xml::string
("http://www.enterprise.com/GetNextLaneResponse GetNextLaneResponse.xsd").c_str
())));
// Initialize the schema cache.
// virtual Grammar* loadGrammar(const DOMInputSource& source,
// const short grammarType, const bool toCache = false) = 0;
//std::istringstream input_grammar(xml_get_next_lane_response_schema_data);
//xml::sax::std_input_source grammar_wrapper(input_grammar);
//xml::string inputGrammar(xml_get_next_lane_response_schema_data);
xercesc::MemBufInputSource grammar_wrapper
(reinterpret_cast<unsigned char const
*>(xml_get_next_lane_response_schema_data.c_str ()),
xml_get_next_lane_response_schema_data.size (), "GetNextLaneResponse.xsd");
xercesc::Wrapper4InputSource grammar_input_wrapper
(&grammar_wrapper, false);
parser->loadGrammar (grammar_input_wrapper,
Grammar::SchemaGrammarType, true);
parser->setFeature (XMLUni::fgXercesUseCachedGrammarInParse, true);
parser->set
// We will release the DOM document ourselves.
//
parser->setFeature (XMLUni::fgXercesUserAdoptsDOMDocument, true);
// Set error handler.
//
tree::error_handler<char> eh;
xml::dom::bits::error_handler_proxy<char> ehp (eh);
parser->setErrorHandler (&ehp);
// Set the entity resolver
LocalResolver localResolver;
parser->setEntityResolver(&localResolver);
// Wrap the standard input stream.
//
xml::sax::std_input_source isrc(inputStream,
"GetNextLaneResponse.xsd");
Wrapper4InputSource wrap (&isrc, false);
wrap.setSystemId(xml::transcode_to_xmlch("GetNextLaneResponse.xsd"));
// Parse XML to DOM.
//
xml::dom::auto_ptr<xercesc_2_8::DOMDocument> doc (parser->parse
(wrap));
eh.throw_if_failed<tree::parsing<char> > ();
xml_schema::properties properties;
properties.schema_location("http://www.enterprise.com/GetNextLaneResponse",
"GetNextLaneResponse.xsd");
properties.no_namespace_schema_location("GetNextLaneResponse.xsd");
// Parse DOM to the object model.
//
return(std::auto_ptr<enterprise::subsystem::GetNextLaneResponse::request>
(enterprise::subsystem::GetNextLaneResponse::request_ (
*doc, xml_schema::flags::keep_dom |
xml_schema::flags::own_dom, properties)));
} // end of ... ParseDocument(std::istream &inputStream)
};
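For reference, the external schema location property set in ParseDocument follows the same convention as the xsi:schemaLocation attribute: a whitespace-separated list of namespace and location pairs. The helper below is a minimal sketch of building such a value; make_schema_location is a hypothetical name used only for illustration and is not part of Xerces-C++ or XSD.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: join (namespace, location) pairs into one
// whitespace-separated string, the layout used by xsi:schemaLocation.
std::string make_schema_location(
    const std::vector<std::pair<std::string, std::string> >& pairs)
{
    std::string result;
    for (std::size_t i = 0; i < pairs.size(); ++i)
    {
        if (!result.empty())
            result += ' ';           // separator between consecutive pairs
        result += pairs[i].first;    // namespace URI
        result += ' ';               // separator inside one pair
        result += pairs[i].second;   // schema location
    }
    return result;
}
```

The resulting string would then be converted with xml::string and handed to setProperty in the same way as the literal in the code above.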
There is more code beyond ParseDocument, and then I push the following document
through the parser:
<?xml version="1.0" encoding="utf-8"?>
<request task="Monitor">
<message status="12">A Message</message>
</request>
It bails with the following error:
"Schema in GetNextLaneResponse.xsd has a different target namespace from the
one specified in the instance document."
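A rough illustration of the mismatch behind this message: the schema declares a targetNamespace, while the root element of the incoming document carries no xmlns attribute and so sits in no namespace. The sketch below uses naive string searching rather than a real XML parser, and the names attribute_value and namespaces_match are hypothetical; it only shows the comparison, not how Xerces-C++ performs it internally.

```cpp
#include <string>

// Return the value of attribute `name` found in `tag`, or an empty string
// when the attribute is absent. Naive search, for illustration only.
std::string attribute_value(const std::string& tag, const std::string& name)
{
    const std::string key = " " + name + "=\"";
    const std::string::size_type start = tag.find(key);
    if (start == std::string::npos)
        return "";
    const std::string::size_type begin = start + key.size();
    const std::string::size_type end = tag.find('"', begin);
    if (end == std::string::npos)
        return "";
    return tag.substr(begin, end - begin);
}

// True when the schema's targetNamespace equals the default namespace
// declared on the instance root element (both may be empty).
bool namespaces_match(const std::string& schema_tag,
                      const std::string& root_tag)
{
    return attribute_value(schema_tag, "targetNamespace") ==
           attribute_value(root_tag, "xmlns");
}
```

Applied to the xs:schema start tag and the request start tag from this post, namespaces_match would return false, since one side is http://www.enterprise.com/GetNextLaneResponse and the other is empty.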
Obviously this is because I cannot force the target namespace. So I turned off
namespace processing using:
parser->setFeature (XMLUni::fgDOMNamespaces, false);
With that I get an "Unknown element" error for the request tag.
If I know with certainty which schema applies, is there a way to make this
work and have my elements resolved without mangling the input document?
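One direction that leaves the input document untouched would be to adjust the embedded schema text instead, so that it describes a no-namespace vocabulary and can be paired with the no-namespace schema location property. The following is only a sketch under that assumption: remove_attribute and to_no_namespace_schema are hypothetical helpers, the string editing is naive, and it is not confirmed here that Xerces-C++ or the XSD-generated parsing functions accept the result (code generated from the namespaced schema may still expect the original namespace and need regenerating).

```cpp
#include <string>

// Remove the first occurrence of ` name="value"` from `text`.
// Returns the text unchanged when the attribute is not found.
std::string remove_attribute(std::string text, const std::string& name)
{
    const std::string key = " " + name + "=\"";
    const std::string::size_type start = text.find(key);
    if (start == std::string::npos)
        return text;
    const std::string::size_type close = text.find('"', start + key.size());
    if (close == std::string::npos)
        return text;
    text.erase(start, close - start + 1);  // drop the attribute and its value
    return text;
}

// Drop targetNamespace and the default xmlns declaration so that the
// declared elements end up in no namespace. Prefixed declarations such
// as xmlns:xs are left alone because their key is followed by ':'.
std::string to_no_namespace_schema(const std::string& schema)
{
    return remove_attribute(remove_attribute(schema, "targetNamespace"),
                            "xmlns");
}
```

The converted text would then be loaded with loadGrammar in place of the original string, with the third-party document itself left as received.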
Thanks
karl
P.S. I have read the following and don't seem to get any joy from them.
http://www.codesynthesis.com/pipermail/xsd-users/2007-February/000796.html
http://www.codesynthesis.com/pipermail/xsd-users/2006-September/000535.html
My intent would be to use the code from
http://wiki.codesynthesis.com/Tree/FAQ#How_do_I_specify_a_schema_location_other_than_in_an_XML_document.3F
to identify the correct schema in time.