[xsde-users] Parsing a TCP stream

Fri Jan 17 13:37:46 EST 2014

Hi Jonathan,

Jonathan Haws <Jonathan.Haws at sdl.usu.edu> writes:

> My setup will be a single thread managing a single connection, most likely
> blocking until it receives data. The problem I run up against is the fact
> that a single read may or may not get the whole document. Does that narrow
> things down any on various approaches?

Yes, you will need to somehow separate documents. That's why I suggested
using the NULL byte as a document separator. This way you will know where
one document ends and the next begins.
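
As a rough illustration of the receiving side, here is a minimal sketch
that buffers whatever each read returns and hands back only complete
documents. The document_splitter class and its feed() and partial()
functions are hypothetical names made up for this example; they are not
part of XSD/e. It also assumes an encoding such as UTF-8, in which a
zero byte cannot occur inside a document.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of XSD/e): accumulates raw bytes read from
// the socket and splits them into complete documents at each NULL byte.
class document_splitter
{
public:
  // Append the n bytes returned by a single read. Returns every document
  // completed by this chunk. A trailing partial document stays buffered
  // until a later call supplies its terminating NULL byte.
  std::vector<std::string>
  feed (const char* data, std::size_t n)
  {
    std::vector<std::string> docs;

    for (std::size_t i = 0; i != n; ++i)
    {
      if (data[i] == '\0')
      {
        // End of a document. Empty ones (two consecutive NULL bytes) are
        // skipped.
        if (!pending_.empty ())
        {
          docs.push_back (pending_);
          pending_.clear ();
        }
      }
      else
        pending_ += data[i];
    }

    return docs;
  }

  // True if the buffer holds the beginning of an unfinished document.
  bool
  partial () const
  {
    return !pending_.empty ();
  }

private:
  std::string pending_;
};
```

The reading thread would call feed() after every blocking read and pass
each returned string to the parser as a separate document.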

> Dang, I was hoping there was some way that I could just feed it a
> stringstream or something similar and just have it parse to the end
> of the root element.

The XSD/e parser (and every other XML parser that I have heard of)
operates in terms of XML documents, not "XML documents and some extra
stuff that should be ignored". In fact, if you try to parse such a
buffer with XSD/e, you will get an error saying that there is garbage
after the closing tag (which is illegal per the XML spec).

> That seems like a nice way to handle it, however the tools I have to verify
> my code against the defined schema (provided to me by the customer) don't
> send a NULL character, so that tells me that it isn't an option to base my
> parsing off that.  Too bad...

I am not sure we are on the same page here, so just to clarify: the
NULL character is not part of the XML vocabulary or schema that
describes it. Rather, it is a protocol-level delimiter that is used
to determine where one document ends and the next begins. If you have
control over the protocol, then it is very easy to implement. If the
protocol is defined by someone else, then you will have to implement
some ad-hoc (and most likely not very robust) approach of determining
where the document ends.
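
If you do control the protocol, the sending side could look something
like the sketch below. The frame_document() function is a hypothetical
name used only for illustration, and the sketch assumes the document is
already serialized in an encoding such as UTF-8, where a zero byte can
never be part of valid XML text (it would not work as-is for UTF-16).

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the bytes to write to the socket for one
// serialized document, that is, the document followed by a single NULL
// byte that acts as the protocol-level delimiter.
inline std::string
frame_document (const std::string& xml)
{
  // A NULL byte inside the document would be mistaken for the delimiter
  // by the receiver, so refuse to send such a buffer.
  if (xml.find ('\0') != std::string::npos)
    throw std::invalid_argument ("document contains a NULL byte");

  std::string r (xml);
  r += '\0';
  return r;
}
```

The delimiter travels next to the document rather than inside it, so
the XML vocabulary and the schema stay exactly as they are.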

Boris