[xsde-users] Parsing a TCP stream

Jonathan Haws Jonathan.Haws at sdl.usu.edu
Thu Jan 16 08:30:30 EST 2014


> > I am looking for some tools that will parse a TCP stream into full XML 
> > documents. I have searched online and found that some use the 
> > boost::asio libraries.
> > 
> > I thought I would ask here what the recommended tools are.
>
> I don't think there could be a single "recommended" way to do this. It all depends on your requirements. For some applications, using raw sockets could be a perfectly reasonable approach, while others may require something as involved as ASIO. Generally, I think it has much more to do with the kind of communication you will be doing: multiple sockets or a single socket, single- or multi-threaded handling of connections, blocking/non-blocking/asynchronous I/O, etc.
>
> As a general guide, I would suggest that you keep it simple, at least at the beginning (e.g., a blocking raw socket).

My setup will be a single thread managing a single connection, most likely blocking until it receives data.  The problem I keep running into is that a single read may or may not return the whole document.  Does that narrow down the possible approaches?

My experience with TCP communication has always included some form of formatted header that gives me all the information I need to either read more data or process what I have.  Unfortunately, XML does not have that.  All it has is a closing tag for the root element, and since my schema allows several different root elements, some of which can also appear as children, simply checking for a closing tag at the end of the buffer is not reliable...
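
For contrast, here is a minimal sketch of the kind of header-based framing described above. It assumes a 4-byte big-endian length prefix in front of every message; that layout and the name LengthPrefixedFramer are hypothetical and are not part of XML or XSD/e. It only helps when both ends of the connection can agree on such a header.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical framing: each message is a 4-byte big-endian length
// followed by that many payload bytes.
class LengthPrefixedFramer {
public:
    // Append bytes exactly as a read on the socket returned them.
    void feed(const char* data, std::size_t size) { buffer_.append(data, size); }

    // Move the next complete payload into `payload`; false means "read more".
    bool next(std::string& payload) {
        const std::size_t header_size = 4;
        if (buffer_.size() < header_size) return false;  // header incomplete

        std::uint32_t length = 0;
        for (std::size_t i = 0; i < header_size; ++i)
            length = (length << 8) | static_cast<unsigned char>(buffer_[i]);

        if (buffer_.size() - header_size < length) return false;  // body incomplete

        payload.assign(buffer_, header_size, length);
        buffer_.erase(0, header_size + length);
        return true;
    }

private:
    std::string buffer_;  // bytes received but not yet handed out
};
```

The caller would loop on a blocking read, pass each chunk to feed(), and then drain next() until it returns false.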

> > Bottom line is I am receiving XML documents from a TCP socket and need 
> > to know where one document ends and another begins. Is there a way to 
> > automatically do this using things built into XSD/e?
>
> No, there is no automatic (more like auto-magical) way.

Dang, I was hoping there was some way that I could just feed it a stringstream or something similar and just have it parse to the end of the root element.
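
Without parser support, one way to approximate this is a small pre-scanner that counts element depth and cuts the stream where the depth returns to zero. Because it never looks at tag names, it does not matter which element is the root or whether the same element can also appear as a child. The sketch below is only that, a sketch: XmlDocumentSplitter is a hypothetical helper, it does not check well-formedness, it does not handle a DOCTYPE with an internal subset, and any comment or processing instruction that follows a root end tag is attached to the next document.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper, not part of XSD/e: splits a byte stream into XML
// documents by counting element depth instead of looking at tag names.
class XmlDocumentSplitter {
public:
    // Append bytes exactly as a read on the socket returned them.
    void feed(const char* data, std::size_t size) { buffer_.append(data, size); }

    // Move the next complete document into `document`; false means "read more".
    bool next(std::string& document) {
        static const char* const opens[] = {"<!--", "<![CDATA[", "<?"};
        static const char* const closes[] = {"-->", "]]>", "?>"};
        const std::size_t npos = std::string::npos;
        for (;;) {
            const std::size_t lt = buffer_.find('<', pos_);
            if (lt == npos) { pos_ = buffer_.size(); return false; }
            pos_ = lt;  // rescan from here if this construct is incomplete

            // Comments, CDATA sections and processing instructions are
            // skipped whole so that markup inside them is never counted.
            bool skipped = false;
            for (int i = 0; i < 3 && !skipped; ++i) {
                const int m = match(lt, opens[i]);
                if (m < 0) return false;  // too few bytes to classify
                if (m == 0) continue;
                const std::size_t close =
                    buffer_.find(closes[i], lt + std::strlen(opens[i]));
                if (close == npos) return false;
                pos_ = close + std::strlen(closes[i]);
                skipped = true;
            }
            if (skipped) continue;

            const std::size_t gt = find_tag_end(lt + 1);
            if (gt == npos) return false;
            pos_ = gt + 1;

            bool done = false;
            const char first = buffer_[lt + 1];
            if (first == '/') done = (--depth_ <= 0);               // end tag
            else if (first == '!') { /* e.g. DOCTYPE: depth unchanged */ }
            else if (buffer_[gt - 1] == '/') done = (depth_ == 0);  // empty element
            else ++depth_;                                          // start tag
            if (!done) continue;

            // Drop whitespace left over between documents, then cut.
            const std::size_t start = buffer_.find_first_not_of(" \t\r\n");
            document.assign(buffer_, start, gt + 1 - start);
            buffer_.erase(0, gt + 1);
            pos_ = 0;
            depth_ = 0;
            return true;
        }
    }

private:
    // 1: `token` is at `at`; 0: it is not; -1: more bytes are needed to tell.
    int match(std::size_t at, const char* token) const {
        const std::size_t len = std::strlen(token);
        const std::size_t avail = buffer_.size() - at;
        const std::size_t n = avail < len ? avail : len;
        if (buffer_.compare(at, n, token, n) != 0) return 0;
        return n == len ? 1 : -1;
    }

    // Index of the '>' that closes the tag, ignoring '>' in attribute values.
    std::size_t find_tag_end(std::size_t from) const {
        char quote = 0;
        for (std::size_t i = from; i < buffer_.size(); ++i) {
            const char c = buffer_[i];
            if (quote != 0) { if (c == quote) quote = 0; }
            else if (c == '"' || c == '\'') quote = c;
            else if (c == '>') return i;
        }
        return std::string::npos;
    }

    std::string buffer_;
    std::size_t pos_ = 0;  // scan position within buffer_
    int depth_ = 0;        // number of currently open elements
};
```

Each string returned by next() can then be handed to whatever parse call the application already uses, for example through a std::istringstream. The real parser still does all validation; the splitter only decides where to cut.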

> > If not, what is the best way to go about doing that?
>
> In XML, NULL ('\0', U+0000) is an invalid character (both in XML 1.0 and 1.1) which makes it a convenient document delimiter.
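
As an illustration of the delimiter idea, here is a minimal sketch that assumes the sender writes a single '\0' byte after every document. NulDelimitedFramer is a hypothetical name, and the approach only works if the sending side actually emits the delimiter.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: splits a byte stream into documents that the
// sender has terminated with a single '\0' byte.
class NulDelimitedFramer {
public:
    // Append bytes exactly as a read on the socket returned them.
    void feed(const char* data, std::size_t size) { buffer_.append(data, size); }

    // Move the next complete document into `document`; false means "read more".
    bool next(std::string& document) {
        const std::size_t nul = buffer_.find('\0');
        if (nul == std::string::npos) return false;  // delimiter not seen yet
        document.assign(buffer_, 0, nul);            // exclude the delimiter
        buffer_.erase(0, nul + 1);                   // consume it
        return true;
    }

private:
    std::string buffer_;  // bytes received but not yet handed out
};
```

Since '\0' cannot appear inside a well-formed XML document, the first one found in the buffer always marks a document boundary, no matter how the reads were split.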

That seems like a nice way to handle it.  However, the tools I have to verify my code against the defined schema (provided to me by the customer) don't send a NULL character, which tells me it isn't an option to base my parsing on that.  Too bad...


