[xsd-users] Binary serialization performance in version 4.0.0 vs 3.3.0

svetlana.samsonik at thomsonreuters.com svetlana.samsonik at thomsonreuters.com
Mon Oct 20 17:59:25 EDT 2014


Hi Boris,



I ran the tests in the 'performance' example and received very similar results.

My own tests are slightly different. I changed the example code to mimic my logic and received results similar to yours for parsing and serialization.



When I run those tests against our "real" schema (which is complicated) with a 25K file, I get the following results:



                                      version 3.3.0   version 4.0.0 (built with --std c++11)
                                      microseconds    microseconds

Parse XML to C++/Tree object:         4719            5420
Serialize C++/Tree object to XML:     5303            5735
Serialize C++/Tree object to XDR:     299             329
Copy from XDR to C++/Tree object:     639             1405

Binary representation size:           32688           30108





Overall, the results show some increase in time in version 4.0 for XML parsing/serialization and for binary serialization to XDR.
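
Per-operation timings of this kind can be collected with a small harness. The following is only a minimal sketch, assuming C++11 and std::chrono::steady_clock; the function name average_microseconds is hypothetical, and the test code actually used for the numbers in this message is not shown, so this is not necessarily how they were obtained.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Call f() `iterations` times and return the average wall-clock
// time per call in microseconds. (Hypothetical helper, not part of XSD.)
template <typename F>
double average_microseconds(F f, std::size_t iterations)
{
  assert(iterations > 0);

  typedef std::chrono::steady_clock clock;

  clock::time_point start = clock::now();
  for (std::size_t i = 0; i != iterations; ++i)
    f();
  clock::time_point end = clock::now();

  // Convert the elapsed time to a floating-point microsecond count.
  std::chrono::duration<double, std::micro> total = end - start;
  return total.count() / iterations;
}
```

Averaging over many iterations reduces the effect of timer resolution when a single operation takes only a few hundred microseconds.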



My biggest concern is "Copy from XDR to C++/Tree object": in version 4.0.0 it takes about 4 times as long as "Serialize C++/Tree object to XDR", whereas in version 3.3.0 it takes only about 2 times as long.
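
The ratios follow directly from the table. A minimal sketch of the arithmetic, with illustrative helper names that are not part of XSD:

```cpp
#include <cassert>

// Ratio of two timings; both must be in the same unit.
double ratio(double numerator_us, double denominator_us)
{
  assert(denominator_us > 0.0);
  return numerator_us / denominator_us;
}

// Relative change from the old timing to the new one, in percent.
double percent_change(double old_us, double new_us)
{
  assert(old_us > 0.0);
  return (new_us - old_us) / old_us * 100.0;
}
```

With the table values, ratio(1405, 329) is roughly 4.3 for version 4.0.0 and ratio(639, 299) is roughly 2.1 for version 3.3.0, while percent_change(639, 1405) shows the copy from XDR taking about 120% longer in 4.0.0 than in 3.3.0.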



I ran valgrind --tool=callgrind for both tests, and you can see that the same function (loadFromXDR) has a much higher "Ir per call" in version 4.0 (first picture) than in version 3.3 (second picture).



Do you know what can cause such behavior?



Thank you,

Svetlana










-----Original Message-----
From: Boris Kolpackov [mailto:boris at codesynthesis.com]
Sent: Wednesday, October 15, 2014 11:32 PM
To: Samsonik, Svetlana (TR Technology)
Cc: xsd-users at codesynthesis.com
Subject: Re: [xsd-users] Binary serialization performance in version 4.0.0 vs 3.3.0



Hi Svetlana,



svetlana.samsonik at thomsonreuters.com <svetlana.samsonik at thomsonreuters.com> writes:



> After upgrading to version 4.0 there is a performance difference for
> parsing and binary serialization (XDR).
>
> [...]
>
> Is the difference expected?



We check for parsing/serialization performance regressions before every release and there weren't any between 3.3.0 and 4.0.0. We haven't tested binary serialization but I wouldn't expect any difference there since it stayed relatively unchanged.



There is the 'performance' example that measures the parsing and serialization performance. Here are the numbers I get for a 50k file:



./driver-3.3 test-50k.xml
parsing:
  document size:  51324 bytes
  iterations:     1000
  time:           3.628948000 sec
  throughput:     275.562 documents/sec
  throughput:     13.4878 MBytes/sec
serialization:
  document size:  51324 bytes
  iterations:     1000
  time:           4.977265000 sec
  throughput:     200.914 documents/sec
  throughput:     9.83399 MBytes/sec



./driver-4.0 test-50k.xml
parsing:
  document size:  51324 bytes
  iterations:     1000
  time:           3.686281000 sec
  throughput:     271.276 documents/sec
  throughput:     13.278 MBytes/sec
serialization:
  document size:  51324 bytes
  iterations:     1000
  time:           5.215050000 sec
  throughput:     191.753 documents/sec
  throughput:     9.3856 MBytes/sec



For a 500k file:



./driver-3.3 test-500k.xml
parsing:
  document size:  512146 bytes
  iterations:     1000
  time:           38.327591000 sec
  throughput:     26.0909 documents/sec
  throughput:     12.7433 MBytes/sec
serialization:
  document size:  512146 bytes
  iterations:     1000
  time:           51.091402000 sec
  throughput:     19.5728 documents/sec
  throughput:     9.55974 MBytes/sec



./driver-4.0 test-500k.xml
parsing:
  document size:  512146 bytes
  iterations:     1000
  time:           39.594913000 sec
  throughput:     25.2558 documents/sec
  throughput:     12.3354 MBytes/sec
serialization:
  document size:  512146 bytes
  iterations:     1000
  time:           50.190411000 sec
  throughput:     19.9241 documents/sec
  throughput:     9.73135 MBytes/sec



This is on my Linux box compiled with GCC/-O3 and using Xerces-C++ 3.1.1.
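
The throughput figures above can be reproduced from the document size, iteration count, and elapsed time. A minimal sketch of that arithmetic, assuming 1 MByte = 1024 * 1024 bytes (which is consistent with the numbers reported above); the struct and function names are illustrative and not taken from the 'performance' example:

```cpp
#include <cassert>
#include <cstddef>

// The two throughput figures reported for each run.
struct throughput
{
  double documents_per_sec;
  double mbytes_per_sec;
};

// Derive both figures from the document size in bytes, the number of
// iterations, and the total elapsed time in seconds.
// Assumption: 1 MByte = 1024 * 1024 bytes.
throughput compute_throughput(std::size_t document_size_bytes,
                              std::size_t iterations,
                              double seconds)
{
  assert(seconds > 0.0);

  throughput r;
  r.documents_per_sec = iterations / seconds;
  r.mbytes_per_sec =
    static_cast<double>(document_size_bytes) * iterations
    / seconds / (1024.0 * 1024.0);
  return r;
}
```

For example, compute_throughput(51324, 1000, 3.628948) gives roughly 275.56 documents/sec and 13.49 MBytes/sec, matching the 3.3 parsing run for the 50k file.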



Could you run this test in your configuration and see if you get a similar picture?



Boris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 602723 bytes
Desc: image001.png
Url : http://codesynthesis.com/pipermail/xsd-users/attachments/20141020/5531bee9/image001-0001.png
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 587989 bytes
Desc: image002.png
Url : http://codesynthesis.com/pipermail/xsd-users/attachments/20141020/5531bee9/image002-0001.png

