I wrote a JSON library
Sean Kelly
sean at invisibleduck.org
Tue May 7 13:50:28 PDT 2013
On Tuesday, 7 May 2013 at 20:14:20 UTC, w0rp wrote:
> On Tuesday, 7 May 2013 at 18:36:20 UTC, Sean Kelly wrote:
>
>> $ main
>> n = 1
>> Milliseconds to call stdJson() n times: 73054
>> Milliseconds to call newJson() n times: 44022
>> Milliseconds to call jepJson() n times: 839
>> newJson() is faster than stdJson() 1.66x times
>> jepJson() is faster than stdJson() 87.1x times
>
> This is very interesting. This jepJson library seems to be
> pretty fast. I imagine this library works very similar to SAX,
> so you can save quite a bit on simply not having to allocate.
Yes, the jep parser does no allocation at all--all callbacks
simply receive a slice of the value. It does full validation
according to the spec, but there's no interpretation of the
values beyond that either, so if you want the integer string you
were passed converted to an int, for example, you'd do the
conversion yourself. The same goes for unescaping of string
data, and in practice I often end up unescaping the strings
in-place since I typically never need to re-parse the input
buffer.
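
Since the callbacks only ever see slices of the input, any typed conversion happens in the callback itself, typically with std.conv. Here's a minimal sketch of that pattern, assuming a hypothetical onNumber callback (only the slice-of-the-input idea comes from the parser described above):

import std.conv : to;
import std.stdio : writeln;

// Hypothetical number callback: the parser hands over the raw digit
// characters as a slice of the input buffer; converting them to a
// numeric type is left to the callback.
void onNumber(char[] value)
{
    int n = to!int(value);
    writeln("parsed ", n);
}

void main()
{
    char[] digits = "42".dup;   // stands in for a slice the parser would pass
    onNumber(digits);
}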
In practice, it's kind of a pain to use the jep parser directly for
arbitrary processing, so I have some functions layered on top of
it that iterate across array values and object keys:
int foreachArrayElem(char[] buf, scope int delegate(char[] value));
int foreachObjectField(char[] buf, scope int delegate(char[] name, char[] value));
This works basically the same as opApply, so having the delegate
return a nonzero value causes parsing to abort and return that
value from the foreach routine. The parser is sufficiently fast
that I generally just nest calls to these foreach routines to
parse complex types, even though this results in multiple passes
across the same data.
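As an illustration, here's a hedged sketch of how those two routines might be nested to pull an array of small objects apart; only the two signatures above come from the library, while the Point type, the field names, and the delegate bodies are assumptions made up for the example:

import std.conv : to;

// Signatures quoted above; the implementations live in the library.
int foreachArrayElem(char[] buf, scope int delegate(char[] value) dg);
int foreachObjectField(char[] buf, scope int delegate(char[] name, char[] value) dg);

struct Point { int x, y; }

// Parses input shaped like [{"x":1,"y":2},{"x":3,"y":4}] by nesting the
// two foreach routines: one pass over the array and, per element, one
// pass over its fields.
Point[] parsePoints(char[] buf)
{
    Point[] result;
    foreachArrayElem(buf, (char[] obj) {
        Point p;
        foreachObjectField(obj, (char[] name, char[] value) {
            if (name == "x")      p.x = to!int(value);
            else if (name == "y") p.y = to!int(value);
            return 0;       // zero means keep iterating, as with opApply
        });
        result ~= p;
        return 0;
    });
    return result;
}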
The only other thing I was careful to do was design the library in
such a way that each parser callback can call a corresponding
writer routine that simply passes the input through to an output
buffer. This makes auto-reformatting a breeze: you just set a
"format output" flag on the writer and implement a few one-line
functions.
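None of the names below appear in the library; this is just a sketch of the pass-through idea, assuming a hypothetical writer with a formatOutput flag:

import std.array : Appender;

// Hypothetical writer: parser callbacks forward their input slices to it
// unchanged, and it inserts newlines and indentation only when asked to.
struct JsonWriter
{
    Appender!(char[]) output;
    bool formatOutput;      // the "format output" flag mentioned above
    int  depth;             // current nesting level, maintained elsewhere

    // Pass an object field straight through to the output buffer.
    void putField(char[] name, char[] value)
    {
        if (formatOutput)
        {
            output.put('\n');
            foreach (_; 0 .. depth)
                output.put("  ");
        }
        output.put('"');
        output.put(name);
        output.put(`": `);
        output.put(value);
    }
}

A parser callback for an object field would then just call putField with the slices it was given, so copying input to output and reformatting it share the same code path.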
> Before I read this, I went about creating my own benchmark.
> Here is a .zip containing the source and some nice looking bar
> charts comparing std.json, vibe.d's json library, and my own
> against various arrays of objects held in memory as a string:
>
> http://www.mediafire.com/download.php?gabsvk8ta711q4u
>
> For those less interested in downloading and looking at the
> .ods file, here are the results for the largest input size.
> (Array of 100,000 small objects)
>
> std.json - 2689375370 ms
> vibe.data.json - 2835431576 ms
> dson - 3705095251 ms
These results don't seem correct. Is this really milliseconds?