I wrote a JSON library
Sean Kelly
sean at invisibleduck.org
Tue May 7 13:50:28 PDT 2013
On Tuesday, 7 May 2013 at 20:14:20 UTC, w0rp wrote:
> On Tuesday, 7 May 2013 at 18:36:20 UTC, Sean Kelly wrote:
>
>> $ main
>> n = 1
>> Milliseconds to call stdJson() n times: 73054
>> Milliseconds to call newJson() n times: 44022
>> Milliseconds to call jepJson() n times: 839
>> newJson() is faster than stdJson() 1.66x times
>> jepJson() is faster than stdJson() 87.1x times
>
> This is very interesting. This jepJson library seems to be
> pretty fast. I imagine this library works very similar to SAX,
> so you can save quite a bit on simply not having to allocate.
Yes, the jep parser does no allocation at all--all callbacks
simply receive a slice of the value. It does full validation
according to the spec, but there's no interpretation of the
values beyond that either, so if you want the integer string you
were passed converted to an int, for example, you'd do the
conversion yourself. The same goes for unescaping of string
data, and in practice I often end up unescaping the strings
in-place since I typically never need to re-parse the input
buffer.
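
Since the callbacks only ever see slices of the input, any typed conversion happens in the callback itself, typically with std.conv. Here's a minimal sketch of that pattern, assuming a hypothetical onNumber callback (only the slice-of-the-input idea comes from the parser described above):

import std.conv : to;
import std.stdio : writeln;

// Hypothetical number callback: the parser hands over the raw digit
// characters as a slice of the input buffer; converting them to a
// numeric type is left to the callback.
void onNumber(char[] value)
{
    int n = to!int(value);
    writeln("parsed ", n);
}

void main()
{
    char[] digits = "42".dup;   // stands in for a slice the parser would pass
    onNumber(digits);
}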
In practice, it's kind of a pain to use the jep parser directly for
arbitrary processing, so I have some functions layered on top of
it that iterate across array values and object keys:
int foreachArrayElem(char[] buf, scope int delegate(char[] value));
int foreachObjectField(char[] buf, scope int delegate(char[] name, char[] value));
This works basically the same as opApply, so having the delegate
return a nonzero value causes parsing to abort and return that
value from the foreach routine. The parser is sufficiently fast
that I generally just nest calls to these foreach routines to
parse complex types, even though this results in multiple passes
across the same data.
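As an illustration, here's a hedged sketch of how those two routines might be nested to pull an array of small objects apart; only the two signatures above come from the library, while the Point type, the field names, and the delegate bodies are assumptions made up for the example:

import std.conv : to;

// Signatures quoted above; the implementations live in the library.
int foreachArrayElem(char[] buf, scope int delegate(char[] value) dg);
int foreachObjectField(char[] buf, scope int delegate(char[] name, char[] value) dg);

struct Point { int x, y; }

// Parses input shaped like [{"x":1,"y":2},{"x":3,"y":4}] by nesting the
// two foreach routines: one pass over the array and, per element, one
// pass over its fields.
Point[] parsePoints(char[] buf)
{
    Point[] result;
    foreachArrayElem(buf, (char[] obj) {
        Point p;
        foreachObjectField(obj, (char[] name, char[] value) {
            if (name == "x")      p.x = to!int(value);
            else if (name == "y") p.y = to!int(value);
            return 0;       // zero means keep iterating, as with opApply
        });
        result ~= p;
        return 0;
    });
    return result;
}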
The only other thing I was careful to do was design the library in
such a way that each parser callback can call a corresponding
writer routine that simply passes the input through to an output
buffer. This makes auto-reformatting a breeze: you just set a
"format output" flag on the writer and implement a few one-line
functions.
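None of the names below appear in the library; this is just a sketch of the pass-through idea, assuming a hypothetical writer with a formatOutput flag:

import std.array : Appender;

// Hypothetical writer: parser callbacks forward their input slices to it
// unchanged, and it inserts newlines and indentation only when asked to.
struct JsonWriter
{
    Appender!(char[]) output;
    bool formatOutput;      // the "format output" flag mentioned above
    int  depth;             // current nesting level, maintained elsewhere

    // Pass an object field straight through to the output buffer.
    void putField(char[] name, char[] value)
    {
        if (formatOutput)
        {
            output.put('\n');
            foreach (_; 0 .. depth)
                output.put("  ");
        }
        output.put('"');
        output.put(name);
        output.put(`": `);
        output.put(value);
    }
}

A parser callback for an object field would then just call putField with the slices it was given, so copying input to output and reformatting it share the same code path.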
> Before I read this, I went about creating my own benchmark.
> Here is a .zip containing the source and some nice looking bar
> charts comparing std.json, vibe.d's json library, and my own
> against various arrays of objects held in memory as a string:
>
> http://www.mediafire.com/download.php?gabsvk8ta711q4u
>
> For those less interested in downloading and looking at the
> .ods file, here are the results for the largest input size.
> (Array of 100,000 small objects)
>
> std.json - 2689375370 ms
> vibe.data.json - 2835431576 ms
> dson - 3705095251 ms
These results don't seem correct. Is this really milliseconds?