Performance of std.json

w0rp via Digitalmars-d digitalmars-d at puremagic.com
Mon Jun 2 04:36:49 PDT 2014


On Monday, 2 June 2014 at 00:39:48 UTC, Jonathan M Davis via 
Digitalmars-d wrote:
> It's my understanding that the current design of std.json is
> considered to be poor, but I haven't used it, so I don't know the
> details. But if it's as slow as you're finding it to be, then I
> think that supports the idea that it needs a redesign. The
> question then is what a new std.json should look like and who
> would do it. And that pretty much comes down to an interested and
> motivated developer coming up with and implementing a new design
> and then proposing it here. And until someone takes up that
> torch, we'll be stuck with what we have. Certainly, there's no
> fundamental reason why we can't have a lightning-fast std.json.
> With ranges and slices, parsing in D should in general be faster
> than in C/C++ (and definitely faster than in Haskell or Python),
> and if it isn't, that indicates that the implementation (if not
> the whole design) of that code needs to be redone.
>
> I know that vibe.d uses its own JSON implementation, but I don't
> know how much of that is part of its public API and how much of
> it is simply used internally: http://vibed.org
>
> - Jonathan M Davis

I implemented a JSON library myself which parses JSON and
generates JSON objects similar to how std.json does now. I wrote
it largely because of the poor API in the standard library at the
time, but I think by this point nearly all of those concerns have
been alleviated.

At the time I benchmarked it against std.json and vibe.d's
implementation, and they were all pretty equivalent in terms of
performance. I settled for edging just slightly ahead of std.json.
If there are any major performance gains to be made, I believe we
will have to completely rethink how we go about parsing JSON. I
suspect transparent character encoding and decoding (dchar ranges)
might be one potential source of trouble.

In terms of API, I wouldn't go completely for an approach based
on serialising to structs. Having a tagged union type is still
helpful for situations where you just want to quickly get at some
JSON data and do something with it. I have thought a great deal
about writing data *to* JSON strings, however, and I have an idea
for this that I would like to share.

First, you define by convention that there is a function
writeJSON which takes some value and an OutputRange, and writes
the value's JSON representation directly to that OutputRange. The
library defines writeJSON functions for the standard types:

void writeJSON(OutputRange)(JSONValue value, OutputRange outRange);
void writeJSON(OutputRange)(string value, OutputRange outRange);
void writeJSON(OutputRange)(int value, OutputRange outRange);
void writeJSON(OutputRange)(bool value, OutputRange outRange);
void writeJSON(OutputRange)(typeof(null) value, OutputRange outRange);
// ...
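
To make the convention concrete, here is a minimal sketch of what
a few of these overloads might look like, using put from
std.range.primitives and formattedWrite from std.format; the exact
signatures are only illustrative, not a finished API.

import std.format : formattedWrite;
import std.range.primitives : put;

void writeJSON(OutputRange)(bool value, OutputRange outRange) {
    put(outRange, value ? "true" : "false");
}

void writeJSON(OutputRange)(int value, OutputRange outRange) {
    // Integers need no escaping; write the decimal digits directly.
    formattedWrite(outRange, "%d", value);
}

void writeJSON(OutputRange)(string value, OutputRange outRange) {
    put(outRange, '"');
    foreach (ch; value) {
        // Minimal escaping for illustration; a real implementation
        // would also handle control characters and the other escapes.
        if (ch == '"' || ch == '\\')
            put(outRange, '\\');
        put(outRange, ch);
    }
    put(outRange, '"');
}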

You define one additional writeJSON function, which takes any
InputRange of element type T and writes a JSON array of Ts. (So
string[] will write an array of strings, int[] an array of ints,
etc.)

void writeJSON(InputRange, OutputRange)(InputRange inRange,
        OutputRange outRange) {
    // Writes a JSON array: [<elem>,<elem>, ...]
    put(outRange, '[');
    bool first = true;
    foreach(value; inRange) {
        if (!first) put(outRange, ',');
        first = false;
        writeJSON(value, outRange);
    }
    put(outRange, ']');
}

Add a convenience function which takes variadic arguments
alternating between string keys and values (string, T, string, U,
...). Call it, say, writeJSONObject.
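
Here is a minimal sketch of what writeJSONObject could look like,
built on the writeJSON overloads above; the exact signature is
only an illustration.

import std.range.primitives : put;

void writeJSONObject(OutputRange, Args...)(OutputRange outRange, Args args)
    if (Args.length % 2 == 0)
{
    put(outRange, '{');
    foreach (i, arg; args) {
        static if (i % 2 == 0) {
            // Even positions are keys: write "key" followed by ':'.
            static if (i > 0)
                put(outRange, ',');
            writeJSON(arg, outRange);
            put(outRange, ':');
        } else {
            // Odd positions are values of any type that has a
            // writeJSON overload.
            writeJSON(arg, outRange);
        }
    }
    put(outRange, '}');
}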

You now have a decent framework for writing objects directly to 
OutputRanges.

struct Foo {
    AnotherType bar;
    string stringValue;
    int intValue;
}

void writeJSON(OutputRange)(Foo foo, OutputRange outRange) {
    // Writes {"bar":<bar_value>, ... }
    writeJSONObject(outRange,
        // writeJSONObject calls writeJSON for AnotherType, etc.
        "bar", foo.bar,
        "stringValue", foo.stringValue,
        "intValue", foo.intValue
    );
}

There are more details, and something would need to be done about
handling stack overflows (inlining?), but that's the idea I had
for improving JSON writing at least. One advantage of this
approach is that it wouldn't depend on the GC, and scoped buffers
could be used. (A @nogc candidate, I think.) You can't get that
ability out of something like toJSON, which produces the whole
string at once.
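
To give an idea of the usage, here is a hedged sketch, again
assuming the overloads above and the Foo struct from earlier: the
value is streamed straight to stdout's text writer (or any other
OutputRange of characters) without building an intermediate
string.

import std.stdio : stdout;

void main() {
    auto foo = Foo(AnotherType.init, "hello", 42);

    // The JSON text goes directly into the output range; no
    // intermediate string is allocated along the way.
    writeJSON(foo, stdout.lockingTextWriter());
}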

