Performance of std.json
w0rp via Digitalmars-d
digitalmars-d at puremagic.com
Mon Jun 2 04:36:49 PDT 2014
On Monday, 2 June 2014 at 00:39:48 UTC, Jonathan M Davis via
Digitalmars-d wrote:
> It's my understanding that the current design of std.json is
> considered to be poor, but I haven't used it, so I don't know any of
> the details. But if it's as slow as you're finding to be the case,
> then I think that that supports the idea that it needs a redesign.
> The question then is what a new std.json should look like and who
> would do it. And that pretty much comes down to an interested and
> motivated developer coming up with and implementing a new design and
> then proposing it here. And until someone takes up that torch, we'll
> be stuck with what we have. Certainly, there's no fundamental reason
> why we can't have a lightning fast std.json. With ranges and slices,
> parsing in D in general should be faster than C/C++ (and definitely
> faster than Haskell or Python), and if it isn't, that indicates that
> the implementation (if not the whole design) of that code needs to
> be redone.
>
> I know that vibe.d uses its own json implementation, but I don't
> know how much of that is part of its public API and how much of that
> is simply used internally: http://vibed.org
>
> - Jonathan M Davis
I implemented a JSON library myself which parses JSON and
generates JSON objects similar to how std.json does now. I wrote
it largely because of the poor API in the standard library at the
time, but I think by this point nearly all of those concerns have
been alleviated.
At the time I benchmarked it against std.json and vibe.d's
implementation, and they were all pretty equivalent in terms of
performance. I settled for edging just slightly ahead of
std.json. If there are any major performance gains to be made, I
believe we will have to completely rethink how we go about
parsing JSON. I suspect transparent character encoding and
decoding (dchar ranges) might be one potential source of trouble.
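
To make the dchar point concrete, here is a small sketch of where the decoding happens. It is not a benchmark and says nothing about how much time is actually lost. It only shows that range primitives see a string as dchar code points, while std.utf.byCodeUnit (from current Phobos) walks the raw UTF-8 code units. The helper names countCodePoints and countCodeUnits are made up for this example.

```d
import std.range.primitives : ElementType, walkLength;
import std.utf : byCodeUnit;

// Range primitives auto-decode narrow strings to dchar code points.
static assert(is(ElementType!string == dchar));

// Counts code points; each step decodes UTF-8.
size_t countCodePoints(string s)
{
    return s.walkLength;
}

// Counts raw UTF-8 code units; no decoding takes place.
size_t countCodeUnits(string s)
{
    return s.byCodeUnit.walkLength;
}

void main()
{
    // The last character takes two UTF-8 code units.
    string text = "caf\u00e9";
    assert(countCodePoints(text) == 4);
    assert(countCodeUnits(text) == 5);
}
```

A parser that only needs to look for ASCII structural characters such as braces, quotes and commas could, in principle, work on code units and skip the decoding step.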
In terms of API, I wouldn't go completely for an approach based
on serialising to structs. Having a tagged union type is still
helpful for situations where you just want to quickly get at some
JSON data and do something with it. I have thought a great deal
about writing data *to* JSON strings however, and I have an idea
for this I would like to share.
First, you define by convention that there is a function
writeJSON which takes some value and an OutputRange, and then
writes the value in a JSON representation directly to an
OutputRange. You define in the library writeJSON functions for
standard types.
void writeJSON(OutputRange)(JSONValue value, OutputRange outRange);
void writeJSON(OutputRange)(string value, OutputRange outRange);
void writeJSON(OutputRange)(int value, OutputRange outRange);
void writeJSON(OutputRange)(bool value, OutputRange outRange);
void writeJSON(OutputRange)(typeof(null) value, OutputRange outRange);
// ...
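
A minimal sketch of how a few of these overloads might be written, under some assumptions of mine: the sink is taken by ref so struct-based output ranges are not copied, the string escaping covers only quotes, backslashes and control characters, and the JSONValue overload is left out. The render helper exists only to make the example easy to check.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.range.primitives : put;

// Hypothetical overloads following the convention described above.
void writeJSON(OutputRange)(typeof(null), ref OutputRange outRange)
{
    put(outRange, "null");
}

void writeJSON(OutputRange)(bool value, ref OutputRange outRange)
{
    put(outRange, value ? "true" : "false");
}

void writeJSON(OutputRange)(int value, ref OutputRange outRange)
{
    formattedWrite(outRange, "%d", value);
}

void writeJSON(OutputRange)(string value, ref OutputRange outRange)
{
    put(outRange, '"');
    // Iterate over code units; non-ASCII bytes pass through unchanged.
    foreach (char c; value)
    {
        if (c == '"' || c == '\\')
        {
            put(outRange, '\\');
            put(outRange, c);
        }
        else if (c < 0x20)
            formattedWrite(outRange, "\\u%04x", cast(uint) c);
        else
            put(outRange, c);
    }
    put(outRange, '"');
}

// Test helper (not part of the proposal): collect the output in a string.
string render(T)(T value)
{
    auto app = appender!string();
    writeJSON(value, app);
    return app.data;
}

void main()
{
    assert(render(42) == "42");
    assert(render(true) == "true");
    assert(render(null) == "null");
    assert(render(`say "hi"`) == `"say \"hi\""`);
}
```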
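You define one additional writeJSON function, which takes any
InputRange with element type T and writes a JSON array of Ts. (So
string[] will write an array of strings, int[] will write an array
of ints, etc.)
void writeJSON(InputRange, OutputRange)(InputRange inRange,
OutputRange outRange) {
    outRange.put('[');
    bool first = true;
    foreach(value; inRange) {
        // Separate elements with commas.
        if (!first) outRange.put(',');
        first = false;
        writeJSON(value, outRange);
    }
    outRange.put(']');
}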
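
Here is a compilable sketch of the range overload. A template constraint is added so that it does not collide with the string overload (strings are input ranges too); that constraint, the ref sink and the render helper are my assumptions for the example. Only an int overload is included to keep it short.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.range : iota;
import std.range.primitives : isInputRange, put;
import std.traits : isSomeString;

void writeJSON(OutputRange)(int value, ref OutputRange outRange)
{
    formattedWrite(outRange, "%d", value);
}

// Writes any non-string input range as a JSON array.
void writeJSON(InputRange, OutputRange)(InputRange inRange, ref OutputRange outRange)
if (isInputRange!InputRange && !isSomeString!InputRange)
{
    put(outRange, '[');
    bool first = true;
    foreach (value; inRange)
    {
        if (!first) put(outRange, ',');
        first = false;
        // Resolves to the overload for the element type, so nesting works.
        writeJSON(value, outRange);
    }
    put(outRange, ']');
}

// Test helper (not part of the proposal): collect the output in a string.
string render(T)(T value)
{
    auto app = appender!string();
    writeJSON(value, app);
    return app.data;
}

void main()
{
    assert(render([1, 2, 3]) == "[1,2,3]");
    assert(render([[1], [2, 3]]) == "[[1],[2,3]]");
    assert(render(iota(3)) == "[0,1,2]");
}
```

Because the function accepts lazy ranges such as iota, the array never has to exist in memory before it is written.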
Add a convenience function which takes the OutputRange followed by
variadic arguments alternating between keys and values (string, T,
string, U, ...). Call it, say, writeJSONObject.
You now have a decent framework for writing objects directly to
OutputRanges.
struct Foo {
    AnotherType bar;
    string stringValue;
    int intValue;
}
void writeJSON(OutputRange)(Foo foo, OutputRange outRange) {
    // Writes {"bar":<bar_value>, ... }
    writeJSONObject(outRange,
        // writeJSONObject calls writeJSON for AnotherType, etc.
        "bar", foo.bar,
        "stringValue", foo.stringValue,
        "intValue", foo.intValue
    );
}
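
To show the pieces fitting together, here is one way writeJSONObject could be sketched with a variadic template. The body of writeJSONObject, the contents of AnotherType, the ref sink and the render helper are all assumptions made for the example, and the string overload skips escaping to stay short.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.range.primitives : put;

void writeJSON(OutputRange)(int value, ref OutputRange outRange)
{
    formattedWrite(outRange, "%d", value);
}

// No escaping here, to keep the sketch short; real code must escape.
void writeJSON(OutputRange)(string value, ref OutputRange outRange)
{
    put(outRange, '"');
    put(outRange, value);
    put(outRange, '"');
}

// Takes alternating keys and values: "key1", value1, "key2", value2, ...
void writeJSONObject(OutputRange, Args...)(ref OutputRange outRange, Args args)
if (Args.length % 2 == 0)
{
    put(outRange, '{');
    foreach (i, arg; args)
    {
        static if (i % 2 == 0)
        {
            static assert(is(Args[i] : string), "keys must be strings");
            static if (i > 0) put(outRange, ',');
            writeJSON(arg, outRange); // the key, as a JSON string
            put(outRange, ':');
        }
        else
        {
            writeJSON(arg, outRange); // the value, via its own overload
        }
    }
    put(outRange, '}');
}

// Hypothetical stand-in for AnotherType.
struct AnotherType { int id; }

struct Foo
{
    AnotherType bar;
    string stringValue;
    int intValue;
}

void writeJSON(OutputRange)(AnotherType value, ref OutputRange outRange)
{
    writeJSONObject(outRange, "id", value.id);
}

void writeJSON(OutputRange)(Foo foo, ref OutputRange outRange)
{
    writeJSONObject(outRange,
        "bar", foo.bar,
        "stringValue", foo.stringValue,
        "intValue", foo.intValue);
}

// Test helper (not part of the proposal): collect the output in a string.
string render(T)(T value)
{
    auto app = appender!string();
    writeJSON(value, app);
    return app.data;
}

void main()
{
    auto foo = Foo(AnotherType(7), "hi", 42);
    assert(render(foo) == `{"bar":{"id":7},"stringValue":"hi","intValue":42}`);
}
```

The argument loop is unrolled at compile time, so each value is dispatched statically to the writeJSON overload for its type.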
There are more details, and something would need to be done for
handling stack overflows (inlining?), but there's the idea that I
had for improving writing JSON at least. One advantage of this
approach would be that it wouldn't be dependent on the GC, and
scoped buffers could be used. (A @nogc candidate, I think.) You
can't get this ability out of something like toJSON, which
produces the whole string at once.
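
As a rough illustration of the @nogc angle, here is a sketch that writes an int array into a caller-provided buffer without touching the GC. FixedSink, its overflow flag and the hand-rolled integer formatting are all invented for this example; the integer is formatted by hand because I did not want to assume that std.format is usable from @nogc code.

```d
// A hypothetical output range over caller-provided storage.
struct FixedSink
{
    char[] buffer;   // e.g. a slice of a stack array
    size_t used;
    bool overflowed; // set when output did not fit; extra output is dropped

    void put(char c) @nogc nothrow
    {
        if (used < buffer.length) buffer[used++] = c;
        else overflowed = true;
    }

    void put(const(char)[] s) @nogc nothrow
    {
        foreach (char c; s) put(c);
    }

    const(char)[] data() const @nogc nothrow { return buffer[0 .. used]; }
}

void writeJSON(OutputRange)(int value, ref OutputRange outRange)
{
    char[11] digits; // enough for "-2147483648"
    size_t pos = digits.length;
    long n = value;  // long, so that int.min can be negated safely
    immutable negative = n < 0;
    if (negative) n = -n;
    do
    {
        digits[--pos] = cast(char)('0' + n % 10);
        n /= 10;
    } while (n != 0);
    if (negative) digits[--pos] = '-';
    outRange.put(digits[pos .. $]);
}

void writeJSON(OutputRange)(const(int)[] values, ref OutputRange outRange)
{
    outRange.put('[');
    foreach (i, v; values)
    {
        if (i > 0) outRange.put(',');
        writeJSON(v, outRange);
    }
    outRange.put(']');
}

// The attributes are checked by the compiler: nothing here may use the GC.
const(char)[] demo(char[] storage) @nogc nothrow
{
    auto sink = FixedSink(storage);
    int[3] values = [1, -20, 300];
    writeJSON(values[], sink);
    return sink.data;
}

void main()
{
    char[64] storage;
    assert(demo(storage[]) == "[1,-20,300]");
}
```

The writeJSON templates carry no attributes themselves. The compiler infers @nogc and nothrow for each instantiation from what the sink's put does, so the same functions still work with a GC-backed sink.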