parseJSON bug

Johannes Pfau nospam at example.com
Thu Aug 8 15:19:22 PDT 2013


On Thu, 08 Aug 2013 22:15:28 +0200,
"Tofu Ninja" <emmons0 at purdue.edu> wrote:

> On Thursday, 8 August 2013 at 18:31:52 UTC, David wrote:
> > On 08.08.2013 20:24, Adam D. Ruppe wrote:
> >> On Thursday, 8 August 2013 at 17:33:38 UTC, David wrote:
> >>> I made a pull request improving the API a few weeks ago,
> >>> no one seems to really care.
> >> 
> >> Phobos needs a new dictator.
> >
> > Either that, or I will soon start my own standard lib and stop 
> > caring
> > about phobos. I think, I even know a few who would help me...
> 
> It is really bad that people are actually talking about starting 
> their own standard lib, I wasn't around for the whole phobos vs 
> tango thing but from what I hear, it wasn't pretty. If there are 
> problems with phobos or the way it's managed, I feel like we 
> should try and fix them and not try to replace it.

I'm sorry that you've had such bad experiences when contributing to
phobos. I think this is less of a phobos than a std.json problem. I'd
like to explain the special issue we have with std.json - as far as I
understand it:

std.json is a very old module:
* Its API is not up to current phobos standards
* there are some - probably serious - bugs
* no interaction with ranges
* doesn't use @safe, const, nothrow IIRC
* inherently unsafe API (accessing the union part)
* could be (should be) much faster
* the original author is no longer around. AFAICS there's nobody feeling
  responsible for this module.

The last point is probably the biggest problem. Most people here would
probably like to see a complete replacement - std.json2. Some of us
have argued that we should remove std.json (and std.xml) from phobos
ASAP even if there's no replacement as this code is really not what we
want in the standard lib.

Others are strongly against any API breakage (especially without
replacement) and therefore std.json is still there. But this makes
things even more difficult: Improving std.json means we'd have both API
breakage and a sub-optimal std.json. I guess nobody wants to merge
changes to std.json because 1) they feel std.json has to be replaced
anyway and improving the old design is wasting time 2) they don't want
to be blamed for any possible API breakage.

There were some discussions about what a std.json replacement should
look like, I'll try to reiterate the main points:

=== Input API: ===
1) A pull-parser/tokenizer/lexer
   * Should implement an InputRange API with ElementType JSONToken.
   * Should be as fast as possible
   * shouldn't allocate memory
   * shouldn't even convert strings to numbers. Instead JSONToken
     should have a 'type' field (ObjectStart, ArrayStart, String,
     Number, ...) and a value field which is always a (raw!) w/d/string.
     A templated T get!(JSONType) helper which could also verify the
     type would be useful. Any decoding of JSON strings should be done
     in the get function. Get may optionally use some caching.
   * It should be specialized for input arrays
     (const/immutable/mutable) of char/wchar/dchar which are completely
     in-memory and use slicing in these cases.
   * It should work with arbitrary ranges of w/d/char. To avoid memory
     allocation a fixed-length internal buffer should be used and token
     values should be slices of that buffer. (It would be good if the
     user can query whether values are slices of the original input
     which can be reused, or slices of the internal buffer which have to
     be .dup'ed to keep them)
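
To make point 1) a bit more concrete, here is a rough sketch of such a
tokenizer for the simplest case only: a string that is completely in
memory. All names (JSONType, JSONToken, JSONTokenizer, emit,
isSeparator) are made up for illustration and are not an agreed API.
It does not validate the grammar, does not decode escape sequences and
does not handle arbitrary ranges or the internal buffer.

```d
import std.conv : to;
import std.exception : enforce;
import std.range.primitives : isInputRange;

// All names below are illustrative, not an agreed-upon API.
enum JSONType { objectStart, objectEnd, arrayStart, arrayEnd, str, num, literal }

struct JSONToken
{
    JSONType type;
    string raw; // always a slice of the input, never a copy

    // Conversion happens only on request and checks the token type first.
    T get(T)()
    {
        static if (is(T == string))
        {
            enforce(type == JSONType.str, "token is not a string");
            return raw; // sketch: escape sequences are left undecoded
        }
        else
        {
            enforce(type == JSONType.num, "token is not a number");
            return raw.to!T;
        }
    }
}

struct JSONTokenizer
{
    private string rest;   // input that has not been consumed yet
    private JSONToken tok; // current token
    private bool done;

    this(string json) { rest = json; popFront(); }

    @property bool empty() { return done; }
    @property JSONToken front() { return tok; }

    void popFront()
    {
        // Skip whitespace and the ',' / ':' separators; no grammar checks here.
        size_t i = 0;
        while (i < rest.length && isSeparator(rest[i])) ++i;
        rest = rest[i .. $];
        if (rest.length == 0) { done = true; return; }

        immutable c = rest[0];
        if (c == '{') emit(JSONType.objectStart, 0, 1, 1);
        else if (c == '}') emit(JSONType.objectEnd, 0, 1, 1);
        else if (c == '[') emit(JSONType.arrayStart, 0, 1, 1);
        else if (c == ']') emit(JSONType.arrayEnd, 0, 1, 1);
        else if (c == '"')
        {
            size_t end = 1;
            while (end < rest.length && rest[end] != '"')
                end += (rest[end] == '\\') ? 2 : 1; // step over escaped characters
            enforce(end < rest.length, "unterminated string");
            emit(JSONType.str, 1, end, end + 1); // value excludes the quotes
        }
        else
        {
            // Numbers and true/false/null run up to the next delimiter.
            size_t end = 1;
            while (end < rest.length && !isSeparator(rest[end])
                   && rest[end] != '}' && rest[end] != ']') ++end;
            immutable isNum = c == '-' || (c >= '0' && c <= '9');
            emit(isNum ? JSONType.num : JSONType.literal, 0, end, end);
        }
    }

    // Store rest[from .. stop] as the current token, then drop `consumed` chars.
    private void emit(JSONType type, size_t from, size_t stop, size_t consumed)
    {
        tok = JSONToken(type, rest[from .. stop]);
        rest = rest[consumed .. $];
    }
}

private bool isSeparator(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == ',' || c == ':';
}

static assert(isInputRange!JSONTokenizer);
```

Because every token value is a slice of the input, a plain
foreach (t; JSONTokenizer(json)) walks the document without allocating;
numbers are only converted when the user calls t.get!int or similar.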

2) A DOM-API
   * Should be based on 1)
   * Same principle as current std.json, but in modern D
   * API similar to std.variant, DYAML

optional, nice-to-have:

3) A Sax-style API (based on 1)

4) A simple deserialization API
   * basically T fromJSON(T, CHAR)(CHAR input) (supporting all
     inputs supported by 1)
   * Shouldn't allocate any memory
   * Optionally skipping fields which are in the JSON data but not part
     of T and the other way round.
   * Usage: auto artist = fromJSON!Artist(`{"name": "", "songs":42}`);
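
The field-mapping part of 4) could look roughly like the sketch below.
Note the big assumption: to keep it short it parses with the existing
std.json DOM (parseJSON), so it does allocate and is not built on a
pull parser as proposed in 1). Artist is a hypothetical type, and only
string and integer fields are handled.

```d
import std.json : JSONValue, parseJSON;
import std.traits : isIntegral, isSomeString;

// Hypothetical target type, modelled on the usage line above.
struct Artist
{
    string name;
    long songs;
}

T fromJSON(T)(string input) if (is(T == struct))
{
    // Assumption: parse via the existing std.json DOM, which allocates.
    JSONValue root = parseJSON(input);
    T result;
    foreach (i, ref field; result.tupleof)
    {
        // Name of the i-th struct field, known at compile time.
        enum name = __traits(identifier, T.tupleof[i]);
        // Fields missing from the JSON keep their .init value;
        // JSON members without a matching field are ignored.
        if (auto p = name in root)
        {
            alias F = typeof(field);
            static if (isSomeString!F) field = p.str;
            else static if (isIntegral!F) field = cast(F) p.integer;
            else static assert(0, "unsupported field type " ~ F.stringof);
        }
    }
    return result;
}
```

With this, fromJSON!Artist(`{"name": "", "songs": 42}`) fills both
fields; a type mismatch in the JSON data surfaces as the exception that
std.json throws from .str or .integer.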


=== Output API: ===
1) An OutputRange of JSONTokens
   * Shouldn't allocate
   * Should work on top of other OutputRanges
   * should accept the tokens from the pull-parser, so we can do
     copy(JSONTokenizer(json), JSONWriter(stdout));
   * Additionally provide an easier user API: JSONWriter.startObject(),
     JSONWriter.startArray(), JSONWriter.writeField(name, T),...
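
A minimal sketch of the "easier user API" part, assuming the writer
simply wraps any output range of characters. JSONWriter, jsonWriter and
the method names follow the wording above but are otherwise
hypothetical. The sketch only covers flat objects, does not escape
strings and does not accept JSONTokens.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.range.primitives : put;

// Hypothetical writer layered on top of an arbitrary character output range.
struct JSONWriter(Sink)
{
    Sink sink;
    private bool needComma; // true once a sibling value has been written

    void startObject()
    {
        separator();
        put(sink, '{');
        needComma = false;
    }

    void endObject()
    {
        put(sink, '}');
        needComma = true;
    }

    void writeField(T)(string name, T value)
    {
        separator();
        formattedWrite(sink, `"%s":`, name);
        // Sketch: strings are quoted but not escaped.
        static if (is(T : const(char)[]))
            formattedWrite(sink, `"%s"`, value);
        else
            formattedWrite(sink, "%s", value);
        needComma = true;
    }

    private void separator()
    {
        if (needComma) put(sink, ',');
    }
}

// Convenience constructor so the sink type is inferred.
auto jsonWriter(Sink)(Sink sink)
{
    return JSONWriter!Sink(sink);
}
```

The writer itself keeps no buffer; whether anything allocates depends
only on the sink (an Appender does, a fixed buffer would not).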

2) DOM API
   * It should of course be possible to manipulate the DOM, then write
     it back
   * Should be the same as the input DOM API
   * Should be based on 2)

optional, nice-to-have:
3) A simple serialization API
   * counterpart to the deserialization API
   * Shouldn't allocate any memory
   * Based on output-ranges and 1)
   * Usage: myvariable.toJSON(stdout);
   * Helper function using Appender: string json = myvariable.toJSON();
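
The serialization side could be sketched along these lines, again under
stated assumptions: Artist is a hypothetical type, only flat structs
are supported, strings are not escaped, and the sink is any output
range of characters (for stdout that would be something like
stdout.lockingTextWriter()).

```d
import std.array : appender;
import std.format : formattedWrite;
import std.range.primitives : put;

// Hypothetical example type.
struct Artist
{
    string name;
    long songs;
}

// Write the fields of a flat struct as a JSON object into an output range.
void toJSON(T, Sink)(T value, auto ref Sink sink) if (is(T == struct))
{
    put(sink, '{');
    foreach (i, field; value.tupleof)
    {
        static if (i > 0) put(sink, ',');
        formattedWrite(sink, `"%s":`, __traits(identifier, T.tupleof[i]));
        // Sketch: strings are quoted but not escaped.
        static if (is(typeof(field) : const(char)[]))
            formattedWrite(sink, `"%s"`, field);
        else
            formattedWrite(sink, "%s", field);
    }
    put(sink, '}');
}

// Helper using Appender; this variant allocates the result string.
string toJSON(T)(T value) if (is(T == struct))
{
    auto buf = appender!string();
    toJSON(value, buf);
    return buf.data;
}
```

Only the Appender-based helper allocates; the range-based overload
leaves that decision to whatever sink the caller passes in.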

Another question is if and how this should interact with streams, if
we ever get them.

This list is quite long. As you see the demands are quite high and
therefore it will be a difficult and time-consuming task to write a
std.json2 which pleases everyone. Of course it's possible we're
overthinking this.

I don't want to make any particular point here whether std.json should
be removed, partially updated & fixed or simply made deprecated and
left alone. I'm just trying to give some background information on this
issue as far as I perceive it.

BTW: Because of these issues there already are alternative libraries
implementing JSON stuff. Apart from vibe.d there's also ae.util.json
and some more. Fortunately as long as nobody decides to fork druntime
there's no real problem. Other languages (C/C++) don't have JSON in the
standard library either. But still please don't let this turn you away
from phobos in general - improvements to other modules are usually
handled in a much better way.
