std.data.json formal review
Sönke Ludwig via Digitalmars-d
digitalmars-d at puremagic.com
Sat Aug 15 03:18:24 PDT 2015
On 14.08.2015 at 10:17, Walter Bright wrote:
> On 8/13/2015 11:52 PM, Sönke Ludwig wrote:
>> On 14.08.2015 at 02:26, Walter Bright wrote:
>>> On 8/13/2015 3:51 AM, Sönke Ludwig wrote:
>>>> These were, AFAICS, the only major open issues (a decision for an
>>>> opt() variant
>>>> would be nice, but fortunately that's not a fundamental decision in
>>>> any way).
>>>
>>> 1. What about the issue of having the API be a composable range
>>> interface?
>>>
>>> http://s-ludwig.github.io/std_data_json/stdx/data/json/lexer/lexJSON.html
>>>
>>>
>>> I.e. the input range should be the FIRST argument, not the last.
>>
>> Hm, it *is* the first function argument, just the last template argument.
>
> Ok, my mistake. I didn't look at the others.
>
> I don't know what 'isStringInputRange' is. Whatever it is, it should be
> a 'range of char'.
I'll rename it to isCharInputRange. We don't have something like that in
Phobos, right?
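
For illustration, a minimal sketch of what such a constraint could look like. The name isCharInputRange comes from the text above; the definition itself is an assumption and not the code that ended up in std.data.json. It relies only on isInputRange and ElementEncodingType from std.range.primitives and on Unqual from std.traits.

```d
import std.range.primitives : isInputRange, ElementEncodingType;
import std.traits : Unqual;

// Hypothetical trait (assumed definition): true for any input range whose
// code units are `char`. ElementEncodingType is used instead of ElementType
// so that narrow strings count as ranges of char, not of decoded dchar.
enum isCharInputRange(R) =
    isInputRange!R && is(Unqual!(ElementEncodingType!R) == char);

unittest
{
    import std.utf : byCodeUnit;

    static assert( isCharInputRange!string);
    static assert( isCharInputRange!(char[]));
    static assert( isCharInputRange!(typeof("abc".byCodeUnit)));
    static assert(!isCharInputRange!wstring);
    static assert(!isCharInputRange!(ubyte[]));
}
```

A lexer entry point could then be constrained with `if (isCharInputRange!R)`, which would accept plain strings as well as lazily produced char ranges.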
>>> 2. Why are integers acceptable as lexer input? The spec specifies
>>> Unicode.
>> In this case, the lexer will perform on-the-fly UTF validation of the
>> input. It
>> can do so more efficiently than first validating the input using a
>> wrapper
>> range, because it has to check the value of most incoming code units
>> anyway.
>
> There is no reason to validate UTF-8 input. The only place where
> non-ASCII code units can even legally appear is inside strings, and
> there they can just be copied verbatim while looking for the end of the
> string.
The idea is to assume that any char-based input is already valid UTF (as
D defines it), while integer-based input comes from an unverified
source, so that it still has to be validated before being cast/copied
into a 'string'. I think this is a sensible approach, both semantically
and performance-wise.
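
The distinction can be sketched roughly as follows. The function name toJSONText is hypothetical and exists only for this example. Note also that this sketch validates up front with std.utf.validate for simplicity, whereas the lexer described above would validate on the fly while scanning.

```d
import std.utf : validate, UTFException;

// Hypothetical helper: char-based input is trusted to be valid UTF-8
// already, so it is passed through without any checks.
string toJSONText(string input)
{
    return input;
}

// Hypothetical overload: integer-based input comes from an unverified
// source, so it is validated before being exposed as a `string`.
string toJSONText(const(ubyte)[] input)
{
    auto text = cast(const(char)[]) input;
    validate(text); // throws UTFException on malformed UTF-8
    return text.idup;
}

unittest
{
    import std.exception : assertThrown;

    assert(toJSONText("[1]") == "[1]");

    ubyte[] ok = [0x5B, 0x5D]; // the bytes of "[]"
    assert(toJSONText(ok) == "[]");

    ubyte[] bad = [0xFF, 0xFE]; // 0xFF can never start a UTF-8 sequence
    assertThrown!UTFException(toJSONText(bad));
}
```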
>
>
>>> 3. Why are there 4 functions that do the same thing?
>>>
>>> http://s-ludwig.github.io/std_data_json/stdx/data/json/generator.html
>>>
>>> After all, there already is a
>>> http://s-ludwig.github.io/std_data_json/stdx/data/json/generator/GeneratorOptions.html
>>>
>> There are two classes of functions that are not covered by
>> GeneratorOptions:
>> writing to a stream or returning a string.
>
> Why do both? Always return an input range. If the user wants a string,
> he can pipe the input range to a string generator, such as .array
Convenience, for one. The lack of number-to-input-range conversion
functions is another concern. I'm not really keen to implement an
input-range-style floating-point-to-string conversion routine just for
this module.
Finally, I'm a little worried about performance. The output-range-based
approach can keep a lot of state implicitly using the program counter
register. But an input range would explicitly have to keep track of the
current JSON element, as well as the current character/state within that
element (and possibly one level deeper, for example for escape
sequences). This means that it will require either multiple branches or
indirection for each popFront().
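
To make the state-tracking argument concrete, here is a small sketch that emits a single JSON string in both styles. All names (writeJSONString, JSONStringRange, needsEscape) are hypothetical, and only quotes and backslashes are escaped, which is less than a real generator has to handle. The output-range version needs no state variable because the position in the code encodes it, while the input-range version has to store a state and branch on it in every front and popFront call.

```d
import std.array : appender;
import std.range.primitives : put;

// Simplified for the sketch: a real generator also escapes control characters.
bool needsEscape(char c) { return c == '"' || c == '\\'; }

// Output-range style: plain control flow, no explicit state.
void writeJSONString(Out)(ref Out dst, string s)
{
    put(dst, '"');
    foreach (char c; s)
    {
        if (needsEscape(c)) put(dst, '\\');
        put(dst, c);
    }
    put(dst, '"');
}

// Input-range style: the same logic with the state made explicit.
struct JSONStringRange
{
    private enum State { open, content, escaped, close, done }
    private string s;
    private State state = State.open;

    this(string s) { this.s = s; }

    @property bool empty() const { return state == State.done; }

    @property char front() const
    {
        final switch (state)
        {
            case State.open, State.close: return '"';
            case State.content: return needsEscape(s[0]) ? '\\' : s[0];
            case State.escaped: return s[0];
            case State.done: assert(0, "front on empty range");
        }
    }

    void popFront()
    {
        final switch (state)
        {
            case State.open:
                state = s.length ? State.content : State.close;
                break;
            case State.content:
                if (needsEscape(s[0])) state = State.escaped;
                else advance();
                break;
            case State.escaped:
                advance();
                break;
            case State.close:
                state = State.done;
                break;
            case State.done:
                assert(0, "popFront on empty range");
        }
    }

    // Drop the current source character and pick the next state.
    private void advance()
    {
        s = s[1 .. $];
        state = s.length ? State.content : State.close;
    }
}

unittest
{
    import std.algorithm.comparison : equal;

    auto dst = appender!string();
    writeJSONString(dst, `a"b`);
    assert(dst.data == `"a\"b"`);
    assert(equal(JSONStringRange(`a"b`), `"a\"b"`));
    assert(equal(JSONStringRange(""), `""`));
}
```

A generator for whole documents would need one more level of state on top of this for the enclosing array or object, which is the extra per-popFront branching described above.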