String Literal Docs

Alix Pexton alix.DOT.pexton at gmail.DOT.com
Mon Jun 21 12:21:25 PDT 2010


On 20/06/2010 22:46, Alix Pexton wrote:
> On 20/06/2010 21:37, Ellery Newcomer wrote:
>> On 06/20/2010 03:01 PM, Alix Pexton wrote:
>>> On 19/06/2010 21:12, Alix Pexton wrote:
>>>> I've been sketching some grammar diagrams for D2.0, a little like those
>>>> on JSON.org, and of course I didn't get far before I ran into something
>>>> odd.
>>>>
>>>
>>> I think I will take the plunge and base my diagrams on the source of
>>> DMD. After looking at the code in lexer.c, it does not seem as far
>>> beyond my rusty old c++ parsing skills as I had expected! Massive credit
>>> to Walter for having a codebase that is as mature as DMD without it
>>> turning into a labyrinth of preprocessor macros and cryptic "comefrom"s.
>>>
>>> This will mean however that my little project may take a little longer,
>>> sigh...
>>>
>>> A...
>>
>> Do share. I've always been too lazy to read lexer.c, and from this
>> discussion, it sounds like there are a few spots where my own lexer
>> grammar is incorrect (or at least differs from dmd).
>>
>
> of course ^^
>
> A...

Well, I think I have got my head around lexer.c now, and its various 
peculiarities, like "000377." being a valid float (although not 
according to my shiny new, limited edition copy of tDPL (fig2.2 p35)^^).

The weirdness occurs because some of the corner cases are handled not 
by the neat little state machine that validates reals, but in the 
scanner at the point where it recognises a number beginning with a zero. 
The productions in lex.html represent the range of inputs that are 
accepted by the state machine without taking into account that the 
scanner rejects the sequence "._" (which makes sense as that is the 
identifier "_" in the outer scope).
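
To make the two-stage split described above concrete, here is a minimal sketch in Python. It is not a transcription of lexer.c or of the productions in lex.html: the pattern `STATE_MACHINE` and the function `accepts_float` are hypothetical stand-ins, and the pattern is only a rough guess at the shape of a decimal float. It shows how a scanner-level pre-check can reject input that the validating pattern alone would accept.

```python
import re

# Toy stand-in for the real-validating state machine: leading digits
# (underscores allowed after the first), a dot, optional fraction digits,
# and an optional exponent. This is an assumption for illustration only,
# not the grammar from lex.html.
STATE_MACHINE = re.compile(r"[0-9][0-9_]*\.[0-9_]*([eE][+-]?[0-9][0-9_]*)?")


def accepts_float(text: str) -> bool:
    """Hypothetical two-stage check: scanner pre-filter, then the pattern."""
    # Stage 1 (scanner): a dot followed directly by an underscore is not
    # treated as part of the number, since "_" starts an identifier.
    if "._" in text:
        return False
    # Stage 2 (state machine): validate the overall shape of the literal.
    return STATE_MACHINE.fullmatch(text) is not None
```

With this toy pattern, "000377." passes both stages, while "1._5" matches the pattern on its own but is rejected by the pre-check, which is the kind of gap between the documented productions and the scanner's behaviour that the paragraph above describes.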

Andrei's analysis in tDPL also points out that 0xp0 is a valid hexfloat, 
but a strict reading of lex.html would not allow it.

Overall the diagram for hexfloat is much simpler than the one for 
decimalfloat, which I think will have to be split into 3 ><

A...

PS, octal must die!

