DCT: D compiler as a collection of libraries

Jacob Carlborg doob at me.com
Fri May 11 02:08:24 PDT 2012


On 2012-05-11 10:58, Roman D. Boiko wrote:
> On Friday, 11 May 2012 at 08:38:36 UTC, Jacob Carlborg wrote:
>> (Re-posting here)
>> A couple of questions:
>>
>> * What's the state of the lexer
> I consider it a draft, because it has been rewritten several times
> recently and I plan more rewrites, especially based on community
> feedback. However, the implementation handles almost all possible
> cases. Because of the rewrites it is most likely broken at the
> moment; I'm going to fix it ASAP (in a day or two).

I see.

> Lexer will provide a random-access range of tokens (this is not
> done yet).

Ok.
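
For anyone unsure what "random access" buys here: besides walking the tokens front to back, you can index and slice them without copying. A rough sketch of the idea, written in Python for brevity rather than D. The class name TokenRange and its fields are hypothetical and are not DCT's API:

```python
from collections.abc import Sequence


class TokenRange(Sequence):
    """Read-only, random-access view over a list of token texts (hypothetical)."""

    def __init__(self, tokens, lo=0, hi=None):
        self._tokens = tokens
        self._lo = lo
        self._hi = len(tokens) if hi is None else hi

    def __len__(self):
        return self._hi - self._lo

    def __getitem__(self, i):
        if isinstance(i, slice):
            # A slice is another view over the same storage, not a copy.
            start, stop, step = i.indices(len(self))
            if step != 1:
                raise ValueError("only contiguous slices are supported")
            stop = max(stop, start)
            return TokenRange(self._tokens, self._lo + start, self._lo + stop)
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self._tokens[self._lo + i]


tokens = TokenRange(["int", " ", "x", ";"])
middle = tokens[1:3]          # view over the whitespace and the identifier
print(len(tokens), tokens[2], list(middle))
```

Iteration comes for free from Sequence, so the same object works for both sequential and indexed access.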

> Each token contains:
> * start index (position in the original encoding, 0 corresponds
> to the first code unit after BOM),
> * token value encoded as UTF-8 string,
> * token kind (e.g., token.kind = TokenKind.Float),
> * possibly enum with annotations (e.g., token.annotations =
> FloatAnnotation.Hex | FloatAnnotation.Real)

What about line and column information?
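
If only the start index is stored, line and column can still be derived on demand from a table of line starts. A minimal sketch of that idea, in Python for brevity. The type and function names are hypothetical, only TokenKind and FloatAnnotation echo the names quoted above, and it assumes ASCII input so that string indices coincide with code unit indices:

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass
from enum import Enum, Flag, auto


class TokenKind(Enum):
    IDENTIFIER = auto()
    FLOAT = auto()
    WHITESPACE = auto()


class FloatAnnotation(Flag):
    NONE = 0
    HEX = auto()
    REAL = auto()


@dataclass(frozen=True)
class Token:
    start: int   # index of the first code unit, 0 = first unit after the BOM
    value: str   # token text
    kind: TokenKind
    annotations: FloatAnnotation = FloatAnnotation.NONE


def line_starts(source: str) -> list[int]:
    """Return the index at which each line begins."""
    starts = [0]
    for i, ch in enumerate(source):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def line_and_column(starts: list[int], index: int) -> tuple[int, int]:
    """Map a 0-based index to a 1-based (line, column) pair."""
    line = bisect_right(starts, index)  # number of line starts <= index
    return line, index - starts[line - 1] + 1


source = "int x;\nreal y = 0x1.8p1L;\n"
token = Token(16, "0x1.8p1L", TokenKind.FLOAT,
              FloatAnnotation.HEX | FloatAnnotation.REAL)
print(line_and_column(line_starts(source), token.start))  # (2, 10)
```

The table is built once per file and each lookup is a binary search, so the tokens themselves would not need to carry line and column fields.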

>> * Does it convert numerical literals and similar to their actual values
> It is planned to add a post-processor for that as part of parser,
> please see README.md for some more details.

Isn't that a job for the lexer?
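
To show what I mean by converting a literal: the conversion needs nothing but the token text, so it could run wherever that text is available. A rough sketch in Python, covering only a small subset of D integer literals (decimal, 0x and 0b prefixes, underscores, and the u/U/L suffixes). The function name integer_value is made up for this example:

```python
def integer_value(text: str) -> int:
    """Convert the text of a D-style integer literal to its value (subset only)."""
    # Drop the type suffix and the digit separators; neither affects the value.
    digits = text.rstrip("uUL").replace("_", "")
    if digits[:2] in ("0x", "0X"):
        return int(digits[2:], 16)
    if digits[:2] in ("0b", "0B"):
        return int(digits[2:], 2)
    return int(digits, 10)


print(integer_value("1_000"), integer_value("0xFF"), integer_value("0b1010u"))
```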

>> * Does it retain full source information
> Yes, preserving all information is a design choice. Source
> code is converted to UTF-8 and stored as token.value, including
> whitespace. Information about code unit indices in the original
> encoding is preserved, too.

That sounds good.
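
If every character ends up in some token, whitespace included, then concatenating the token values should give back the UTF-8 source, which makes for an easy sanity check. A toy sketch in Python. The regex and the names lex and reconstruct are made up for illustration and have nothing to do with the real DCT lexer:

```python
import re

# Toy token classes: whitespace runs, word-like runs, or any other single character.
# Every character falls into exactly one class, so nothing is skipped.
_TOKEN = re.compile(r"\s+|\w+|[^\w\s]")


def lex(source):
    """Return (start, value) pairs covering every character of source."""
    return [(m.start(), m.group()) for m in _TOKEN.finditer(source)]


def reconstruct(tokens):
    """Concatenate the token values back into the source text."""
    return "".join(value for _, value in tokens)


source = "int  x = 42; // answer\n"
tokens = lex(source)
assert reconstruct(tokens) == source
assert all(source[start:start + len(value)] == value for start, value in tokens)
```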

>> * Is there an example we can look at to see how the API is used
> TBD soon (see Roadmap in the readme file)
>
>> * Does it have a range based interface
> Yes, this is what I consider one of its strengths.

I see. Thanks.

-- 
/Jacob Carlborg


More information about the Digitalmars-d-announce mailing list