Request for comments: std.d.lexer

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Feb 5 19:51:32 PST 2013


On 2/5/13 10:29 PM, Jonathan M Davis wrote:
> On Tuesday, February 05, 2013 08:34:29 Andrei Alexandrescu wrote:
>> As far as I could tell the dependencies of the lexer are fairly
>> contained (util, token, identifier) and conversion to input range is
>> immediate.
>
> I don't remember all of the details at the moment, since it's been several
> months since I looked at dmd's lexer, but a lot of the problem stems from the
> fact that it's all written around the assumption that it's dealing with a
> char*. Converting it to operate on string might be fairly straightforward, but
> it gets more complicated when dealing with ranges. Also, both Walter and
> others have stated that the lexer in D should be configurable in a number of
> ways, and dmd's lexer isn't configurable at all. So, while a direct translation
> would likely be quick, refactoring it to do what it's been asked to be able to
> do would not be.
>
> I'm quite a ways along with one that's written from scratch, but I need to find
> the time to finish it. Also, doing it from scratch has had the added benefit of
> helping me find bugs in the spec and in dmd.

I think it would be reasonable for a lexer to require a range of ubyte
as input and carry out its own decoding. As a first approximation, it may
even require a random-access range of ubyte.
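
To make that concrete, here is a rough sketch of what such a constraint and the lexer's own decoding step might look like. The names isLexerInput, decodeOne, and countCodePoints are hypothetical and not part of any existing or proposed std.d.lexer API, and the decoder is deliberately simplified (it does not reject overlong encodings or surrogates).

```d
import std.range : ElementType, hasLength, isRandomAccessRange;
import std.string : representation;
import std.traits : Unqual;

// Hypothetical constraint: a finite random-access range whose elements are ubyte.
enum isLexerInput(R) = isRandomAccessRange!R && hasLength!R
    && is(Unqual!(ElementType!R) == ubyte);

// Returned for malformed or truncated sequences (U+FFFD REPLACEMENT CHARACTER).
enum dchar replacementChar = '\uFFFD';

// Decodes one UTF-8 code point starting at input[i] and advances i past it.
// Simplified: overlong forms and surrogates are not rejected.
dchar decodeOne(R)(R input, ref size_t i) if (isLexerInput!R)
{
    immutable ubyte b0 = input[i++];
    if (b0 < 0x80)
        return cast(dchar) b0;          // ASCII fast path, the common case in source code

    size_t extra;                       // number of continuation bytes expected
    uint cp;                            // code point accumulated so far
    if ((b0 & 0xE0) == 0xC0)      { extra = 1; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; }
    else
        return replacementChar;         // stray continuation byte or invalid lead byte

    foreach (_; 0 .. extra)
    {
        // Each continuation byte must exist and look like 10xxxxxx.
        if (i >= input.length || (input[i] & 0xC0) != 0x80)
            return replacementChar;
        cp = (cp << 6) | (input[i++] & 0x3F);
    }
    return cast(dchar) cp;
}

// Walks the whole input with decodeOne, much as a lexer's main loop would.
size_t countCodePoints(R)(R input) if (isLexerInput!R)
{
    size_t i, n;
    while (i < input.length)
    {
        decodeOne(input, i);
        ++n;
    }
    return n;
}

unittest
{
    // representation() exposes a string's bytes as immutable(ubyte)[] without copying.
    auto bytes = "h\u00E9llo".representation;
    static assert(isLexerInput!(typeof(bytes)));
    assert(bytes.length == 6);
    assert(countCodePoints(bytes) == 5);
}
```

The point of working on ubyte rather than char is that the lexer controls decoding itself and can take the ASCII fast path without going through the auto-decoding that Phobos applies to string ranges.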

Andrei



