std.d.lexer : voting thread

Sun Oct 6 09:30:06 PDT 2013

On 06/10/13 18:07, Andrei Alexandrescu wrote:
> I'm working on related code, and got all the way there in one day (Friday) with
> a C++ tokenizer for linting purposes (doesn't open #includes or expand #defines
> etc; it wasn't meant to).
>
> The core generated fragment that does the matching is at https://dpaste.de/GZY3.
>
> The surrounding switch statement (also in library code) handles whitespace and
> line counting. The client code needs to handle by hand things like parsing
> numbers (note how the matcher stops upon the first digit), identifiers, comments
> (matcher stops upon detecting "//" or "/*") etc. Such things can be achieved
> with hand-written code (as I do), other similar tokenizers, DFAs, etc. The point
> is that the core loop that looks at every character looking for a lexeme is fast.
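
For illustration, the split described in the quoted text (a core loop that handles whitespace and line counting, a matcher for fixed lexemes, and hand-written handling of numbers, identifiers and comments) might look roughly like the sketch below. This is not the code from the linked paste, which is generated; the names `PUNCTUATORS` and `tokenize`, the reduced operator table, and the choice of Python are all assumptions made to keep the example short.

```python
# Hypothetical sketch of the structure described above; not the generated
# matcher from the linked paste. The operator table is a small subset.
PUNCTUATORS = ["<<=", ">>=", "->", "++", "--", "<<", ">>", "<=", ">=", "==",
               "!=", "&&", "||", "+=", "-=", "*=", "/=", "::",
               "+", "-", "*", "/", "<", ">", "=", "!", "&", "|",
               "(", ")", "{", "}", "[", "]", ";", ",", ".", ":"]
# Try longer lexemes first so "<<=" wins over "<<" and "<".
PUNCTUATORS.sort(key=len, reverse=True)


def tokenize(src):
    """Yield (kind, text, line) triples for a C++-like source string."""
    i, line, n = 0, 1, len(src)
    while i < n:
        c = src[i]
        # Core loop: whitespace and line counting.
        if c == "\n":
            line += 1
            i += 1
        elif c in " \t\r":
            i += 1
        # The matcher stops at "//" or "/*"; the comment body is scanned by hand.
        elif src.startswith("//", i):
            end = src.find("\n", i)
            end = n if end == -1 else end
            yield ("comment", src[i:end], line)
            i = end
        elif src.startswith("/*", i):
            end = src.find("*/", i + 2)
            end = n if end == -1 else end + 2
            text = src[i:end]
            yield ("comment", text, line)
            line += text.count("\n")
            i = end
        # The matcher stops at the first digit; number scanning here is
        # deliberately crude (no exponents, suffix checks, etc.).
        elif c.isdigit():
            j = i
            while j < n and (src[j].isalnum() or src[j] == "."):
                j += 1
            yield ("number", src[i:j], line)
            i = j
        elif c.isalpha() or c == "_":
            j = i
            while j < n and (src[j].isalnum() or src[j] == "_"):
                j += 1
            yield ("identifier", src[i:j], line)
            i = j
        else:
            # Fixed lexemes: longest match against the table.
            for p in PUNCTUATORS:
                if src.startswith(p, i):
                    yield ("punct", p, line)
                    i += len(p)
                    break
            else:
                yield ("unknown", c, line)
                i += 1
```

A call such as `list(tokenize("x <<= 42;"))` yields one triple per lexeme, each tagged with the line it starts on. A generated matcher would replace the linear scan over `PUNCTUATORS` with a nested switch on successive characters, which is where the speed claimed in the quoted text would come from.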

What I'm getting at is that I'd be prepared to give a vote "no to std, yes to 
etc" for Brian's d.lexer, _if_ I was reasonably certain that we'd see an 
alternative lexer module submitted to Phobos within the next month :-)