std.d.lexer : voting thread

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Sun Oct 6 09:07:25 PDT 2013


On 10/6/13 5:40 AM, Joseph Rushton Wakeling wrote:
> How quickly do you think this vision could be realized? If soon, I'd say
> it's worth delaying a decision on the current proposed lexer, if not ...
> well, jam tomorrow, perfect is the enemy of good, and all that ...

I'm working on related code, and got all the way there in one day
(Friday) with a C++ tokenizer for linting purposes (it doesn't open
#includes or expand #defines, etc.; it wasn't meant to).

The core generated fragment that does the matching is at 
https://dpaste.de/GZY3.

The surrounding switch statement (also in library code) handles
whitespace and line counting. The client code needs to handle by hand
things like numbers (note how the matcher stops at the first digit),
identifiers, and comments (the matcher stops upon detecting "//" or
"/*"). Such things can be done with hand-written code (as I do), other
similar tokenizers, DFAs, etc. The point is that the core loop, the one
that examines every character in search of a lexeme, is fast.
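
As a rough illustration of that division of labor (this is not the
generated code linked above, only a hand-written sketch in C++), the
loop below lets a switch handle whitespace and line counting, and hands
off to small helpers as soon as it sees a digit, an identifier start,
or "//" or "/*". All names here (Kind, Token, scanNumber,
scanIdentifier, scanComment, tokenize) are hypothetical, and the number
and identifier rules are deliberately simplistic.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds, for illustration only.
enum class Kind { Number, Identifier, Comment, Punct };

struct Token {
    Kind kind;
    std::string text;
    std::size_t line;  // 1-based line on which the token starts
};

// Each helper returns the index one past the end of the lexeme.
std::size_t scanNumber(const std::string& s, std::size_t i) {
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
    return i;
}

std::size_t scanIdentifier(const std::string& s, std::size_t i) {
    while (i < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[i])) || s[i] == '_')) ++i;
    return i;
}

// Precondition: s[i] and s[i + 1] spell "//" or "/*".
std::size_t scanComment(const std::string& s, std::size_t i, std::size_t& line) {
    assert(i + 1 < s.size());
    if (s[i + 1] == '/') {                  // line comment: stop before '\n'
        while (i < s.size() && s[i] != '\n') ++i;
        return i;
    }
    i += 2;                                 // block comment: skip "/*"
    while (i < s.size()) {
        if (s[i] == '*' && i + 1 < s.size() && s[i + 1] == '/') return i + 2;
        if (s[i] == '\n') ++line;           // keep the line count accurate
        ++i;
    }
    return i;                               // unterminated comment: take the rest
}

std::vector<Token> tokenize(const std::string& s) {
    std::vector<Token> out;
    std::size_t line = 1;
    std::size_t i = 0;
    while (i < s.size()) {
        const std::size_t start = i;
        const std::size_t startLine = line;
        Kind kind = Kind::Punct;
        const unsigned char c = static_cast<unsigned char>(s[i]);
        switch (c) {
        case '\n': ++line; ++i; continue;   // line counting
        case ' ': case '\t': case '\r': ++i; continue;  // whitespace
        case '/':
            if (i + 1 < s.size() && (s[i + 1] == '/' || s[i + 1] == '*')) {
                i = scanComment(s, i, line);
                kind = Kind::Comment;
            } else {
                ++i;                        // plain '/' is punctuation here
            }
            break;
        default:
            if (std::isdigit(c)) {          // stop at the first digit, hand off
                i = scanNumber(s, i);
                kind = Kind::Number;
            } else if (std::isalpha(c) || c == '_') {
                i = scanIdentifier(s, i);
                kind = Kind::Identifier;
            } else {
                ++i;                        // any other single character
            }
            break;
        }
        out.push_back({kind, s.substr(start, i - start), startLine});
    }
    return out;
}
```

Calling tokenize on a small input such as "x1 = 42; // hi" yields an
identifier, two punctuation tokens, a number, and a comment. A real
linting tokenizer would also need multi-character operators, string
and character literals, and proper numeric literal rules, all of which
this sketch leaves out.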


Andrei


