std.d.lexer requirements

deadalnix deadalnix at gmail.com
Mon Aug 6 11:03:18 PDT 2012


On 04/08/2012 15:45, Dmitry Olshansky wrote:
> On 04-Aug-12 15:48, Jonathan M Davis wrote:
>> On Saturday, August 04, 2012 15:32:22 Dmitry Olshansky wrote:
>>> I see it as a compile-time policy, that will fit nicely and solve both
>>> issues. Just provide a template with a few hooks, and add a Noop policy
>>> that does nothing.
>>
>> It's starting to look like figuring out what should and shouldn't be
>> configurable and how to handle it is going to be the largest problem in
>> the lexer...
>>
>
> Let's add some meat to my post.
> I've seen it mostly as follows:
>
> //user defines mixin template that is mixed in inside lexer
> template MyConfig()
> {
> enum identifierTable = true; // means there would be calls to
> // table.insert on each identifier
> enum countLines = true; //adds line, column properties to the lexer/Tokens
>
> //statically bound callbacks, inside one can use say:
> // skip() - to skip a char (popFront)
> // get() - to read next char (via popFront, front)
> // line, col - as readonly properties
> // (skip & get do the counting if enabled)
>
> bool onError()
> {
> skip(); //the most dumb recovery, just skip a char
> return true; //go on with tokenizing, false - stop prematurely
> }
>
> ...
> }
>
> usage:
>
>
> {
> auto my_supa_table = ...; //some kind of container (should be a set of
> //strings and support .insert("blah"); )
>
> auto dlex = Lexer!(MyConfig)(my_supa_table);
> auto all_tokens = array(dlex(joiner(stdin.byChunk(4096))));
>
> //or if we had no interest in table but only tokens:
> auto noop = Lexer!(NoopLex)();
> ...
> }
>

That seems like way too much.

The most complex thing needed is a policy for allocating the
identifiers stored in tokens. That can be done by passing a function
that takes a string and returns a string. The default would be the
identity function.

The second parameter is a bool that says whether to tokenize comments
or not. Is that enough?
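
A minimal sketch of the two-parameter idea, to make it concrete. This
is not a proposed std.d.lexer API: tokenize, identity, intern and
internTable are hypothetical names made up for illustration, and the
toy tokenizer only knows identifiers, // comments and single-character
tokens.

```d
import std.ascii : isAlphaNum, isWhite;

// Default identifier policy: hand the slice back unchanged.
string identity(string s) { return s; }

// Example of an alternative policy (hypothetical): intern identifiers so
// that equal identifiers share one allocation.
string[string] internTable;

string intern(string s)
{
    if (auto p = s in internTable)
        return *p;
    auto copy = s.idup;
    internTable[copy] = copy;
    return copy;
}

// Toy tokenizer with exactly two compile-time parameters:
//   allocIdentifier - string -> string function applied to each identifier
//   keepComments    - whether // comments are emitted as tokens
string[] tokenize(alias allocIdentifier = identity, bool keepComments = false)
                 (string src)
{
    string[] tokens;
    size_t i = 0;
    while (i < src.length)
    {
        if (isWhite(src[i])) { ++i; continue; }

        immutable start = i;
        if (src[i] == '/' && i + 1 < src.length && src[i + 1] == '/')
        {
            // line comment: runs up to (not including) the newline
            while (i < src.length && src[i] != '\n') ++i;
            static if (keepComments)
                tokens ~= src[start .. i];
        }
        else if (isAlphaNum(src[i]) || src[i] == '_')
        {
            // identifier-like run, passed through the allocation policy
            while (i < src.length && (isAlphaNum(src[i]) || src[i] == '_')) ++i;
            tokens ~= allocIdentifier(src[start .. i]);
        }
        else
        {
            // anything else is a one-character token
            ++i;
            tokens ~= src[start .. i];
        }
    }
    return tokens;
}

void main()
{
    auto src = "foo = foo; // note";

    // defaults: identity policy, comments dropped
    assert(tokenize!()(src) == ["foo", "=", "foo", ";"]);

    // keep comments as tokens
    assert(tokenize!(identity, true)(src) == ["foo", "=", "foo", ";", "// note"]);

    // interning policy: both "foo" tokens are the very same string
    auto t = tokenize!(intern)(src);
    assert(t[0] is t[2]);
}
```

With the defaults the tokens are just slices of the input, so a user
who does not care about identifier allocation pays nothing for it.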

onError looks like a typical use case for conditions, as explained in
the huge thread on exceptions.

