std.d.lexer requirements

deadalnix deadalnix at gmail.com
Sat Aug 4 01:57:53 PDT 2012


On 03/08/2012 21:59, Walter Bright wrote:
> On 8/3/2012 6:18 AM, deadalnix wrote:
>> The lexer can take a parameter that tells it whether to build a table of
>> tokens or slice the input. The second is important, for instance for an
>> IDE: lexing will occur often, and you prefer slicing there because you
>> already have the source file in memory anyway.
>
> A string may span multiple lines - IDEs do not store the text as one
> string.
>
>> If the lexer allocates chunks, it will reuse the same memory location for
>> the same string. Considering the following mechanism to compare slices,
>> this will require 2 comparisons for identifiers lexed with that method:
>>
>> if(a.length != b.length) return false;
>> if(a.ptr == b.ptr) return true;
>> // Regular char by char comparison.
>>
>> Is that a suitable option?
>
> You're talking about doing for strings what is done for identifiers -
> returning a unique handle for each. I don't think this works very well
> for string literals, as there seem to be few duplicates.

That option has the benefit of allowing very fast identifier comparison 
(like DMD does) without imposing it. For instance, you could use that 
trick in one thread, and a separate identifier table in another.

It completely avoids the multithreading problem you mention, while 
keeping most identifier comparisons really fast.
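
To make this concrete, here is a rough sketch of what I have in mind. The 
names sameIdentifier and IdentifierTable are made up for illustration and 
are not part of any existing std.d.lexer proposal; I also assume a plain 
associative array is good enough as the table. Each thread would own its 
own table, so nothing is shared between threads.

```d
// Hypothetical names for illustration; not an existing std.d.lexer API.

/// Slice comparison with the two cheap checks first.
bool sameIdentifier(const(char)[] a, const(char)[] b)
{
    if (a.length != b.length) return false; // different lengths: no match
    if (a.ptr == b.ptr) return true;        // same storage: match
    return a == b;                          // regular char-by-char comparison
}

/// Identifier table meant to be owned by a single thread.
struct IdentifierTable
{
    private string[string] table;

    /// Returns the canonical slice for `s`, copying it on first sight.
    string intern(const(char)[] s)
    {
        if (auto p = s in table) return *p; // already known: reuse storage
        string copy = s.idup;
        table[copy] = copy;
        return copy;
    }
}

void main()
{
    IdentifierTable t1, t2;          // e.g. one table per thread
    auto a = t1.intern("foo");
    auto b = t1.intern("foo".dup);   // same table: same storage is returned
    auto c = t2.intern("foo");       // other table: separate storage

    assert(a.ptr == b.ptr);
    assert(sameIdentifier(a, b));    // settled by the pointer check
    assert(sameIdentifier(a, c));    // settled by the char-by-char fallback
    assert(!sameIdentifier(a, "foobar"));
}
```

Identifiers coming from the same table are settled by the length and 
pointer checks alone; identifiers from different tables still compare 
correctly, they just go through the slower fallback.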

It also allows several allocation schemes for the slices, to fit 
different needs, as shown by Christophe Travert.

