Request for comments: std.d.lexer

Dmitry Olshansky dmitry.olsh at gmail.com
Mon Jan 28 03:59:25 PST 2013


28-Jan-2013 15:48, Johannes Pfau wrote:
> On Sun, 27 Jan 2013 11:48:23 -0800
> Walter Bright <newshound2 at digitalmars.com> wrote:
>
>> On 1/27/2013 2:17 AM, Philippe Sigaud wrote:
>>> Walter seems to think if a lexer is not able to vomit thousands
>>> of tokens a second, then it's not good.
>>
>> Speed is critical for a lexer.
>>
>> This means, for example, you'll need to squeeze pretty much all
>> storage allocation out of it.
>
> But to be fair that doesn't fit ranges very well. If you don't want to
> do any allocation but still keep identifiers etc in memory this
> basically means you have to keep the whole source in memory and this is
> conceptually an array and not a range.
>

Not the whole source: it's enough to construct a table of all unique 
identifiers. The source is awfully redundant because of repeated 
identifiers, spaces, comments and whatnot. The set of unique identifiers 
is rather small.

I think the best course of action is to just provide a hook that 
triggers on every identifier encountered. As discussed earlier, that 
hook could be a delegate.

> But you can of course write a lexer which accepts buffered ranges and
> does some allocation for those and is special cased for arrays to not
> allocate at all. (Unbuffered ranges should be supported using a
> generic BufferedRange)
>



-- 
Dmitry Olshansky


More information about the Digitalmars-d mailing list