What would need to be done to get sdc.lexer to std.lexer quality?

Walter Bright newshound2 at digitalmars.com
Wed Aug 1 22:35:50 PDT 2012


On 8/1/2012 10:31 PM, Jakob Ovrum wrote:
> On Thursday, 2 August 2012 at 04:38:11 UTC, Walter Bright wrote:
>> That's just not going to produce a high performance lexer.
>>
>> The way to do it is in the Lexer instance, have a value which is the current
>> Token instance. That way, in the normal case, one NEVER has to allocate a
>> token instance.
>>
>> Only when lookahead is done is storage allocation required, and that list
>> should be held by Lexer and recycled as tokens get consumed. This is how the
>> dmd lexer works.
>>
>> Doing one allocation per token is never going to scale to trying to shove
>> millions upon millions of lines of code through it.
>
> Which is exactly why I'm pointing out the current, poor approach. Having a
> single array with contiguous Tokens for lookahead is completely doable even when
> Token is a class with some simple GC.malloc and emplace composition. I think
> SDC's Token class is too big to be useful as a struct, you'd pretty much never
> want to pass it anywhere by value.

Using a class implies an extra level of indirection. The other issue is that
the only point of using a class is to derive from it and override its methods,
and I don't see that for a Token.

If the Token is too big to pass by value, use pass-by-ref for the Token.
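
To make the scheme concrete, here is a minimal sketch of a lexer that keeps the
current Token by value, allocates only for lookahead, and recycles lookahead
nodes through a free list. It is an illustration under stated assumptions, not
the dmd source: it is written in C++ for brevity, and every name in it (Tok,
Token, Lexer, peek, text, the allocations counter) is hypothetical. The token
grammar is deliberately trivial.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical token kinds, for this sketch only.
enum class Tok { Identifier, Number, Punct, Eof };

// A plain value type: no virtual methods, nothing to derive from.
struct Token {
    Tok kind = Tok::Eof;
    const char* ptr = nullptr;  // start of the lexeme in the source buffer
    std::size_t len = 0;        // length of the lexeme
    Token* next = nullptr;      // lookahead chain, owned by the Lexer
};

// Consumers take the Token by reference; nothing is copied.
std::string text(const Token& t) { return std::string(t.ptr, t.len); }

class Lexer {
public:
    explicit Lexer(const char* src) : p_(src) { assert(src != nullptr); }
    Lexer(const Lexer&) = delete;
    Lexer& operator=(const Lexer&) = delete;
    ~Lexer() { release(token.next); release(freelist_); }

    Token token;                  // the current token lives inside the Lexer
    std::size_t allocations = 0;  // heap allocations made for lookahead

    // Advance to the next token. Never allocates.
    Tok nextToken() {
        if (Token* t = token.next) {
            token = *t;           // copy the buffered token, including its link
            t->next = freelist_;  // recycle the consumed lookahead node
            freelist_ = t;
        } else {
            scan(token);          // normal case: overwrite the current token
        }
        return token.kind;
    }

    // Return the token following `from`, scanning it if not yet buffered.
    Token& peek(Token& from) {
        if (!from.next) {
            Token* t = freelist_;
            if (t) {
                freelist_ = t->next;  // reuse a recycled node
            } else {
                t = new Token;        // only lookahead ever allocates
                ++allocations;
            }
            scan(*t);
            from.next = t;
        }
        return *from.next;
    }

private:
    const char* p_;
    Token* freelist_ = nullptr;

    static void release(Token* t) {
        while (t) { Token* n = t->next; delete t; t = n; }
    }

    // Trivial scanner: identifiers, decimal numbers, single-char punctuation.
    void scan(Token& t) {
        while (std::isspace(static_cast<unsigned char>(*p_))) ++p_;
        t.ptr = p_;
        t.next = nullptr;
        const unsigned char c = static_cast<unsigned char>(*p_);
        if (c == '\0') {
            t.kind = Tok::Eof;
        } else if (std::isalpha(c) || c == '_') {
            t.kind = Tok::Identifier;
            while (std::isalnum(static_cast<unsigned char>(*p_)) || *p_ == '_') ++p_;
        } else if (std::isdigit(c)) {
            t.kind = Tok::Number;
            while (std::isdigit(static_cast<unsigned char>(*p_))) ++p_;
        } else {
            t.kind = Tok::Punct;
            ++p_;
        }
        t.len = static_cast<std::size_t>(p_ - t.ptr);
    }
};
```

With this layout, a parser that only calls nextToken() performs no allocations
at all, and one that peeks k tokens ahead allocates at most k nodes over the
whole run, since consumed lookahead nodes go back on the free list.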


