std.d.lexer requirements
Christophe Travert
travert at phare.normalesup.org
Sat Aug 4 03:02:11 PDT 2012
Jonathan M Davis, in message (digitalmars.D:174191), wrote:
> On Thursday, August 02, 2012 11:08:23 Walter Bright wrote:
>> The tokens are not kept, correct. But the identifier strings, and the string
>> literals, are kept, and if they are slices into the input buffer, then
>> everything I said applies.
>
> String literals often _can't_ be slices unless you leave them in their
> original state rather than giving the version that they translate to (e.g.
> leaving \&copy; in the string rather than replacing it with its actual,
> unicode value). And since you're not going to be able to create the literal
> using whatever type the range is unless it's a string of some variety, that
> means that the literals often can't be slices, which - depending on the
> implementation - would make it so that they can't _ever_ be slices.
>
> Identifiers are a different story, since they don't have to be translated at
> all, but regardless of whether keeping a slice would be better than creating a
> new string, the identifier table will be far superior, since then you only need
> one copy of each identifier. So, it ultimately doesn't make sense to use slices
> in either case even without considering issues like them being spread across
> memory.
>
> The only place that I'd expect a slice in a token is in the string which
> represents the text which was lexed, and that won't normally be kept around.
>
> - Jonathan M Davis
I thought it was not the lexer's job to process literals. It should just
split the input into tokens and provide minimal info: TokenType, line and
col, along with the representation from the input. That's enough for a
syntax highlighting tool, for example. Otherwise you'll end up doing complex
interpretation and the lexer will not be that efficient. Literal
interpretation can be done in a second step. Do you think doing literal
interpretation separately when you need it would be less efficient?
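
To make the two-step idea concrete, here is a rough sketch. It is written in
Python for brevity rather than D, and it is not a proposal for the actual
std.d.lexer API. The names (Token, lex, string_value), the token kinds and
the tiny escape table are all made up for illustration, and the grammar is
far simpler than D's. The lexer only records kind, line, col and the raw
text; decoding a string literal is a separate function that is called only
when the value is needed.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # hypothetical token kinds: STRING, NUMBER, IDENT, OTHER
    line: int  # 1-based line number
    col: int   # 1-based column number
    text: str  # raw text exactly as it appears in the input


# Toy grammar (an assumption for this sketch, much simpler than D's).
_TOKEN_RE = re.compile(r'''
    (?P<STRING>"(?:\\.|[^"\\\n])*")
  | (?P<NUMBER>\d+)
  | (?P<IDENT>[A-Za-z_]\w*)
  | (?P<NEWLINE>\n)
  | (?P<SPACE>[ \t]+)
  | (?P<OTHER>.)
''', re.VERBOSE)


def lex(source: str) -> Iterator[Token]:
    """Step 1: split the input into tokens without interpreting literals."""
    line, line_start = 1, 0
    for m in _TOKEN_RE.finditer(source):
        kind = m.lastgroup
        if kind == "NEWLINE":
            line += 1
            line_start = m.end()
        elif kind != "SPACE":
            yield Token(kind, line, m.start() - line_start + 1, m.group())


# Only a handful of escapes are handled; a real lexer would need many more.
_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def string_value(tok: Token) -> str:
    """Step 2: decode a STRING token's escapes, only when the value is needed."""
    body = tok.text[1:-1]  # drop the surrounding quotes
    return re.sub(r"\\(.)",
                  lambda m: _ESCAPES.get(m.group(1), m.group(1)),
                  body)
```

A syntax highlighter would stop after lex(), since the kind and position of
each token are all it needs. A compiler front end would call string_value()
on the STRING tokens it actually keeps.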
--
Christophe