std.d.lexer requirements
Jonathan M Davis
jmdavisProg at gmx.com
Fri Aug 3 19:31:22 PDT 2012
On Thursday, August 02, 2012 11:08:23 Walter Bright wrote:
> The tokens are not kept, correct. But the identifier strings, and the string
> literals, are kept, and if they are slices into the input buffer, then
> everything I said applies.
String literals often _can't_ be slices unless you leave them in their
original state rather than giving the version that they translate to (e.g.
leaving \&copy; in the string rather than replacing it with its actual,
Unicode value). And since you're not going to be able to create the literal
using whatever type the range is unless it's a string of some variety, that
means that the literals often can't be slices, which - depending on the
implementation - would make it so that they can't _ever_ be slices.
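To make the point about translated literals concrete, here is a rough sketch in C++ (not D, and not anything from an actual std.d.lexer implementation). The function name translate_literal and the handful of escapes it handles are assumptions made for illustration only. It shows that a slice of the input is only usable when the literal contains no escapes; as soon as something like a named entity has to be replaced, the value no longer matches the source text and a new string has to be allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: translate the body of a string literal (the text
// between the quotes). Only a few escapes are handled, for illustration.
// Returns std::nullopt when no escape is present, meaning the caller could
// keep `body` itself (a slice of the input). Otherwise returns a newly
// allocated string, because the value differs from the source text.
std::optional<std::string> translate_literal(std::string_view body) {
    if (body.find('\\') == std::string_view::npos)
        return std::nullopt;  // value == source text, so a slice would do

    static constexpr std::string_view copy_entity = "&copy;";
    std::string out;
    out.reserve(body.size());
    for (std::size_t i = 0; i < body.size(); ++i) {
        // Ordinary character, or a trailing backslash: copy it through.
        if (body[i] != '\\' || i + 1 == body.size()) {
            out += body[i];
            continue;
        }
        std::string_view rest = body.substr(i + 1);
        if (rest.substr(0, copy_entity.size()) == copy_entity) {
            out += "\xC2\xA9";        // U+00A9 encoded as UTF-8
            i += copy_entity.size();  // skip the entity name
        } else if (rest[0] == 'n') {
            out += '\n';
            ++i;
        } else {
            out += rest[0];           // \\ and \" map to the char itself
            ++i;
        }
    }
    return out;
}
```

Note that the sketch already assumes the input is a contiguous string of char. With an arbitrary input range there would be nothing to slice in the first place, which is the other half of the argument above.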
Identifiers are a different story, since they don't have to be translated at
all, but regardless of whether keeping a slice would be better than creating a
new string, the identifier table will be far superior, since then you only need
one copy of each identifier. So, it ultimately doesn't make sense to use slices
in either case even without considering issues like them being spread across
memory.
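For what it's worth, an identifier table along those lines could look roughly like the following C++ sketch. The class name IdentifierTable and its interface are made up for illustration and are not taken from any existing lexer. The idea is simply that every distinct identifier is stored once, and each token carries a pointer to that single copy rather than its own slice or string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical identifier table: each distinct identifier is stored exactly
// once, and callers get back a pointer to that single stored copy.
class IdentifierTable {
public:
    // Returns a pointer to the table's copy of `name`, inserting it on
    // first sight. Pointers to elements of std::unordered_set stay valid
    // when the table grows, so tokens can hold on to them.
    const std::string* intern(std::string_view name) {
        // A temporary std::string is built for the lookup; it is only kept
        // if the identifier was not already present.
        return &*table_.insert(std::string(name)).first;
    }

    // Number of distinct identifiers seen so far.
    std::size_t size() const { return table_.size(); }

private:
    std::unordered_set<std::string> table_;
};
```

With something like this, lexing "foo + foo" would call intern twice with two different slices of the input, and both calls would return the same pointer. Comparing identifiers then becomes a pointer comparison, and nothing in the table refers back to the input buffer.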
The only place that I'd expect a slice in a token is in the string which
represents the text which was lexed, and that won't normally be kept around.
- Jonathan M Davis