Request for comments: std.d.lexer

Jonathan M Davis jmdavisProg at gmx.com
Fri Feb 8 01:40:53 PST 2013


On Friday, February 08, 2013 12:12:30 Dmitry Olshansky wrote:
> 08-Feb-2013 12:01, Jonathan M Davis wrote:
> > On Tuesday, February 05, 2013 22:51:32 Andrei Alexandrescu wrote:
> >> I think it would be reasonable for a lexer to require a range of ubyte
> >> as input, and carry its own decoding. In the first approximation it may
> >> even require a random-access range of ubyte.
> > 
> > Another big issue is the fact that in some ways, using a pointer like dmd's
> > lexer does is actually superior to using a range. In particular, it's
> > trivial to determine where in the text a token is, because you can simply
> > subtract the pointer in the token from the initial pointer. Strings would
> > be okay too, because you can subtract their ptr properties. But the
> > closest that you'll get with ranges is to subtract their lengths, and the
> > only ranges that are likely to define length are random-access ranges.
> 
> Not true, certain ranges know length but can't be random access as
> indexing is O(lgN) or worse. Including a stripe of chunks as taken from
> file.

I said that the only ones which are "likely" to define length are random-access
ranges. There _are_ other ranges which can, but in most cases, if you can know
the length, you can do random access as well. Regardless, the main issue still
stands: keeping track of the index of the code unit at which a token starts is
more complicated and generally more expensive with ranges than it is with a
pointer. Some range types will do better than others, but short of
using a string's ptr property, there's always going to be some additional 
overhead in comparison to pointers to keep track of the indices or to keep a 
range or slice of one as part of a token. The pointer's just more lightweight. 
That doesn't make ranges unacceptable by any means. It just means that they're 
going to take at least a slight performance hit in comparison to pointers.
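
To make the comparison concrete, here is a minimal sketch of the two approaches. It is not how dmd's lexer or the proposed std.d.lexer is written; PtrToken, offsetOf and Counted are hypothetical names made up for illustration, and the input is assumed to be a range of ubyte as suggested earlier in the thread. With a pointer, the token's offset falls out of one subtraction. With a generic input range, something has to count every popFront.

```d
import std.range.primitives : empty, front, popFront, isInputRange;
import std.string : representation;

// Hypothetical token for the pointer-based approach: it stores only a
// pointer to its first code unit.
struct PtrToken
{
    immutable(char)* start;
}

// The token's offset is a single subtraction against the start of the buffer.
size_t offsetOf(PtrToken tok, string source)
{
    return cast(size_t)(tok.start - source.ptr);
}

// Hypothetical wrapper for the range-based approach: a generic input range
// exposes no position, so the wrapper counts each popFront itself.
struct Counted(R) if (isInputRange!R)
{
    R inner;
    size_t index;

    @property bool empty() { return inner.empty; }
    @property auto front() { return inner.front; }
    void popFront() { inner.popFront(); ++index; }
}

void main()
{
    string source = "int x;";

    // Pointer approach: the token "x" begins at code unit 4.
    auto tok = PtrToken(source.ptr + 4);
    assert(offsetOf(tok, source) == 4);

    // Range approach: skip four code units, updating the counter each time.
    auto r = Counted!(immutable(ubyte)[])(source.representation);
    foreach (_; 0 .. 4)
        r.popFront();
    assert(r.index == 4 && r.front == 'x');
}
```

The extra increment per popFront (plus the extra word of state carried around) is the kind of overhead I mean. It is small, but the pointer version does not pay it at all.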

- Jonathan M Davis

