Proposal for SentinelInputRange

Walter Bright newshound2 at digitalmars.com
Wed Feb 27 21:10:15 PST 2013


On 2/27/2013 8:47 PM, Jonathan M Davis wrote:
> Now, the only real benefit that I see for this is allowing you to make a string
> zero-terminated (which in the case of a lexer would probably mean copying the
> entire file into a new string which has zero on the end).

Nawp, there is no extra copy. Take a look at the compiler source. You have to 
read the file into memory anyway - so make the buffer one byte longer, and put a 
0 at the end.
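
A rough sketch of that idea in C, not the actual compiler code: the helper name read_with_sentinel and its signature are made up for illustration, and it assumes a seekable stream opened in binary mode that fits in memory.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not taken from any compiler source): read all of
   `fp` into a heap buffer that is one byte longer than the file, and
   store a 0 sentinel in the extra byte.
   Returns NULL on failure; the caller frees the result. */
char *read_with_sentinel(FILE *fp, size_t *len_out)
{
    if (fseek(fp, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(fp);
    if (size < 0)
        return NULL;
    rewind(fp);

    /* One extra byte for the sentinel, allocated up front. */
    char *buf = malloc((size_t)size + 1);
    if (buf == NULL)
        return NULL;

    size_t got = fread(buf, 1, (size_t)size, fp);
    if (got != (size_t)size) {
        free(buf);
        return NULL;
    }

    buf[got] = '\0';   /* the sentinel goes in the spare byte; no second copy */
    if (len_out != NULL)
        *len_out = got;
    return buf;
}
```

The file contents are read once, straight into their final location. A file that itself contains a 0 byte would still need handling by whatever scans the buffer.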

> In general, you'll be
> forced to wrap a range in a sentinel range to get this behavior, which means
> that you're _still_ checking empty all the time, because it has to keep
> checking whether it's supposed to make front 0 now. And that probably means
> that it'll be slightly _more_ expensive to do this for anything other than a
> string. That being the case, it might be better to just special case strings
> rather than come up with this whole new range idea.

Nope, not necessary to wrap it.


> I'm also not at all convinced that this is generally useful. It may be that
> it's a great idea for lexers, but what other use cases are there?

Anything that walks a C string. Lots of cases for that. Sentinels are often used 
where high speed processing of data is desired. Google sentinel-terminated data 
for more examples.
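
To make the speed argument concrete, here is a small C sketch with illustrative function names (none of them come from a real library). The bounded version tests both the index and the character on every step; the sentinel version tests only the character, because the terminating 0 is not an identifier character and so stops the loop by itself.

```c
#include <stddef.h>

/* Illustrative predicate: ASCII letters, digits and underscore. */
static int is_ident_char(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_';
}

/* Length-based scan: two tests per character (bounds, then content). */
size_t scan_ident_bounded(const char *p, size_t len)
{
    size_t i = 0;
    while (i < len && is_ident_char((unsigned char)p[i]))
        i++;
    return i;
}

/* Sentinel-based scan: one test per character. Requires that the data
   is 0-terminated; the 0 fails is_ident_char, which ends the loop. */
size_t scan_ident_sentinel(const char *p)
{
    const char *q = p;
    while (is_ident_char((unsigned char)*q))
        q++;
    return (size_t)(q - p);
}
```

Both functions return the length of the leading identifier, so for the input "foo_1+bar" each returns 5.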


> Also, I'd point out that even for strings, doing something like this means
> wrapping them, because their empty isn't defined in a manner which works with
> isSentinelRange.

For D strings, yes, for C strings, no need to wrap them.

> So, I'm inclined to believe that we'd be better off just special casing strings
> in any algorithms that can take advantage of this sort of thing than we would
> be creating this sentinel range idea.

0 terminated C strings are a classic case of this. Another case is a token 
stream ending with an EOF token.
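
The token stream case looks much the same. A minimal C sketch, assuming a made-up Token type and TOK_EOF kind (hypothetical names, not from any real lexer): the consumer never carries a length or an end pointer, it just runs until it sees the EOF token.

```c
#include <stddef.h>

/* Hypothetical token kinds; TOK_EOF plays the role of the sentinel. */
typedef enum { TOK_IDENT, TOK_NUMBER, TOK_PLUS, TOK_EOF } TokenKind;

typedef struct {
    TokenKind kind;
} Token;

/* Count the tokens before the EOF sentinel. The array must end with a
   TOK_EOF token; no separate length is passed in. */
size_t count_tokens(const Token *t)
{
    size_t n = 0;
    while (t->kind != TOK_EOF) {
        n++;
        t++;
    }
    return n;
}
```

A parser can peek at the current token's kind in the same way without first asking whether the stream is empty, since every grammar rule fails to match TOK_EOF.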


