Proposal for SentinelInputRange
Jonathan M Davis
jmdavisProg at gmx.com
Wed Feb 27 20:47:29 PST 2013
On Thursday, February 28, 2013 05:24:57 deadalnix wrote:
> Can you explain what it does buy us ? Is it faster to check for
> the sentinel value that to check for emptiness ?
>
> Is it really handy in generic code ? For instance, if I use a
> switch, the actual sentinel value may be conflicting with other
> element in the switch. How does this kind of problem are solved ?
> Do they are problems in the first place ?
In the case of a lexer, it allows you to skip checking for empty most of the
time. You have something like
switch(front)
{
case '+': ...
case '>': ...
...
case 0: //do whatever you do at the end
}
So, as far as the switch statement goes, you don't have to worry about whether
the range is empty or not. The case for 0 takes care of it and will only get
checked if it actually happens, whereas without the sentinel value, you have
to check empty every time you pop front.
Now, the only real benefit that I see for this allowing you to make a string
zero-terminated (which in the case of a lexer would probably mean copying the
entire file into a new string which has zero on the end). In general, you'll be
forced to wrap a range in a sentinel range to get this behavior, which means
that you're _still_ checking empty all the time, because it has to keep
checking whether it's supposed to make front 0 now. And that probably means
that it'll be slightly _more_ expensive to do this for anything other than a
string. That being the case, it might be better to just special case strings
rather than come up with this whole new range idea.
I'm also not at all covinced that this is generally useful. It may be that
it's a great idea for lexers, but what other use cases are there? I don't know
that there isn't one, but I can't think of any. And the _only_ time that it's
of any real benefit is definitely going to be when ranges are built with this
idea in the first place rather than wrapping them, which almost guarantees that
you're just dealing with arrays, which you can always just special case if you
need this level of efficiency.
Also, I'd point out that even for strings, doing something like this means
wrapping them, because their empty isn't defined in a manner which works with
isSentinelRange. Unlike most other cases, the wrapper wouldn't have to keep
checking for empty (it could just guarantee that the string it was given ended
with 0 and then say that it's length was one less than the underlying string),
but you're still dealing with a wrapper and relying on the compiler to
optimize that cost away. Also, that totally screws with passing the range to
any functions which special case strings. Those functions would now also have
to special case the wrapper type if you wanted the extra efficiency that you get
from the function understanding that it's a string rather than a range of code
units or code points (depending on which the sentinel range is claiming it is
- probably code units in the case of the lexer).
So, I'm inclined to believe that we'd be better off just special casing strings
in any algorithms that can take advantage of this sort of thing than we would
be creating this sentinel range idea.
- Jonathan M Davis
More information about the Digitalmars-d
mailing list