Proposal for SentinelInputRange
Jonathan M Davis
jmdavisProg at gmx.com
Wed Feb 27 23:55:11 PST 2013
On Wednesday, February 27, 2013 23:33:09 Walter Bright wrote:
> On 2/27/2013 9:28 PM, Jonathan M Davis wrote:
> > But you have to deal with D strings, not C strings if you're dealing with
> > ranges. char* isn't a range. So, unless you're talking about wrapping a
> > char* in a range, char* isn't going to work. And simply appending 0 to
> > the end of a D string isn't enough, because isSentinelInputRange would
> > fail, because std.array.empty doesn't match it. So, you need a wrapper
> > even if it's only to pass the template constraint. That being the case,
> > regardless of whether you're dealing with char* or string, you need a
> > wrapper.
>
> Again, please see how lexer.c works. I assure you, there is no double
> copying going on, nor is there a double test for the terminating 0.
I know what the lexer does, but remember that it _doesn't_ operate on ranges,
and there are subtle differences between being able to just use char* and
trying to handle generic ranges.
And no, the lexer doesn't have a double test. Where you're going to be stuck
with a double test is with most any range which isn't a string. Such ranges
won't have sentinel values, and there's no way to add them (as you really
can't append to ranges), so they'll end up being wrapped in a SentinelRange.
That wrapper will have to check on each popFront whether the wrapped range is
now empty, so that front can then return the sentinel value. And most any
range which _was_ designed to have a sentinel value would have to be managing
its own contents (because otherwise, it would just be back to wrapping a range
and having to check empty), which likely means that it'll just be a thin
wrapper around a string or array anyway.
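To make the double test concrete, here is a rough sketch of what such a generic wrapper might look like. The layout of SentinelRange, its constructor, and the `exhausted` flag are all assumptions made for illustration, not anything taken from an actual proposal.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

// Hypothetical wrapper (an assumption, not an existing Phobos type): makes an
// arbitrary input range look as though it ends in a sentinel value.
struct SentinelRange(R)
    if (isInputRange!R)
{
    private R source;
    private ElementType!R sentinel;
    private bool exhausted; // true once the wrapped range has run out

    this(R source, ElementType!R sentinel)
    {
        this.source = source;
        this.sentinel = sentinel;
        exhausted = this.source.empty;
    }

    // First test: front has to pick between a real element and the sentinel.
    @property ElementType!R front()
    {
        return exhausted ? sentinel : source.front;
    }

    // Second test: every popFront still has to ask the wrapped range whether
    // it is empty, which is the check the sentinel was meant to avoid.
    void popFront()
    {
        assert(!exhausted, "popFront called past the sentinel");
        source.popFront();
        exhausted = source.empty;
    }

    @property bool empty() { return exhausted; }
}

unittest
{
    auto r = SentinelRange!(int[])([3, 1, 4], 0);
    int sum;
    // The consumer only compares front against the sentinel.
    for (; r.front != 0; r.popFront())
        sum += r.front;
    assert(sum == 8);
}
```

Note that this sketch also misbehaves if the wrapped range legitimately contains the sentinel value, since the consumer would stop early.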
Strings will still need to be wrapped, because they won't pass isSentinelRange
otherwise, but they won't get any extra checks, because the wrapper can just
check for 0 on the end and append it if it's not there.
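By contrast, a string wrapper could look roughly like the following. This is only a sketch under some assumptions: the name SentinelString and its members are made up for illustration, and it iterates over code units (char) rather than decoded code points.

```d
// Hypothetical thin wrapper over a string (the name and members are
// assumptions). It works on code units, not decoded dchars.
struct SentinelString
{
    private string data; // always ends in '\0' after construction
    private size_t pos;

    enum char sentinel = '\0';

    this(string s)
    {
        // One-time cost: append the terminator only if it is missing.
        data = (s.length > 0 && s[$ - 1] == sentinel) ? s : s ~ sentinel;
    }

    // No per-element emptiness bookkeeping is needed here.
    @property char front() const { return data[pos]; }
    void popFront() { ++pos; }
    @property bool empty() const { return data[pos] == sentinel; }
}

unittest
{
    auto r = SentinelString("a+b");
    size_t n;
    while (r.front != SentinelString.sentinel)
    {
        ++n;
        r.popFront();
    }
    assert(n == 3);
}
```

An embedded 0 in the middle of the string would end the iteration early, which is the usual trade-off with sentinel-terminated data.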
> > So, why not just special case strings or arrays in the few situations
> > where something like this is needed, especially when it would be so easy
> > to do?
>
> Sentinels structure the code differently.
Given how a lexer works (and I have been working on a lexer off and on
recently), the only real difference is that you'd just use a couple of static
ifs like
    static if(!isSomeString!R)
    {
        if(range.empty)
            break; //or whatever you do at the end
    }

    static if(isSomeString!R)
    {
        case 0:
            break; //or whatever you do at the end
    }
So, in the case of a lexer, I don't see sentinel ranges as buying us much. You
end up having to wrap most any range that you pass to the lexer or whatever
(including strings so that they'll pass isSentinelRange), you lose out on any
optimizations of any functions that you call which special-case strings
(though there probably wouldn't be many of those in a lexer), and all you
avoid is a couple of static ifs.
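For reference, the static if approach could be fleshed out along these lines. It is a minimal sketch, not a real lexer: countDigits is a made-up function that just counts ASCII digits, and the string path assumes it is acceptable to append a terminating 0 when one isn't already there.

```d
import std.range.primitives : empty, front, isInputRange, popFront;
import std.traits : isSomeString;

// Hypothetical stand-in for a lexer loop: counts ASCII digits in the input.
size_t countDigits(R)(R input)
    if (isInputRange!R)
{
    static if (isSomeString!R)
    {
        // Strings: make sure a terminating 0 exists, then index code units.
        if (input.length == 0 || input[$ - 1] != 0)
            input ~= '\0';
        size_t i = 0;
    }

    size_t count = 0;
    loop: while (true)
    {
        static if (isSomeString!R)
        {
            auto c = input[i];
        }
        else
        {
            // Generic ranges: the explicit empty check stays.
            if (input.empty)
                break;
            auto c = input.front;
        }

        switch (c)
        {
            static if (isSomeString!R)
            {
                // Strings: the sentinel case replaces the empty check.
                case 0:
                    break loop;
            }
            case '0': .. case '9':
                ++count;
                break;
            default:
                break;
        }

        static if (isSomeString!R)
            ++i;
        else
            input.popFront();
    }
    return count;
}

unittest
{
    import std.utf : byCodeUnit;
    assert(countDigits("a1b22") == 3);            // string path
    assert(countDigits("a1b22".byCodeUnit) == 3); // generic range path
}
```

The body of the switch is shared between both paths; only the end-of-input handling and the way the next element is fetched differ.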
The idea of sentinels certainly isn't useless, but anything caring about that
sort of speed is likely to just use strings or arrays, and those can trivially
be special cased to avoid unnecessary empty checks and to add the check for
the sentinel, making the whole sentinel range idea an unnecessary complication
IMHO.
- Jonathan M Davis