RFC on range design for D2
Benji Smith
dlanguage at benjismith.net
Thu Sep 11 02:42:29 PDT 2008
Bill Baxter wrote:
> On Thu, Sep 11, 2008 at 1:31 PM, Benji Smith <dlanguage at benjismith.net> wrote:
>> Parsers & regex engines move both forward and backward, as they try to match
>> characters to a pattern.
>>
>> Really, anything that uses an NFA or DFA to define patterns would benefit
>> from fast bidirectional iteration...
>
> Good call. I was about to post something mentioning that Turing
> machines but that seemed too academic. Same class of thing as
> NFA/DFA/FSM.
>
> The question is, though, would you really implement those things using
> a linked list? I would expect most of those suckers work on arrays,
> and so can take advantage of the bidirectional nature of random access
> ranges.
Actually, Perl 6 will (assuming they ever finish it) finally allow regex
matching against input streams:
http://dev.perl.org/perl6/doc/design/syn/S05.html
It's a big document. Search for the text "Matching against non-strings"
This kind of thing was one of the main arguments I made in my "Why
Strings as Classes" thread, that got everyone's panties in a bunch and
that no one else agreed with.
In that thread, I argued that Strings should be objects so that they can
implement a CharSequence interface (or something like it). And then all
the standard text-processing stuff could be written against that
interface, allowing regex engines and parsers to be agnostic about
whether they read from an actual in-memory string or from a
(file|database|socket) input stream.
With the range proposal on the table, I'd be just as happy if all the D
text-processing stuff in the standard libs was implemented against a
Range!(T), where T is one of the char types. Especially if ranges can be
infinite.
Bill Baxter wrote:
> Hmm, for FSMs you can't really define a good end state. There may not
> be any particular end state. ... ah, but wait I forgot. That's the
> beauty of a range -- the end "state" doesn't have to be a "state" per
> se. It can be any predicate you want it to be. "Range" is misleading
> in this case. This is one of those cases where you just have to
> remember "range" means "current value plus stopping criterion".
That's what I was saying earlier.
I think the mechanics are good. And for contiguous, sequential
containers, the word "range" is great. For other types of containers, or
other selection/iteration scenarios, you can shoehorn your mental model
into the "range" metaphor. But it's weird.
--benji
More information about the Digitalmars-d-announce
mailing list