RFC on range design for D2

Thu Sep 11 04:25:08 PDT 2008

Sergey Gromov wrote:
> Benji Smith <dlanguage at benjismith.net> wrote:
>> Sergey Gromov wrote:
>>> Benji Smith <dlanguage at benjismith.net> wrote:
>>>> Bill Baxter wrote:
>>>>> Hmm, for FSMs you can't really define a good end state.  There may not
>>>>> be any particular end state. ... ah, but wait I forgot.  That's the
>>>>> beauty of a range -- the end "state" doesn't have to be a "state" per
>>>>> se.  It can be any predicate you want it to be.  "Range" is misleading
>>>>> in this case.  This is one of those cases where you just have to
>>>>> remember "range" means "current value plus stopping criterion".
>>>> That's what I was saying earlier.
>>>>
>>>> I think the mechanics are good. And for contiguous, sequential 
>>>> containers, the word "range" is great. For other types of containers, or 
>>>> other selection/iteration scenarios, you can shoehorn your mental model 
>>>> into the "range" metaphor. But it's weird.
>>> It seems to me like a misuse of ranges.  Do you really want to iterate 
>>> over a state machine?  FSM is a mailbox with a 'message' hole.  You put 
>>> messages into it and it does things.  How do you iterate over a mailbox?
>> Well, no. Not when you put it like that.
>>
>> The example I posted earlier went something like this:
>>
>>     MarkovModel<ApplicationState> model = ...;
>>     for (ApplicationState state : model) {
>>        state.doStuff();
>>     }
>>
>> It's not a bad abstraction. The model handles all of the semantics of 
>> calculating the transition probabilities and selecting the next state, 
>> so that the foreach loop doesn't have to muss with those details.
>>
>> Yeah, it's a total misuse of the "range" metaphor, and that's exactly 
>> what I'm saying. In Java, where I implemented this project, an 
>> "iterator" is a tiny bit of logic for returning objects in a 
>> (potentially endless, potentially reversible) sequence, primarily to 
>> support looping constructs. Just because there's no underlying range of 
>> objects doesn't mean they're not iterable.
>>
>> Of course, Java iterators are *much* more limited constructs than these 
>> new D ranges. But I still think the concept has merit. And, like you 
>> said, calling them ranges makes them seem stupid. Because they're not 
>> ranges.
> 
> Well, if you get an object out of there on every step, and that object 
> source can exhaust at some point, then the abstraction is correct.
> 
> I agree that an input range is actually an arbitrary bounded iterator.  
> But you also must agree that a random access iterator in C++ is actually 
> an unbounded array.  You always can invent a better name for any 
> particular case.  But C++ keeps calling them iterators to pronounce 
> generocity and emphasize interchangeability to some degree.  You may not 
> notice that calling a string pointer an iterator is a bit awkward and 
> misleading, because you get used to it and learned to think that way.
> 
> There's no difference with ranges.  Some of them are actual ranges, some 
> are not, some are plain abstractions.  You need to learn to think in 
> ranges to use them naturally.  This is true for any new language you're 
> learning, programming or human, if you want to use them efficiently.

I agree 100%, and also with Sergey's other post that some abstractions 
simply don't fit the range charter, or don't fit it naturally, or are 
not expressive enough for some rich iteration abstraction.

Maybe ranges are lousy for an HMM, but then does look like I want to run 
a host of generic algorithms against an HMM? That IS the question. 
Probably not. HMMs have their own algorithms, and I wouldn't think of 
finding/sorting/partitioning an HMM just as I wouldn't think of applying 
Viterbi to an array.

What I wanted was to make sure ranges are appropriate as higher-level 
abstractions that can replace STL-like iterators. My experience shows 
that they can. Not on 100% of occasions have they been a superior 
replacement, but I'm looking at a solid 80s at least. Add to that the 
advantage of better generators (which iterators make unpalatable because 
of the unsightly dummy end() requirement). When I also add the safety 
advantage of sinks (no more buffer overruns!!!), I feel we have a huge 
winner.

Of course, that doesn't mean ranges should be the be-all end-all of 
iteration.

This discussion reminds me of the "iterator craze" around 2000. People 
were discovering STL iterators and were trying to define and use the 
weirdest iterators. I remember distinctly a one-page ad on Dr. Dobb's 
Journal. They were looking for article writers. They mentioned the 
upcoming themes (e.g. networking, security, patterns, C++...). There was 
a _specific_ note: "Not interested in yet another iterator article".

That being said, damn I wish I had the time to make RegEx faster AND 
operating on input ranges...

Andrei