RFC on range design for D2

Steven Schveighoffer schveiguy at yahoo.com
Tue Sep 9 15:26:30 PDT 2008


"Andrei Alexandrescu" wrote
> Steven Schveighoffer wrote:
>> "Andrei Alexandrescu" wrote
>> <snip>
>>
>> Excellent ideas!  I think the best part is about how you almost never 
>> need individual iterators, only ever ranges.  Perfectly explained!
>>
>> One issue that you might not have considered is using a container as a 
>> data structure, and not using it for algorithms.  For example, how would 
>> one specify that one wants to remove a specific element, not a range of 
>> elements.  Having a construct that points to a specific element but not 
>> any specific end element might be a requirement for non-algorithmic 
>> reasons.
>
> I'm sure you know and imply this, but just to clarify for everybody: 
> Modifying the topology of the container is a task carried by the 
> primitives of the container. Ranges can "look" at the topology and change 
> elements sitting in it, but not alter the topology.

I agree.  There are cases where just an iterator is necessary.  For example, 
a linked-list implementation where the length is calculated instead of 
stored.  But that is the exception, the rule should be that you always ask 
the container to alter the topology.

>
> Much like in STL, there's a container primitive for removing a range. It 
> will return a range too, namely the range starting at the deleted 
> position. Removing an element is really removing a range of one element - 
> just a particular case.

Yes, but the problem I see is how do you specify a range of one element.  In 
the case of an array, it is easy because you always know that no matter what 
happens to a container, the end of '1' element is a pointer to the next 
element in memory.  In the case of other containers, which could have 
changed since you obtained the range, you cannot be sure the 'range of 1' 
hasn't changed.  For instance, I would assume that a linked list range has 
two pointers, one to the first element, and one to the element just past the 
last element in the range.  But what if an element is inserted inbetween? 
Then your range suddenly got bigger.

What I'm trying to say is, maybe it would be desirable to have a pointer to 
exactly one element, instead of a range that could possibly change. 
Operations on that pointer type would be supported just like the operations 
on the ranges, but is more specific.

>
>> Also, some ranges might become invalid later on whereas the iterators 
>> would not.  Take for example a Hash container.  If you have to rehash the 
>> table, a range now might not make any sense, as the 'end' element may 
>> have moved to be before the 'begin' element.  But an iterator that points 
>> to a given element will still be valid (and could be used for removal). 
>> In fact, I don't think ranges make a lot of sense for things like Hash 
>> where there isn't any defined order.  But you still should have a 
>> 'pointer' type to support O(1) removal.
>
> Iterators can be easily defined over hashtables, but indeed they are 
> easily invalidated if implemented efficiently.

Iterators don't have to be invalidated, but ranges would.

>
>> One doesn't see any of these problems with arrays, because with arrays, 
>> you are guaranteed to have contiguous memory.
>>
>> What I'm trying to say is there may be a reason to have pointers for 
>> certain containers, even though they might be unsafe.
>
> A pointer to an element can be taken as &(r.first). The range may or may 
> not allow that, it's up to it.

I thought you stated that 'pointers' shouldn't be allowed, only ranges?  In 
general, I agree with that, but I think the ability to use a pointer type 
instead of ranges has advantages in some cases.

>
>> My personal pet peeve of many container implementations is not being able 
>> to remove elements from a container while iterating.  For example, I have 
>> a linked list of open file descriptors, iterate over the list, closing 
>> and removing those that are done (which should be O(1) per removal).  In 
>> many container implementations, iteration implies immutable, which means 
>> you have to add references to the elements to remove to another list to 
>> then remove afterwards (somtimes at the cost of O(n) per removal. 
>> grrrr.)  I hope ranges will support removal while traversing.
>
> In STL removing from a list while iterating is easy and efficient, albeit 
> verbose as always:
>
> list<Filedesc> lst;
> for (list<Filedesc>::iterator i = lst.begin(); i != lst.end(); )
> {
>     if (should_remove) i = lst.erase(i);
>     else ++i;
> }
>
> In Phobos things will be something like:
>
> List!(Filedesc) lst;
> for (auto r = lst.all; !r.isEmpty; )
> {
>     if (should_remove) r = lst.erase(take(1, r));
>     else r.next;
> }

I prefer the dcollections syntax, but I did write it, so that's to be 
expected :) :

foreach(ref doPurge, fd; lst.purger)
   doPurge = shouldIRemove(fd);

Or if you prefer iterator-style syntax:

for(auto i = lst.begin; i != lst.end;)
{
    if shouldIRemove(i.value) i = lst.remove(i);
    else ++i;
}

But yes, I see that ranges are used for removal, and that should be 
supported in ordered containers.  But the notion of storing a reference to a 
single element as a 'range of 1' in certain containers is troublesome I 
think.

-Steve 




More information about the Digitalmars-d-announce mailing list