to invalidate a range

Jonathan M Davis jmdavisProg at gmx.com
Fri Aug 12 16:34:11 PDT 2011


On Friday, August 12, 2011 16:16 Ellery Newcomer wrote:
> On 08/12/2011 05:51 PM, Jonathan M Davis wrote:
> > An implementation can guarantee it as long as your range doesn't directly
> > point to an element being removed (i.e. as long as the element isn't on
> > the ends - or maybe one past the end, depending on the implementation).
> > But _no_ container can guarantee that an iterator or range which
> > directly references an element which is removed is going to stay valid -
> > not without playing some serious games internally which make iterators
> > and ranges too inefficent, and possibly not even then. So, stableRemove
> > is only going to guarantee that a range stays valid on as long as the
> > end points of that range aren't what was being removed.
> 
> Forgive my being dense, but where is this 'as long as' coming from? If
> your range only points to ends in e.g. a linked list, how is it supposed
> to retrieve elements in the middle? I'm having a hard time visualizing a
> range over a node based container that doesn't point to a node in the
> middle (at some point in time). The range points to the node to retrieve
> the current front quickly, the node can get removed, the removed
> function won't know its invalidating the range without playing yon
> internal games, ergo stable remove cannot be.

Are you familiar with iterators? This will be a lot easier if you are. An 
iterator points to one element and one element only. In C++, you tend to pass 
around pairs of iterators - one pointing to the first element in a range of 
elements and one pointing to one past the end. You then usually iterate by 
incrementing the first iterator until it equals the second.

Ranges at their most basic level are a pair of iterators - one pointing to the 
first element in the range and one pointing either to the last element or one 
past that, depending on the implementation. The range API and concept is 
actually much more flexible than that, allowing us to implement stuff like the 
fibonacci sequence as a range, but when it comes to containers, they're almost 
certainly doing what C++ by using two iterators, except that it's wrapped them 
so that you don't ever have to worry about them pointing to separate 
containers or the first iterator actually being past the second one. Wrapping 
them as a range makes it much cleaner, but ultimately, for containers at 
least, you're still going to have those iterators internally.

So, front returns the element that the first iterator points to and popFront 
just increments that iterator by one. If the range has back, then back points 
to the last element of the range (though the internal iterator may point one 
elment past that), and popBack decrements that iterator by one. It doesn't 
directly refer to _any_ elements in the middle.

So, in a node-based container, adding or removing elements in the middle of 
the range will have no effect on the internal iterators. It'll have an effect 
on what elements you ultimately iterate over if you iterate over the range 
later, but the two iterators are still completely valid and point to the same 
elements that they always have. However, if you remove either of the elements 
that the iterators point to, _then_ the range is going to be invalidated, 
because its iterators are invalid. They don't point to valid elements in the 
container anymore.

In a contiguous container, such as Array, adding or removing elements is more 
disruptive, since the elements get copied around inside of the contiguous 
block of memory, and while the iterators may continue to point at the same 
indices as before, the exact elements will have shifted. So, they're still 
valid in the sense that they point to valid elements, but they don't point to 
the same elements. The two places where they end up no longer pointing to 
valid elements are when you remove enough elements that one or both iterators 
points at an index which is greater than the size of the container and when 
the container has to be reallocated (typically when you append enough elements 
that it no longer has the capacity to resize in place). So, in general, 
altering contiguous containers, such as Array, risks invalidating all 
iterators or ranges which point to them, whereas with node-based containers 
it's only when the end point of a range gets removed that you run into that 
kind of trouble.

Really, to understand range invalidation, you probably should have a fair 
understanding of iterators. But again, as long as you don't keep any ranges 
around when you alter a container by adding or removing elements from it, you 
don't have anything to worry about.

- Jonathan M Davis


More information about the Digitalmars-d-learn mailing list