Container insertion and removal

Sun Mar 7 08:18:36 PST 2010

On 2010-03-07 14:23:03 +0100, Steven Schveighoffer <schveiguy at yahoo.com> said:

> Robert Jacques Wrote:
> 
>> On Sat, 06 Mar 2010 21:54:50 -0500, Steven Schveighoffer
>> <schveiguy at yahoo.com> wrote:
>>> On Sat, 06 Mar 2010 11:19:15 -0500, Robert Jacques <sandford at jhu.edu>
>>> wrote:
>>> 
>>>> On Sat, 06 Mar 2010 08:19:36 -0500, Steven Schveighoffer
>>>> <schveiguy at yahoo.com> wrote:
>>>>> 
>>>>> How can softRemove not affect iterating ranges?  What if the range is
>>>>> positioned on the element removed?
>>>> 
>>>> It always affects ranges in so far as the range and container are
>>>> inconsistent, but that is a problem of softInsert as well. Removing an
>>>> element from an array doesn't invalidate the underlying range, since
>>>> the memory is still there. And so long as you're not trying to use free
>>>> lists, linked-lists, trees, etc. can be written so that the ranges
>>>> never enter an invalid state. If they happen to be pointing at the
>>>> removed node, they just end up stopping early.
>>> 
>>> If my linked list range has two node pointers as the implementation, and
>>> you remove the end node, it's going to end later, not early.
>> 
>> A linked list maps to a forward range: so the range simply points to a
>> single internal node. If you're going to store a pointer to the end, then
>> you should be using a doubly linked list, not a singly linked list.
> 
> How do you know when to stop?  A range has a beginning and an ending, 
> otherwise it's an iterator.  Whether you store it via a pointer to the 
> last (not-to-be-iterated) node, or via a length, or via a value to 
> compare with for stopping, you have to use something.  Or are you 
> asserting the only useful range for a linked list iterates the entire 
> list?
> 
> Think of it as the equivalent of a slice of an array.
> 
>> 
>>> Of course, you can get around this by just storing a current node and a
>>> length, but that may not be what the user is expecting.  For example, if
>>> you did a range of a sorted tree from "a" to "f" and it stops at "n", I
>>> think that would be extremely unexpected.
>> 
>> By this logic, any and all forms of mutation, not just insertions and
>> deletions must not be allowed.
> 
> Allowed, yes.  Not invalidating iterators, no.

I agree with this point of view, it makes sense to make some operations 
invalidating iteration, and even there there is a hierarcy of 
"invalidation":

1) all iterator are still valid, and iterate on the same set that they 
were iterating before. This is the best thing from a user, and is what 
persistent structures do, all the structures described in the book I 
gave have this property.
To achieve this you need to basically *always* create a copy when 
modifying a structure, but fortunately often there are ways to share 
most of the structure with the old value. Still this has a cost, and 
normally puts pressure on the memory/gc (all the nodes, or groups of 
them have to be single objects for the gc).

2) iterators are still valid, but they might iterate on a different set 
of elements than originally planned.
Array append with atomic update of elements gives this for example, 
removal in general has more problems, even if the memory might still be 
valid. Invariants of the array might be violated (for example strict 
ordering), one could separate this in two sub points, invariant 
violating and non, but probably it is not worth the effort.

3) iterators are fully invalid, and at worst they might iterate on 
completely wrong/invalid data without detection of failure, hopefully 
at least in debug mode detection should be an option.

>>> Stopping early is invalidation also IMO.  If your program logic depends
>>> on iterating over all the elements you originally intended to iterate,
>>> then we have big problems if they stop early.
>> 
>> Stopping early is the result of a logical error by the programmer. The
>> code itself, however, is completely valid.
> 
> I still fail to see the difference between "soft" operations and 
> non-soft.  What does soft guarantee?  Give me a concrete definition, an 
> example would help too.

I suppose soft should guarantee either 1 or 2, from what andrei was 
saying I suppose he wants 2

This information is very important for the user for example to iterate 
+manipulate (addition/removal), one doesn't necessarily need to do a 
copy with all containers.

In general one wants to have clean interface, + just one or two "good" 
implementations, I don't feel that there is a strong need to have many 
implementations of the basic containers in a standard library, (the 
other would be just implemented for a specific application, if needed). 
Then there might be different containers (db, distributed hashes, 
xml,...), and it would be nice if those can be as close as possible in 
some aspects of the api to the standard containers.
Again here generic algorithms are useful to augment the usefulness of 
special containers with a minimal coding effort, but should not 
"disallow" ad-hoc implementation in the container.