Container insertion and removal

Mon Mar 8 10:01:32 PST 2010

Fawzi Mohamed wrote:
> On 2010-03-07 14:23:03 +0100, Steven Schveighoffer <schveiguy at yahoo.com> 
> said:
> 
>> Robert Jacques Wrote:
>>
>>> On Sat, 06 Mar 2010 21:54:50 -0500, Steven Schveighoffer
>>> <schveiguy at yahoo.com> wrote:
>>>> On Sat, 06 Mar 2010 11:19:15 -0500, Robert Jacques <sandford at jhu.edu>
>>>> wrote:
>>>>
>>>>> On Sat, 06 Mar 2010 08:19:36 -0500, Steven Schveighoffer
>>>>> <schveiguy at yahoo.com> wrote:
>>>>>>
>>>>>> How can softRemove not affect iterating ranges?  What if the range is
>>>>>> positioned on the element removed?
>>>>>
>>>>> It always affects ranges in so far as the range and container are
>>>>> inconsistent, but that is a problem of softInsert as well. Removing an
>>>>> element from an array doesn't invalidate the underlying range, since
>>>>> the memory is still there. And so long as you're not trying to use 
>>>>> free
>>>>> lists, linked-lists, trees, etc. can be written so that the ranges
>>>>> never enter an invalid state. If they happen to be pointing at the
>>>>> removed node, they just end up stopping early.
>>>>
>>>> If my linked list range has two node pointers as the implementation, 
>>>> and
>>>> you remove the end node, it's going to end later, not early.
>>>
>>> A linked list maps to a forward range: so the range simply points to a
>>> single internal node. If you're going to store a pointer to the end, 
>>> then
>>> you should be using a doubly linked list, not a singly linked list.
>>
>> How do you know when to stop?  A range has a beginning and an ending, 
>> otherwise it's an iterator.  Whether you store it via a pointer to the 
>> last (not-to-be-iterated) node, or via a length, or via a value to 
>> compare with for stopping, you have to use something.  Or are you 
>> asserting the only useful range for a linked list iterates the entire 
>> list?
>>
>> Think of it as the equivalent of a slice of an array.
>>
>>>
>>>> Of course, you can get around this by just storing a current node and a
>>>> length, but that may not be what the user is expecting.  For 
>>>> example, if
>>>> you did a range of a sorted tree from "a" to "f" and it stops at "n", I
>>>> think that would be extremely unexpected.
>>>
>>> By this logic, any and all forms of mutation, not just insertions and
>>> deletions must not be allowed.
>>
>> Allowed, yes.  Not invalidating iterators, no.
> 
> I agree with this point of view, it makes sense to make some operations 
> invalidating iteration, and even there there is a hierarcy of 
> "invalidation":
> 
> 1) all iterator are still valid, and iterate on the same set that they 
> were iterating before. This is the best thing from a user, and is what 
> persistent structures do, all the structures described in the book I 
> gave have this property.
> To achieve this you need to basically *always* create a copy when 
> modifying a structure, but fortunately often there are ways to share 
> most of the structure with the old value. Still this has a cost, and 
> normally puts pressure on the memory/gc (all the nodes, or groups of 
> them have to be single objects for the gc).
> 
> 2) iterators are still valid, but they might iterate on a different set 
> of elements than originally planned.
> Array append with atomic update of elements gives this for example, 
> removal in general has more problems, even if the memory might still be 
> valid. Invariants of the array might be violated (for example strict 
> ordering), one could separate this in two sub points, invariant 
> violating and non, but probably it is not worth the effort.
> 
> 3) iterators are fully invalid, and at worst they might iterate on 
> completely wrong/invalid data without detection of failure, hopefully at 
> least in debug mode detection should be an option.

These are great thoughts. We need to explore this niche because STL has 
a black-and-white approach to invalidation - once invalidated, behavior 
is undefined. D could get much more nuanced than that.

Perhaps for the beginning we could just say that invalidation entails 
implementation-specific behavior.

>>>> Stopping early is invalidation also IMO.  If your program logic depends
>>>> on iterating over all the elements you originally intended to iterate,
>>>> then we have big problems if they stop early.
>>>
>>> Stopping early is the result of a logical error by the programmer. The
>>> code itself, however, is completely valid.
>>
>> I still fail to see the difference between "soft" operations and 
>> non-soft.  What does soft guarantee?  Give me a concrete definition, 
>> an example would help too.
> 
> I suppose soft should guarantee either 1 or 2, from what andrei was 
> saying I suppose he wants 2

I think softXxx functions that add data guarantee that existing ranges 
continue iterating the new collection correctly. Depending on where 
insertion was effected, iteration may or may not span the newly-inserted 
elements. For example, inserting elements in a linked list does preserve 
ranges so a linked list can afford to define softInsert and softPrepend.

softXxx functions that remove data should guarantee that existing ranges 
not including the removed elements will continue to operate on the range 
without seeing the removed elements. There is no guarantee that soft 
removal offers for ranges spanning the removed elements. (I'm unclear on 
this last topic.)

Andrei