[Issue 2255] AA.remove() doesn't remove AA element when iterating using foreach

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Fri Sep 19 14:54:17 PDT 2014


https://issues.dlang.org/show_bug.cgi?id=2255

hsteoh at quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
                 CC|                            |hsteoh at quickfur.ath.cx
         Resolution|---                         |WONTFIX

--- Comment #8 from hsteoh at quickfur.ath.cx ---
Modifying a container while iterating over it will always have counterintuitive
semantics (not to mention potential memory-unsafety), because it violates the
very idea of iteration from beginning to end. The only case where something
like this will even remotely work, is if (1) you only ever remove the current
element being iterated over; (2) the container is explicitly implemented to
support this operation.

Consider a simple linked list, root -> A -> B -> C -> D.

Intuitively speaking, the iteration order should be A, B, C, D. Now suppose
you're iterating over the list, and you're on item A. Suppose you delete item
A. Intuitively speaking, the next iteration of the loop should continue with B.
However, that isn't what happens, because when A was removed from the list, the
.next pointer is nulled, so your overall iteration is actually only A (instead
of A, B, C, D).

You may attempt to fix this by caching the .next pointer before executing the
loop body. However, this still exhibits counterintuitive behaviour: suppose
while you're iterating over A, you decide to delete B. Intuitively speaking,
the next iteration should be on C. However, since .next was cached, the next
iteration is actually on B, and since B was removed from the list and B.next
has been nulled, your overall iteration sequence is actually A, B (instead of
the expected A, C, D).

You may attempt to fix this by caching the entire iteration sequence in an
array before running the loop body on each item, but even that still doesn't
save you: if while you're iterating over A, you decide to delete C and D, then
your iteration sequence is still A, B, C, D (as opposed to the expected
sequence A, B), because C and D were cached in the array.

Now, this is just with linked lists alone, which, relatively speaking, is
rather tame. When you're dealing with AA's, a whole can o' worms is opened when
you modify the AA in the middle of iteration. For example, consider what
happens if a new key is inserted into the AA while you're iterating over it.
Should the new key be included in the current iteration over the AA, or not?
Well, regardless of what it *should* be, what actually happens will depend on
how it's implemented. If the new key hashes to a position past your current
iteration point, then it will be included in your current walk over the AA;
otherwise it won't. And by definition of hash functions, this is unpredictable.

Even worse yet, consider what happens if during iteration over an AA, you add a
new key, and that triggers a rehash. Now what happens will depend on the
implementation details of the AA. If you're lucky, you'll end up with the
current walk over the AA encountering items that were previously encountered,
and perhaps skipping over other items that should've been encountered
eventually. If you're unlucky, your loop will crash because the iteration
pointer got invalidated by the rehash. Same thing goes for deleting keys from
AA's while iterating over them.

Basically, there is no way to get intuitive behaviour out of iterating over a
container that's changing mid-iteration, except for the one case described in
the beginning -- which requires the container to explicitly support such an
operation.

The above analysis applies to virtually all containers, not just AA's, so I
submit that this bug is unfixable.

--


More information about the Digitalmars-d-bugs mailing list