[Issue 4179] [AA] Deleting items from an associative array iterated over

d-bugmail at puremagic.com d-bugmail at puremagic.com
Tue Dec 17 17:24:35 PST 2013


https://d.puremagic.com/issues/show_bug.cgi?id=4179



--- Comment #11 from hsteoh at quickfur.ath.cx 2013-12-17 17:24:18 PST ---
See discussion on issue #10821.

Modifying a container while iterating over it is tricky business; generally, it
leads to a lot of counterintuitive behaviours. This is not just specific to
AA's; it applies to many containers that are not designed to support such an
operation. For example, let's say we have linked list that contains the
following elements:

    HEAD --> A --> B --> C --> D --> E --> null

Under a normal iteration scheme, the iteration order would be A, B, C, D, E.
Now, suppose we iterate over this list and when we're at C, we remove it from
the list. The list now looks like this:

    HEAD --> A --> B --> D --> E --> null

Now, in theory, the next element in our iteration should be D, since it comes
after C. However, since the current pointer points to C, and the list .remove
method has set it to null (because it has relinked D to follow B), pointer.next
is now null, and our iteration stops prematurely.

Suppose we solve this problem by keeping a copy of pointer.next before we enter
the loop body. Then the above case works, but a different case fails: suppose
when we're at C, and we delete D. The list then becomes:

    HEAD --> A --> B --> C --> E --> null

In theory, the next step of our iteration should be E. But since we kept a copy
of the original value of C.next, it is still pointing at D, so in the next
iteration of our loop, we will end up at D, even though it has already been
deleted from the list. Worse yet, D.next has been set to null, because it's no
longer in the list (E is now the successor of C). So then our iteration stops
after D, and we get the counterintuitive iteration sequence A, B, C, D instead
of the expected A, B, C, E.

These are just two simple cases illustrating the counterintuitiveness of
modifying a container as you iterate over it. There are many other such cases
(I'm sure you can think of more quite easily). Even with a simple structure
like linked lists, there's already strange cases and odd behaviours. Now
consider what happens if you're in the middle of iterating over an AA, and then
you decide to add/remove a key, and it just so happens to cause a rehash to
happen. Now all of your .next pointers are wrong, and your subsequent iteration
will basically end up in a strange random order that may skip some elements or
see elements that should have already been deleted. Rehash is a pretty drastic
example, but even with a simple insert, strange things can happen: if the new
key is hashed to a bucket past the point of your current iteration, then your
iteration will see the new key eventually; but otherwise, you'll miss the new
key. Since the hash function is random, this means you'll sometimes see newly
inserted keys in your iteration, sometimes you won't, and this is
unpredictable.

OK, you say, so instead of using .byKey, let's use .keys, which creates a copy
of the current set of keys, then use that to iterate over the AA. This will
make it resistant to odd behaviour caused by AA modifications during iteration.
But if you delete some keys from the AA, the array of keys that you're
iterating over isn't updated, so you may end up getting RangeErrors when you
get to those stale keys.

Basically, if you allow arbitrary modification of the container while you
iterate over it, you will always have strange cases that cause unexpected (or
counterintuitive) behaviours.

With respect to deleting keys when iterating over a container, there's only one
case where "intuitive" behaviour is guaranteed: the container must be designed
to support deletion during iteration, and deletion is restricted to only the
current element in the iteration. In this case, the rest of the container
remains (conceptually) unchanged, so as long as the container itself has the
necessary support mechanisms for this operation, it will result in the expected
behaviour (your iteration won't stop prematurely, you won't see already-deleted
items, etc.).

If the container does not support such operations, or if you delete something
other than the current element in the iteration, then there will always be
cases where it exhibits counterintuitive behaviour. You can only get around
this with workarounds like maintaining a list of postponed adds/deletes, which
get applied after the iteration is finished. But any time you directly modify a
container while iterating over it, be prepared to handle "odd" behaviours.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------


More information about the Digitalmars-d-bugs mailing list