Safe to remove AA elements while iterating over it via .byKeyValue?

Steven Schveighoffer schveiguy at gmail.com
Mon Sep 28 14:58:15 UTC 2020


On 9/27/20 4:43 PM, Per Nordlöw wrote:
> On Sunday, 27 September 2020 at 14:23:11 UTC, H. S. Teoh wrote:
>> No.  Modifying a container while iterating over it is, in general, a 
>> bad idea (unless the container is designed to be used that way, but 
>> even then, such removal is generally restricted), because it often 
>> leads to highly counterintuitive results.  In the case of AA's, 
>> removing an element may lead to a rehashing, which reorders elements, 
>> and your iteration may miss some elements or repeat some elements or 
>> terminate prematurely. Even without a rehashing, you may encounter 
>> inconsistent behaviours, like some elements going "missing".
> 
> I believe it's high time we start thinking about detecting these 
> violations at compile-time. I recall it's in the spec somewhere so we 
> should start a deprecation process at least for AAs.

So the major problem of removing keys while iterating is that the AA can 
decide to rehash and shrink. If it does this, your iterator is 
invalidated -- you might skip elements that are actually in the AA, and 
you might encounter elements that have already been iterated. 
Essentially, the ordering has completely been shuffled. Aside from that, 
there's the performance aspect that you have to re-run the hash lookup 
for no reason (you literally have a pointer to the element you are 
iterating).

How do you "detect" this at compile time?

One could write a specific function to iterate and remove. I did it for 
dcollections (this was actually a very important functionality that I 
missed from C++ std::map, which is why I created dcollections in the 
first place).

I don't think it's worth worrying about this at compile time, or even at 
runtime. Just identify that if you do it, you are subject to peculiarities.

Then, we can introduce a specific mechanism to do this (total strawman, 
have no idea what the best thing looks like):

auto r = aa.byKeyValuePurging;
while(!r.empty)
{
    if(shouldPurge(r.front.key, r.front.value)) r.popFrontAndDelete;
    else r.popFront;
}

where popFrontAndDelete will move to the next element, remove the prior 
element, and ensure that a rehash does not occur.

Again, this is not a formal proposal. Just something like this could be 
possible.

In dcollections, I used a trick of opApply to provide the boolean of 
whether to remove as a parameter to foreach:

foreach(k, v, ref doPurge; &container.purge)
    doPurge = shouldPurge(k, v);

-Steve


More information about the Digitalmars-d-learn mailing list