[Issue 19511] New: Appender does not create a destructable array

Mon Dec 24 17:30:31 UTC 2018

https://issues.dlang.org/show_bug.cgi?id=19511

          Issue ID: 19511
           Summary: Appender does not create a destructable array
           Product: D
           Version: D2
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: phobos
          Assignee: nobody at puremagic.com
          Reporter: schveiguy at yahoo.com

std.array.Appender does not set up the array for GC destruction. While this is
a complicated issue to solve, the current behavior is certainly not what a user
would expect.

An example to demonstrate:

import std.array;
import std.stdio;
import core.memory;

__gshared bool incollect = false;

struct S
{
    int i;
    ~this()
    {
        if(incollect)
            writeln("S dtor ", i);
    }
}
S[] foo()
{
    Appender!(S[]) app;
    foreach(i; 0 .. 100)
        app ~= S(i);
    return app.data;
}
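void main()
{
    foreach(i; 0 .. 100)
    {
        // The returned array is discarded, so it is garbage right away.
        foo();
        // Only print from destructors that run during the collection.
        incollect = true;
        GC.collect();
        incollect = false;
    }
    // Leave printing enabled for anything destroyed after the loop.
    incollect = true;
}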

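The incollect flag suppresses the output of destructor calls that happen when
the stack-stored S's go out of scope, so only destructors run during a
collection can print.

In this case, we see no printouts, which means the GC never runs the
destructors of the elements in the collected arrays.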

--------------

The solution is not as obvious as it should be. The Appender is supposed to be
fast, faster even than the runtime's built-in appending. For this to work, it
needs to avoid any lookups or searches through the GC during normal operations.
But the reality is that the GC will only call the destructors if the block's
FINALIZE and APPENDABLE attributes are set and the array capacity is properly
set.
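
To make the difference visible, here is a minimal diagnostic sketch (not part
of the original report) that prints the block attributes and capacity of an
array allocated by the runtime next to one built through Appender. The helper
names blockAttrs and report are hypothetical; GC.query, GC.BlkAttr and the
capacity property are the existing druntime facilities. What it prints depends
on the druntime and Phobos version in use, so no output is shown.

```d
import core.memory : GC;
import std.array : appender;
import std.stdio : writefln;

struct S
{
    int i;
    ~this() {}
}

// Hypothetical helper: attribute bits of the GC block containing p.
// GC.query returns BlkInfo.init (attr == 0) for null or non-GC memory.
uint blockAttrs(const(void)* p)
{
    return GC.query(p).attr;
}

// Hypothetical helper: print the properties the GC relies on.
void report(string label, S[] arr)
{
    immutable attrs = blockAttrs(arr.ptr);
    writefln("%s: FINALIZE=%s APPENDABLE=%s capacity=%s", label,
        (attrs & GC.BlkAttr.FINALIZE) != 0,
        (attrs & GC.BlkAttr.APPENDABLE) != 0,
        arr.capacity);
}

void main()
{
    // Array allocated by the runtime.
    auto viaNew = new S[](4);

    // Array built through Appender.
    auto app = appender!(S[])();
    foreach (i; 0 .. 4)
        app ~= S(i);

    report("new S[]", viaNew);
    report("Appender", app.data);
}
```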

At the moment, we set the capacity of a passed-in GC array to the max
possible. While this guarantees any D-runtime based appending will reallocate,
it also makes the GC call destructors for all items in the memory block, even
ones we didn't initialize. Reallocations and new allocations, however, are not
set up this way.

A possible solution is to change the way we allocate the array to a normal
array allocation (i.e. new T[]) instead of using GC.qalloc, and then set it to
max capacity each time. While this is a bit more expensive per allocation,
we are growing in an exponential fashion, so allocations are rare and the
additional cost is kept small.

This will prevent GC lookups for appending or clearing the data, but still
provide GC destruction for elements that are in the array.
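
As a rough illustration of why exponential growth keeps the number of
allocations small, the sketch below counts the allocations needed to make room
for n elements. It assumes a starting capacity of 1 and plain doubling, which
is a simplification and not Appender's actual growth policy; the name
allocationsWhenDoubling is hypothetical.

```d
import std.stdio : writeln;

// Hypothetical helper: number of allocations needed to reach room for n
// elements, assuming a starting capacity of 1 that doubles on every growth.
size_t allocationsWhenDoubling(size_t n)
{
    size_t capacity = 1;
    size_t allocations = 1; // the initial allocation
    while (capacity < n)
    {
        capacity *= 2;
        ++allocations;
    }
    return allocations;
}

void main()
{
    // 100 elements: capacities 1, 2, 4, ..., 128, i.e. 8 allocations.
    writeln(allocationsWhenDoubling(100));
}
```

Under these assumptions the more expensive allocation path would be taken a
logarithmic number of times, not once per appended element.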

The other thing that must be done is to zero out items once they are destroyed,
so that the GC does not later call the destructor again on an element that
still looks fully initialized.
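
For reference, druntime's destroy already behaves this way for a single struct
element: it runs the destructor and then resets the element to its .init
value. The sketch below only demonstrates that behavior; it is not how
Appender is implemented, and the names dtorCalls and destroyAndInspect are
hypothetical.

```d
import std.stdio : writeln;

struct S
{
    static int dtorCalls; // counts destructor runs (thread-local)
    int i;
    ~this() { ++dtorCalls; }
}

// Hypothetical helper: destroys one element in place and returns the value
// of its `i` field afterwards.
int destroyAndInspect()
{
    auto arr = new S[](3);
    arr[1].i = 42;
    destroy(arr[1]); // runs ~this(), then resets arr[1] to S.init
    return arr[1].i;
}

void main()
{
    writeln(destroyAndInspect());
    writeln(S.dtorCalls);
}
```

If the GC later finalizes the block, the destructor runs once more on that
element, but it then sees only the .init state and not stale data.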

-------------

Another solution is to only initialize the runtime capacity and info when we
know the appender is done with the data. This means when leaving behind an
array (reallocating), and when we know there are no more appender references to
the data (e.g. ref count the impl struct). This may prove more efficient, but
we have to be more careful with this approach.
