memory management of array

Fri Dec 19 08:43:49 PST 2008

== Quote from ZHOU Zhenyu (rinick at goozo.net)'s article
> This is a multi-part message in MIME format
> ------------19106828741229702115
> Content-Type: text/plain
> I use D to solve the programming problems  on projecteuler (
http://projecteuler.net/ )
> I thought D should be as fast as C++, but it turns out that sometimes D is much
slower.
> It seems that array would reallocate its memory every time the  ~= operation is
invoked.
> If I push 10^7 numbers into an array, the realloc would be called for 10^7 times.
> That would be at least 100 times slower than C++STL.
> So I don't use ~= and manage the length of the array myself
> When the length is not enough, I reallocate the memory with the size of
requiredSize * 9 / 8. ( In STL, it's oldSize *  2. )
> 9/8 could save a lot of memory, and the attached source code shows that *9/8 is
almost as fast as *2 .
> My question is:
> Is there any other way to solve this problem?
> I think this kind of optimization should be implemented in the compiler or the D
runtime.

Internally, dynamic arrays in D are represented by something like:

struct Array(T) {
    T* ptr;
    size_t length;
}

When the ~= operator is invoked, there is no fast way to get the capacity of the
array.  Therefore, the GC is queried for this information.  This can be very slow
since, if the program is multithreaded, interacting with the GC means waiting for
a lock.  Furthermore, even in single-threaded programs, this is not particularly
efficient.  The GC (I believe, I haven't looked at that part of the code since the
old Phobos GC was replaced with druntime) somewhat compensates by caching the
capacity of the last pointer queried, and this helps to some degree as long as
you're only appending to one array in a single-threaded program.

There are basically three solutions to this, each with its pros and cons:

1.  Allocate the whole array at once and keep track of the last position used.  If
you know in advance how big it is going to be, or at least a reasonable upper
bound, and the code is performance-critical, this is a good idea anyway.  However,
it's a bad solution if you don't know this.
2.  Use something like an ArrayBuilder struct that caches the capacity.  The core
of the code would be something like:

void opCatAssign(T elem) {
    if(array.length + 1 < capacity) {
        array = array.ptr[0..$ + 1];
        array[$ - 1] = elem;
    } else {
        array ~= elem;
        capacity = GC.sizeOf(array.ptr) / T.sizeof;
    }
}

The problems with this are that it's a little bit ugly and that there's no
standard ArrayBuilder struct.  People have written them before, and it's very easy
to write your own just the way you like it, but a standard version in Phobos/Tango
would be nice.

3.  Include the capacity field in the internal array struct.  The problems with
this are that it would require changes to the compiler and the ABI and that it
would be inefficient in a lot of cases where the appending functionality is not used.