Array append performance (Was: Re: Lisp like lists and a problem with TANGO fromStringz)

Mon Mar 3 10:05:42 PST 2008

bearophile wrote:
> To: Newsgroups: digitalmars.D
> Subject: Re: Lisp like lists and a problem with TANGO fromStringz
> 
> Oskar Linde>Here is a way to reserve space for arrays:<
> 
> I did try a function equal to that one, but have you timed it? My timing results don't show any advantage of using such reserve.

There are more advantages to do it than speed. Memory fragmentation and 
wastage for example.

<snip>

> OUTPUT TIMINGS:
> 
> DMD 1.026, N = 12_000_000:
>   append:            3.3 s
>   reserve + append:  3.3 s
>   allocate + assign: 1.0 s

It is hard to draw too many conclusions from such a simple test. For one 
thing, the DMD GC will on array appends try to grow the allocated GC 
chunk without reallocating. In an application such as this (with 
virtually no memory fragmentation), there is a good chance that it is 
able to do that almost all the time.

DMD/GDC also seems unable to inline the _d_arrayappendcTp call which 
would be one reason for the rather poor performance overall.

There are cases where the reallocation helps though, even though the 
performance overall is embarrassing compared to not appending.

GDC 0.24, N = 120_000_000:
    append:            9.3 s
    reserve + append:  5.3 s
    allocate + assign: 1.3 s

(Also note that append in this case wastes 168 MB memory compared to 
reserve+append (*))

It is also slightly faster for small arrays, where the GC bucket 
limitation prevents in place growing from happening:

GDC 0.24, N = 512:
    append:            26  µs
    reserve + append:  23  µs
    allocate + assign: 1.7 µs

----

*) There seems to be a bug (or it might be by design) that the GC on 
appending to an array sometimes allocates up to 3x the required memory 
(to me, it seems anything above 2x is seriously overkill). The line in 
question is in internal/gc/gc.d in Phobos, where the correct line is 
commented out:

//long mult = 100 + (1000L * size) / (6 * log2plus1(newcap));
long mult = 100 + (1000L * size) / (log2plus1(newcap));

So, for example, appending one int to a 1MB int[] array, could result in 
an array with 3MB reserved space. Or appending a single int to a 74MB 
array results in a 184 MB array. Something to look out for.

-- 
Oskar