Array append performance

Fri Aug 22 18:56:46 PDT 2008

I just ran into this yesterday porting C++ code that used push_back() heavily to
D.  Ended up rewriting the inner loops to pre-allocate.  The D code is now
actually faster than the C++ code, but the inner loops are less readable.  For a
whopping overhead of 4 extra bytes per array (on 32-bit), why not include the
capacity field in the array itself?  I think the problem here is the need to call
gc.capacity(), which can require a lock, as Sean said.  If you have the capacity
field embedded in the array itself, you never need a lock for ~= unless the array
actually needs to be realloced or is appended to by more than one thread.  I've
written classes before that need fast appending and therefore cache the capacity
value every time the array is reallocated, and it makes it a ton faster even in
single threaded code.  On the other hand, this breaks the ABI for D arrays, but D2
breaks a lot of backwards compatibility anyhow, so maybe this is still a good
idea.  Does anyone actually write C functions that rely on this ABI and then call
them from D?  Furthermore, having a capacity field would also allow reserve for
builtin arrays.

Other than the 4 bytes of extra overhead, does anyone see any downside to
including a capacity field?  Also, is there some situation I'm not thinking of
where the 4 bytes actually matter more than the speed gains in question?