Spikes in array capacity

Ali Çehreli acehreli at yahoo.com
Fri Jul 2 07:01:11 PDT 2010


Steven Schveighoffer wrote:
 > On Fri, 02 Jul 2010 01:34:26 -0400, Ali Çehreli <acehreli at yahoo.com>
 > wrote:
 >
 >> BCS wrote:
 >>
 >>  >> There is something strange in the array capacity algorithm.
 >>  >> [...]
 >>  >> Is that intended?
 >>
 >>  > I'm going to take a guess that the gc is (after some point)
 >>  > allocating into the smallest hole with "enough" capacity.
 >>
 >> You may be right. :)
 >>
 >>  > If you re run the test
 >>  > but with something else going on to add some noise, do the spikes
 >>  > move?
 >>
 >> Yes, the spikes move.
 >>
 >>  > void makeNoise()
 >>  > {
 >>  >      static byte[][256] data;
 >>  >      data[rand() % $] = new byte[rand() % 512];
 >>  > }
 >>
 >> I am not familiar with the dmd internals; but following the code took
 >> me to the function newCapacity() in src/druntime/src/rt/lifetime.d.
 >> The capacity growth is more complicated than what I've been assuming
 >> so far.
 >>
 >> And yes, GC is in the picture. Quoting the comments from that file:
 >>
 >> /*
 >>   * Better version by Dave Fladebo:
 >>   * This uses an inverse logorithmic algorithm to pre-allocate a bit
 >>   * more space for larger arrays.
 >>   * - Arrays smaller than PAGESIZE bytes are left as-is, so for the most
 >>   * common cases, memory allocation is 1 to 1. The small overhead added
 >>   * doesn't affect small array perf. (it's virtually the same as
 >>   * current).
 >>   * - Larger arrays have some space pre-allocated.
 >>   * - As the arrays grow, the relative pre-allocated space shrinks.
 >>   * - The logorithmic algorithm allocates relatively more space for
 >>   * mid-size arrays, making it very fast for medium arrays (for
 >>   * mid-to-large arrays, this turns out to be quite a bit faster than
 >>   * the equivalent realloc() code in C, on Linux at least. Small arrays are
 >>   * just as fast as GCC).
 >>   * - Perhaps most importantly, overall memory usage and stress on the
 >>   * GC is decreased significantly for demanding environments.
 >>   */
 >>
 >> So there may not be a bug after all...
 >>
 >> Ali
 >
 > If your original code is all that was being run, I think there is a bug.

The original code was it.

 > I'll tell you why -- the GC provides a way to append pages to an
 > allocated block.  Essentially, you can glue consecutive unused pages
 > together to make a larger block, thereby growing your block.
 >
 > But anything under a page is simply reallocated because anything less
 > than a page is inside a pool of blocks of the same size, and those
 > cannot be glued together.
 >
 > So when your array is growing, and it gets larger than a page, it should
 > grow a single page at a time.  But I think (and I stress think, I'm not
 > sure) that when a block of pages is deallocated, each page becomes
 > independently free.  They do not stay glued together.  So for an
 > allocating requesting 1 or 2 more pages to grow by 2000 pages is
 > suspect.  I still have to examine the code, but I think there is a 
problem.
 >
 > Back to my original assertion -- to disprove the theory that some other
 > "larger hole" is why it gets so big, in one instance your number of
 > elements grows from 1,155,070 (4MB) to 3,153,917 (12MB), that's almost a
 > 200% increase.  In order for that 12MB hole to exist, something must
 > have allocated it before, and then deallocated it.  It can't be your
 > loop because your loop has not allocated that much space yet.  I would
 > find that very strange to happen somewhere in the runtime without you
 > specifically allocating it.  If however, your original post is not the
 > entire code, then there may be some possibility to that.
 >
 > BTW, you can take the newCapacity function out of the loop by simply
 > setting the length and then writing the last element, i.e.:
 >
 > a.length += 1;
 > a[$-1] = i;
 >
 > newCapacity isn't called when setting the length explicitly, because
 > it's not considered to be an append.

You are right.
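
For contrast, the append operator (~=) does go through newCapacity(). A 
minimal sketch of that form, purely for illustration (not the exact code 
from my first post):

import std.stdio;
import std.array;

void main()
{
    int[] a;
    size_t oldCapacity;

    foreach (i; 0 .. 100_000_000) {
        a ~= i;    // operator append: the runtime's newCapacity() is used

        if (capacity(a) != oldCapacity) {
            writeln("length: ", a.length, " capacity: ", capacity(a));
            oldCapacity = capacity(a);
        }
    }
}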

Looks like I was successful in locating the culprit. :) I took the liberty 
of writing ++a.length and also printed the number of elements that are in 
reserve at each allocation:

import std.stdio;
import std.array;

void main()
{
    int[] a;
    size_t oldCapacity;

    foreach (i; 0 .. 100_000_000) {
        // Set the length explicitly and write the last element, so the
        // runtime does not treat this as an append (newCapacity is skipped).
        ++a.length;
        a[$-1] = i;

        // Report only when the capacity changes, i.e. at each new allocation.
        if (capacity(a) != oldCapacity) {
            writeln("length: ", a.length,
                    " capacity: ", capacity(a),
                    " ratio: ", cast(double)capacity(a) / oldCapacity,
                    " reserve: ", capacity(a) - a.length);
            oldCapacity = capacity(a);
        }
    }
}

Now the spikes are gone, and the allocation frequency settles at once per 
1024 elements for an int array:

length: 1 capacity: 3 ratio: inf reserve: 2
length: 4 capacity: 7 ratio: 2.33333 reserve: 3
length: 8 capacity: 15 ratio: 2.14286 reserve: 7
length: 16 capacity: 31 ratio: 2.06667 reserve: 15
length: 32 capacity: 63 ratio: 2.03226 reserve: 31
length: 64 capacity: 127 ratio: 2.01587 reserve: 63
length: 128 capacity: 255 ratio: 2.00787 reserve: 127
length: 256 capacity: 509 ratio: 1.99608 reserve: 253
length: 512 capacity: 1021 ratio: 2.00589 reserve: 509
length: 1022 capacity: 2045 ratio: 2.00294 reserve: 1023
length: 2046 capacity: 3069 ratio: 1.50073 reserve: 1023
length: 3070 capacity: 4093 ratio: 1.33366 reserve: 1023
...

In the long run, there is one allocation per 1024 elements (1024 ints is 
4 KiB, so presumably one page at a time). That still feels like amortized 
constant as long as the block can be extended in place, but considering 
that a relocation would cost O(N), I think the reserve should still grow 
with N, though maybe not too aggressively. (I am not sure about this last 
point at all.)
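
To make that last point concrete, here is a back-of-the-envelope sketch (my 
own illustration, not runtime code). It assumes the worst case where every 
capacity increase relocates the whole array; with a fixed 1024-element 
reserve the total copying grows quadratically with the final length, while 
doubling the capacity keeps it linear:

import std.stdio;

void main()
{
    enum ulong n = 10_000_000;

    // Policy 1: grow the reserve by a fixed 1024 elements per reallocation.
    ulong copiedFixed = 0;
    ulong capFixed = 0;
    while (capFixed < n)
    {
        copiedFixed += capFixed;   // a relocation copies the current contents
        capFixed += 1024;
    }

    // Policy 2: double the capacity each time, so the reserve grows with N.
    ulong copiedDoubling = 0;
    ulong capDoubling = 1;
    while (capDoubling < n)
    {
        copiedDoubling += capDoubling;
        capDoubling *= 2;
    }

    writeln("fixed +1024 reserve: ", copiedFixed, " elements copied");
    writeln("doubling reserve:    ", copiedDoubling, " elements copied");
}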

 > -Steve

Thank you very much,
Ali

