Spikes in array capacity
Steven Schveighoffer
schveiguy at yahoo.com
Fri Jul 2 08:21:13 PDT 2010
On Fri, 02 Jul 2010 10:01:11 -0400, Ali Çehreli <acehreli at yahoo.com> wrote:
> Steven Schveighoffer wrote:
> > newCapacity isn't called when setting the length explicitly, because
> > it's not considered to be an append.
>
> You are right.
>
> Looks like I was successful in locating the culprit. :) I took the
> liberty of writing ++a.length and also printed the number of elements
> that are in reserve at each allocation:
>
> import std.stdio;
> import std.array;
>
> void main()
> {
>     int[] a;
>     size_t oldCapacity;
>
>     foreach (i; 0 .. 100_000_000) {
>         ++a.length;
>         a[$-1] = i;
>
>         if (capacity(a) != oldCapacity) {
>             writeln("length: ", a.length,
>                     " capacity: ", capacity(a),
>                     " ratio: ", cast(double)capacity(a) / oldCapacity,
>                     " reserve: ", capacity(a) - a.length);
>             oldCapacity = capacity(a);
>         }
>     }
> }
>
> Now the spikes are gone, and allocations settle at a rate of one per
> 1024 elements for an int array:
>
> length: 1 capacity: 3 ratio: inf reserve: 2
> length: 4 capacity: 7 ratio: 2.33333 reserve: 3
> length: 8 capacity: 15 ratio: 2.14286 reserve: 7
> length: 16 capacity: 31 ratio: 2.06667 reserve: 15
> length: 32 capacity: 63 ratio: 2.03226 reserve: 31
> length: 64 capacity: 127 ratio: 2.01587 reserve: 63
> length: 128 capacity: 255 ratio: 2.00787 reserve: 127
> length: 256 capacity: 509 ratio: 1.99608 reserve: 253
> length: 512 capacity: 1021 ratio: 2.00589 reserve: 509
> length: 1022 capacity: 2045 ratio: 2.00294 reserve: 1023
> length: 2046 capacity: 3069 ratio: 1.50073 reserve: 1023
> length: 3070 capacity: 4093 ratio: 1.33366 reserve: 1023
> ...
>
> In the long run, there is one allocation per 1024 elements. That still
> feels like amortized constant time, but since each relocation costs
> O(N), I think the reserve should still grow with N, just maybe not too
> much. (I am not sure about this last point at all.)
>
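To make the two code paths concrete:

void main()
{
    int[] a;
    a.length = 1; // explicit length set: newCapacity is not called
    a ~= 42;      // append: the new block size goes through newCapacity
}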
About the 1024-element reserve: 4 bytes per int x 1024 elements = 4096
bytes == page size.
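To put rough numbers on what page-at-a-time growth costs, here is a
quick simulation sketch. It assumes every reallocation actually copies
the block, which is pessimistic since the GC can often extend in place:

import std.stdio;

void main()
{
    enum N = 1_000_000;

    // Fixed reserve: capacity grows by 1024 elements per reallocation,
    // like the page-sized steps in the output above.
    size_t fixedCopies, fixedCap;
    foreach (size_t n; 1 .. N + 1)
        if (n > fixedCap) { fixedCopies += n - 1; fixedCap += 1024; }

    // Proportional growth: capacity doubles on each reallocation.
    size_t geomCopies, geomCap = 1;
    foreach (size_t n; 1 .. N + 1)
        if (n > geomCap) { geomCopies += n - 1; geomCap *= 2; }

    writeln("fixed 1024 reserve copies: ", fixedCopies); // ~N^2/2048
    writeln("doubling copies:           ", geomCopies);  // ~N
}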
The newCapacity function is supposed to reduce the effect of this
phenomenon: when appending, it's assumed you are going to continue
appending, so to reduce the overhead, the amount added to the memory
block is supposed to be related to the total number of bytes already
allocated. It doesn't seem like the function is working very well... It
could have been something I did, or it could have always been broken :)
I honestly did not examine the code at all; I only ever read the
comments, assuming it works.
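From the comments, the intent is roughly something like this (just a
sketch of the idea, with a made-up 50% factor; not the actual
newCapacity code):

// Sketch only: grow by a fraction of what is already allocated, so
// reallocations get rarer as the array grows. The 50% factor is
// invented for illustration.
size_t growCapacity(size_t neededBytes, size_t allocatedBytes)
{
    // Propose 50% more than the current allocation...
    size_t grown = allocatedBytes + allocatedBytes / 2;
    // ...but never less than what the append actually needs.
    return neededBytes > grown ? neededBytes : grown;
}

With any proportional policy like that, the number of reallocations
grows with the log of the final size instead of linearly, which is what
keeps appending amortized O(1).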
Please add these comments to the bug report; they will help when I look
at it. I don't have a lot of time to look at it right now, so I'm
afraid I'll forget what we were doing later :)
-Steve