dynamic array capacity

Wed Dec 29 12:33:09 PST 2010

On Wed, 29 Dec 2010 14:49:35 -0500, spir <denis.spir at gmail.com> wrote:

> On Wed, 29 Dec 2010 13:48:48 -0500
> "Steven Schveighoffer" <schveiguy at yahoo.com> wrote:
>
>> On Wed, 29 Dec 2010 13:14:29 -0500, spir <denis.spir at gmail.com> wrote:
>>
>> > I've done some timings using reserve and Appender. Seems not to work  
>> on
>> > my use case (decomposition of a string [actually a sequence of code
>> > points] according to NFD). (see sample code below)
>> > * use reserve (to source string's length) with builtin append  
>> (operator
>> > '~=') --> 20% slower
>> > * use Appender w/o reserve --> 3 times slower
>> > * user Appender + its own reserve --> 1.5 times slower (i.e. divide
>> > above time per 2)
>>
>> What is the baseline for this?  I.e. what is it 20% slower than? FWIW,
>> Appender should be much faster than builtin append, even without  
>> reserve.
>
> The baseline is builtin append (~=) without reserving. (Sorry, thought  
> it was clear.)

Then I would be suspect of your test.  In simple tests I just ran,  
reserving outperforms normal appending (though not by a huge amount).  If  
it's *slower*, then there's something else wrong, as the worst case is it  
degenerates to non-reserved appending, since the same code paths are  
followed.

>
>> However, Appender has a recently fixed bug (not fixed in 2.051) where
>> appending *arrays* of elements goes very slow.  I see you are doing that
>> in a couple spots.
>
> As explained below, I'm doing _only_ that. So (as guessed), seems it may  
> be the reason why Appender performs so badly in my case.

Yes, most likely.  I mean it was like 10x slower in tests I ran.

Try this version of std.array (shouldn't need to recompile phobos, it's a  
template):  
http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/array.d?rev=2238&format=raw

>
>> > I'm surprised that reserve does not speed up builtin appending, since  
>> it
>> > can only avoid numerous reallocations. How to interpret that?
>
> * This question remains. ??? *

I would lean towards pilot error (sorry), since simple tests show reserve  
does speed things up.  I don't have a lot of time to look into your code  
in depth.  If you can show a simple tests that proves otherwise, I'd be  
happy to look at it.

>> Simple tests I run prove that this is true.  Recent developments in how
>> the appending array grows mitigate this quite a bit, but it certainly  
>> will
>> result in less memory being consumed, and it always runs faster.
>
> I can upload the modified version using reserve & Appender (with the  
> timing module) if you like. (The original one is at  
> https://bitbucket.org/denispir/denispir-d/src, files pile.d and  
> chrono.d.)

I use the 'time' command to run benchmarks usually because I've been  
burned in the past with using custom-built timing mechanisms that have  
flaws in them.

> Anyway, why reserve does not help builtin append is a mystery. In the  
> worst case, it should not trouble.

I'll augment that and say "in your application", since it seems to perform  
expectedly in my tests.

> By the way, how comes I do not need to import anything to be able to use  
> reserve (like if it were a builtin prop)?

It's in object.di, always imported.

I'm wondering, if those should be added to the 'properties' page of the  
spec.

> About Appender, I have had a look at the code in std.array and could not  
> find any sign of why -- except that appending a slice loops over its  
> elements --calling put() for each one--, but this should not be _that_  
> slow, I guess. (Or should it?) Maybe a specialisation of put to take an  
> array (or slice) would help? (Instead of using an array interface, as is  
> done now.)
> The details of realloc are beyong my competence (e.g. the computation of  
> resizing ratio), so I can hardly search further.

The issue is that appending was not amortized when appending arrays vs.  
appending individual elements (where appending *is* amortized).  In order  
to achieve the appearance of O(1) appending, you must grow the array  
proportionally instead of linearly.

In simple terms, you want to essentially double the capacity whenever you  
realloc.  The actual aglorithm uses a logarithmic growth curve that  
settles around 1.4x.

The flawed appender code was just adding enough space to accommodate the  
data being appended, but only if an array was appended (appending single  
elements was using the growth curve).  Not only that, but it wasn't taking  
advantage of the ability to extend into consecutive pages without having  
to move data around.

-Steve