Appender is ... slow

Philippe Sigaud via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Aug 15 01:27:17 PDT 2014


> Quick test...

Ah, thanks a lot Jonathan. I kept telling myself I should test it
on a simple case.
OK, the good news is that Appender works in these cases (I mean, that's
good news for Phobos).
Now I just have to find out why it's slower in my case :)



> ----------------
> import std.array;
> import std.datetime;
> import std.stdio;
>
> enum size = 1000;
>
> void test1()
> {
>     auto arr = appender!(int[])();
>     foreach(n; 0 .. size)
>         arr.put(n);
> }


I used ~= to append to an Appender. The examples use .put, but ~= is
documented, so I assumed the operator was just syntactic sugar. I
didn't look at its implementation.
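(For what it's worth, here's a minimal sketch, my own, not from Jonathan's
benchmark, checking that the two forms at least produce the same result;
the function name appendBoth is mine:)

```d
import std.array : appender;

// Append once with the documented primitive and once with the
// operator form, then return the accumulated array.
int[] appendBoth()
{
    auto a = appender!(int[])();
    a.put(1);   // documented primitive
    a ~= 2;     // operator form, documented for Appender
    return a.data;
}
```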


> void test2()
> {
>     int[] arr;
>     foreach(n; 0 .. size)
>         arr ~= n;
> }
>
> void test3()
> {
>     auto arr = new int[](size);
>     foreach(n; 0 .. size)
>         arr[n] = n;
> }

This one is simple and interesting. In my case I don't know the length
in advance (I'm doing some parsing, and predicting the size of the
parse forest from the input length is somewhat tricky), but I can
pre-allocate some of it based on a simple heuristic.
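(Something like the following sketch, where the factor of 2 and the
function name makeNodes are purely hypothetical placeholders for my
real heuristic: over-allocate up front, fill, then slice down to the
used length so no reallocation happens while appending:)

```d
// Hypothetical heuristic pre-allocation: guess an upper bound on the
// number of nodes from the input length, fill in place, slice to fit.
int[] makeNodes(size_t inputLength)
{
    auto arr = new int[](2 * inputLength); // heuristic over-estimate
    size_t used;
    foreach (n; 0 .. inputLength)
        arr[used++] = cast(int) n;         // stand-in for real node creation
    return arr[0 .. used];                 // keep only what was filled
}
```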

> void test4()
> {
>     auto arr = uninitializedArray!(int[])(size);
>     foreach(n; 0 .. size)
>         arr[n] = n;
> }

Oh, I didn't know this one.


> With size being 1000, I get
>
> 178 ms, 229 μs, and 4 hnsecs
> 321 ms, 272 μs, and 8 hnsecs
> 27 ms, 138 μs, and 7 hnsecs
> 24 ms, 970 μs, and 9 hnsecs

Same here, good.
Using ~= n instead of .put(n) gives me results that are consistently
slightly slower for Appender (170 ms -> 190 ms, repeatedly, even going
back and forth between the two possibilities).
I created a test1prime to test that.
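(test1prime is just test1 with the operator form; returning the length
here is my own addition so the sketch has something checkable:)

```d
import std.array : appender;

// Same as test1 above, but using ~= instead of .put;
// this is the variant that measured consistently slower.
size_t test1prime()
{
    enum size = 1000;
    auto arr = appender!(int[])();
    foreach (n; 0 .. size)
        arr ~= n;
    return arr.data.length;
}
```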


> With size being 100,000, I get
>
> 15 secs, 731 ms, 499 μs, and 1 hnsec
> 29 secs, 339 ms, 553 μs, and 8 hnsecs
> 2 secs, 602 ms, 385 μs, and 2 hnsecs
> 2 secs, 409 ms, 448 μs, and 9 hnsecs

Ditto. OK, that's good. It's also still slower using ~= n instead of
Appender.put (18 s instead of 15 s, about 20% slower).
So, kids: don't do that.

> So, it looks like using Appender to create an array of ints is about twice
> as fast as appending to the array directly, and unsurprisingly, creating the
> array at the correct size up front and filling in the values is far faster
> than either, whether the array's elements are default-initialized or not.
> And the numbers are about the same if it's an array of char rather than an
> array of int.

Perfect. That also means my go-to method will still be standard
arrays, but with pre-allocation.
I feel a bit silly writing that, because it's obvious that reserving
space up front should give better speed.


> Also, curiously, if I add a test which is the same as test2 (so it's just
> appending to the array) except that it calls reserve on the array with size,
> the result is almost the same as not reserving. It's a smidgeon faster but
> probably not enough to matter. So, it looks like reserve doesn't buy you
> much for some reason. Maybe the extra work for checking whether it needs to
> reallocate or whatever fancy logic it has to do in ~= dwarfs the extra cost
> of the reallocations? That certainly seems odd to me, but bizarrely, the
> evidence does seem to say that reserving doesn't help much. That should
> probably be looked into.

Yeah, I get a small boost of about 5% compared to not reserving at
size 1000, which completely disappears for longer arrays (no
difference at size 100_000).
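(My reconstruction of the reserve variant Jonathan describes, i.e.
test2 with capacity requested up front; the name test2reserve is mine:)

```d
// Same as test2, but asking the runtime for capacity before appending.
// With reserve honored, the ~= loop should not reallocate until the
// reserved capacity is exceeded.
int[] test2reserve(size_t size)
{
    int[] arr;
    arr.reserve(size);      // request capacity up front
    foreach (n; 0 .. size)
        arr ~= cast(int) n;
    return arr;
}
```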


> In any case, from what I can see, if all you're doing is creating an array
> and then throwing away the Appender, it's faster to use Appender than to
> directly append to the array.

Indeed.

> Maybe that changes with structs? I don't know.

I'm using classes, because what I'm pushing into the Appender are
graph nodes, and I got fed up with passing pointers to structs around
everywhere. Maybe I should switch back to structs and see what that
does.
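(The difference in what actually gets appended is worth keeping in
mind; NodeS/NodeC below are hypothetical stand-ins for my node types:)

```d
import std.array : appender;

struct NodeS { int id; }                    // appended by value (copied)
class NodeC { int id; this(int i) { id = i; } } // only the reference is appended

// Fill one Appender of each kind; with classes, every node is a
// separate GC allocation on top of the Appender's own buffer.
size_t[2] fillBoth(size_t count)
{
    auto sApp = appender!(NodeS[])();
    auto cApp = appender!(NodeC[])();
    foreach (i; 0 .. count)
    {
        sApp.put(NodeS(cast(int) i));
        cApp.put(new NodeC(cast(int) i));
    }
    return [sApp.data.length, cApp.data.length];
}
```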


> It would be interesting to know what's different about your code that's
> causing Appender to be slower, but from what I can see, it's definitely
> faster to use Appender unless you know the target size of the array up
> front.

Good conclusion. Thanks for your help. My takeaway is that I'll use
plain arrays, but 'new T[](size)' them beforehand. If that makes heavy
concatenation 10 times faster, it should have a positive effect
(though of course I'm not expecting anything near a 10x boost in my
total computation time).


