Appender is ... slow

Jonathan M Davis via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Aug 14 14:00:54 PDT 2014


On Thursday, 14 August 2014 at 19:47:33 UTC, Brad Anderson wrote:
> On Thursday, 14 August 2014 at 19:10:18 UTC, Jonathan M Davis 
> wrote:
>> I've never really tried to benchmark it, but it was my 
>> understanding that the idea behind Appender was to use it to 
>> create the array when you do that via a lot of appending, and 
>> then you use it as a normal array and stop using Appender. It 
>> sounds like you're trying to use it as a way to manage reusing 
>> the array, and I have no idea how it works for that. But then 
>> again, I've never actually benchmarked it for just creating 
>> arrays via appending. I'd just assumed that it was faster than 
>> just using ~=, because that's what it's supposedly for. But 
>> maybe I just completely misunderstood what the point of 
>> Appender was.
>>
>> - Jonathan M Davis
>
> I too have trouble understanding what Appender does that 
> supposedly makes it faster (at least from the documentation). 
> My old, naive thought was that it was something like a linked 
> list of fixed-size arrays, so that appends didn't have to move 
> existing elements until you were done appending, at which point 
> it would bake them into a regular dynamic array, moving each 
> element only once (a std::deque with a copy into a vector, in 
> C++ terms). Looking at the code, it appears to be nothing like 
> that.
>
> Skimming the code it appears to be more focused on the much 
> more basic "~= always reallocates" performance problem. It 
> seems it boils down to doing essentially this (someone feel 
> free to correct me) in the form of an output range:
>
> auto a = /* some array */;
> auto b = a;
> a = a.array();
> for(...)
> {
>     b.assumeSafeAppend();
>     b ~= /* element */;
> }
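A compilable rendering of that sketch might look like the following (the element values and loop bounds are made up for illustration; assumeSafeAppend lives in druntime's object module, so no import is needed):

```d
void main()
{
    int[] a = [1, 2, 3];   // "some array"
    int[] b = a;           // a second slice of the same GC block

    // Without assumeSafeAppend, appending to b could reallocate,
    // because the runtime can't know that no other slice uses the
    // memory past b's end.
    foreach (element; 4 .. 7)
    {
        b.assumeSafeAppend(); // promise: nothing past b's end is live
        b ~= element;         // appends in place while capacity lasts
    }

    assert(b == [1, 2, 3, 4, 5, 6]);
}
```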

It was my understanding that that was essentially what it did, 
but I've never really studied its code, and I don't know whether 
there are any other tricks that it's able to pull. It may be 
that it really is nothing more than a wrapper which handles 
assumeSafeAppend for you correctly. But if that's the case, I 
wouldn't expect operating on arrays directly to be much faster. 
Maybe it would be _slightly_ faster, because there are no wrapper 
functions to inline away, but it wouldn't be much faster; the 
main gain would be that it ensures you use assumeSafeAppend 
correctly. If it's really as much slower as Phillippe is finding, 
then I have no idea what's going on. Certainly, it merits a bug 
report and further investigation.
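A claim like that is easy to sanity-check with a micro-benchmark along these lines (my own sketch, not the test from the thread; the element count is arbitrary, and std.datetime.stopwatch is the modern module path):

```d
import std.array : appender;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

void main()
{
    enum n = 1_000_000;

    // Plain ~= appending.
    auto sw = StopWatch(AutoStart.yes);
    int[] plain;
    foreach (i; 0 .. n)
        plain ~= i;
    immutable plainTime = sw.peek;

    // Appender-based appending via the output-range interface.
    sw.reset();
    auto app = appender!(int[]);
    foreach (i; 0 .. n)
        app.put(i);
    int[] viaAppender = app.data; // newer Phobos also accepts app[]
    immutable appTime = sw.peek;

    writefln("~=: %s  Appender: %s", plainTime, appTime);
    assert(plain.length == n && viaAppender.length == n);
}
```

Timings will of course vary with compiler, runtime, and element type, which is why measuring rather than assuming matters here.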

> (assumeSafeAppend's documentation doesn't say whether or not 
> it'll reallocate when capacity is exhausted; I assume it does.)

All assumeSafeAppend does is tell the runtime that this 
particular array is the one extending farthest into the memory 
block (i.e. that any slices which referred to memory farther 
into the block no longer exist). So, the runtime's record of the 
farthest point in the memory block that any array has referred 
to is set to the end of the array that you passed to 
assumeSafeAppend. And because it's now the last array in that 
block, it's safe to append to it without reallocating. But the 
appending then works the same way that it always does, and it'll 
still reallocate when there's no more capacity. The whole idea 
is just to make the runtime stop treating the memory after that 
array as unavailable for that array to expand into.
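That bookkeeping is easy to observe through the capacity property (the array contents here are arbitrary):

```d
void main()
{
    int[] arr  = [1, 2, 3, 4, 5];
    int[] head = arr[0 .. 2];

    // head stops short of the block's used length, so the runtime
    // reports no capacity: appending to head would reallocate.
    assert(head.capacity == 0);

    // Promise that nothing past head's end is still in use.
    head.assumeSafeAppend();
    assert(head.capacity >= head.length);

    // Now the append happens in place, stomping arr[2].
    head ~= 42;
    assert(arr[2] == 42);
}
```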

- Jonathan M Davis

