Builtin array and AA efficiency questions

Thu Oct 15 12:50:27 PDT 2015

On 10/15/15 12:47 PM, Random D user wrote:
> So I was doing some optimizations and I came up with a couple of basic
> questions...
>
> A)
> What does assumeSafeAppend actually do?
> A.1) Should I always call it before setting length if I want to have
> assumeSafeAppend semantics? (e.g. I don't know if it's called just
> before the function I'm in)

Without more context, I would say no. assumeSafeAppend is an assumption, 
and therefore unsafe. If you don't know what is passed in, you could 
potentially clobber data.

In addition, assumeSafeAppend is a non-inlineable, runtime function that 
can *potentially* be low-performing. If, for instance, you call it on a 
non-GC array, or one that is not marked for appending, you will most 
certainly need to take the GC lock and search through the heap for your 
block.

The best place to call assumeSafeAppend is when you are sure the array
has "shrunk" and you are about to append. If you have not shrunk the
array, then the call is a waste; if you are not sure what the array
contains, then you are potentially stomping on referenced data.

Calling it just after shrinking every time is possible, but could 
potentially be sub-optimal, if you don't intend to append to that array 
again, or you intend to shrink it again before appending.
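
To make the "stomping" concrete, here is a minimal sketch. The function
and variable names are made up for illustration, and the asserts reflect
my understanding of the current runtime behavior rather than a
guarantee:

```d
import std.stdio : writeln;

/// Shrinks a slice, calls assumeSafeAppend, appends, and reports whether
/// the older, longer slice saw its element overwritten.
bool appendStompsOldSlice()
{
    int[] whole = new int[](4);   // GC block marked for appending
    whole[] = [1, 2, 3, 4];
    int[] arr = whole[0 .. 2];    // "shrunk" view of the same block

    assert(arr.capacity == 0);    // appending now would reallocate
    arr.assumeSafeAppend();       // assumption: nobody needs whole[2 .. $]
    assert(arr.capacity >= 4);    // the block can be reused in place

    arr ~= 10;                    // written where whole[2] lives
    return arr.ptr is whole.ptr && whole[2] == 10;
}

void main()
{
    writeln(appendStompsOldSlice());
}
```

If `whole` were still in use elsewhere, that overwrite is exactly the
clobbering to worry about.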

> A.2) Or does it mark the array/slice itself as a "safe append" array?
> And I can call it once.

An array uses a block marked for appending; assumeSafeAppend simply sets
how much data in that block is assumed to be valid. Calling
assumeSafeAppend on a block not marked for appending will do nothing
except burn CPU cycles.

So yours is not an accurate description.

> A.3) If A.2 is true, are there any conditions that it reverts to
> original behavior? (e.g. if I take a new slice of that array)

Any time data is appended, every reference *besides* the one used for
the append will reallocate on its next append. Any time a reference is
shrunk (i.e. arr = arr[0..$-1]), that reference will likewise reallocate
on its next append.
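
A small sketch of both cases, using capacity (which reports 0 when an
append would have to reallocate) to observe the state. The helper names
are hypothetical, and the reserve call is only there so the first append
has room to happen in place:

```d
/// Returns true if, after `a` is appended to in place, the sibling
/// slice `b` is forced to reallocate on its own append.
bool siblingReallocates()
{
    int[] a;
    a.reserve(8);            // make sure the block has room to grow
    a ~= [1, 2, 3];
    int[] b = a;             // second reference to the same data

    a ~= 4;                  // extends in place
    assert(a.ptr is b.ptr);
    assert(b.capacity == 0); // b no longer ends at the used end

    b ~= 5;                  // copies into a new block
    return b.ptr !is a.ptr && a == [1, 2, 3, 4] && b == [1, 2, 3, 5];
}

/// Returns true if a shrunk slice reports no in-place capacity.
bool shrunkHasNoCapacity()
{
    int[] arr = new int[](4);
    arr = arr[0 .. $ - 1];
    return arr.capacity == 0;
}

void main()
{
    assert(siblingReallocates());
    assert(shrunkHasNoCapacity());
}
```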

So knowing when to call it really requires understanding what the
runtime does. Note that it is always safe to never use
assumeSafeAppend; it is only an optimization. You can always append to
anything (even non-GC array slices) and it will work properly.
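
As a quick illustration of the non-GC case, here is a sketch that
appends to a slice of a stack array (names are illustrative only). The
append copies the data into a fresh GC block and leaves the original
memory alone:

```d
/// Returns true if appending to a slice of stack memory moved the data
/// to a new block and left the stack array untouched.
bool appendCopiesOffStack()
{
    int[4] stackBuf = [1, 2, 3, 4];
    int[] s = stackBuf[];        // slice of non-GC memory
    assert(s.capacity == 0);     // cannot be extended in place

    s ~= 5;                      // allocates on the GC heap and copies
    return s.ptr !is stackBuf.ptr
        && s == [1, 2, 3, 4, 5]
        && stackBuf == [1, 2, 3, 4];
}

void main()
{
    assert(appendCopiesOffStack());
}
```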

> I read the array/slice article, but it seems that I still can't use them
> with confidence that it actually does what I want. I also tried looking
> into lifetime.d, but there are so many potential entry/exit/branch paths
> that without case by case debugging (and no debug symbols for phobos.lib)
> it's a bit too much.

I recommend NOT trying to understand lifetime.d; it's very complex, and
the entry points are mostly defined by the compiler. I had to use trial
and error to understand what happened when.

> What I'm trying to do is a reused buffer which only grows in capacity
> (and I want to overwrite all data). Preferably I'd manage the current
> active size of the buffer as array.length.
>
> For a buffer typical pattern is:
> array.length = 100
> ....
> array.length = 0
> ....
> some appends
> ....
> array.length = 50
> ....
> etc.

This is an easy call then:

array.reserve(100); // reserve 100 elements for appending
// automatically manages array length for you; if length exceeds 100,
// it just automatically reallocates more data.
array ~= data;
array.length = 0; // clear all the data
// NOW is the best time to call, because you can't shrink it any more,
// and you know you will be appending again.
array.assumeSafeAppend;
array ~= data; // no reallocation, unless previous max size was exceeded.
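
Put together as something you can compile, the pattern might look like
the sketch below. The function name, the ubyte element type, and the
round/fill counts are arbitrary choices for illustration. It checks that
the buffer pointer never changes while each refill stays within the
reserved size:

```d
/// Reuses one buffer across several fill/clear rounds and returns true
/// if no round caused a reallocation.
bool bufferStaysInPlace()
{
    ubyte[] buf;
    buf.reserve(100);              // room for at least 100 elements
    auto base = buf.ptr;

    foreach (round; 0 .. 3)
    {
        buf.length = 0;            // clear all the data, keep the block
        buf.assumeSafeAppend();    // old contents may be overwritten
        foreach (i; 0 .. 50)
            buf ~= cast(ubyte) i;  // stays within reserved capacity
        if (buf.ptr !is base)
            return false;          // a reallocation happened
    }
    return buf.length == 50;
}

void main()
{
    assert(bufferStaysInPlace());
}
```

Dropping the assumeSafeAppend call should still give correct results,
just with a new allocation on the first append of every round after the
first.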

> B.1) I have a temporary AA whose lifetime is limited to a known span
> (might be a function or a loop with a couple of functions). Is there a
> way to tell the runtime to immediately destroy and free the AA?

There isn't. This reminds me, I have a lingering PR to add aa.clear 
which destroys all the elements, but was waiting until object.clear had 
been removed for the right amount of time. Perhaps it's time to revive that.

-Steve