Builtin array and AA efficiency questions

Fri Oct 16 09:03:31 PDT 2015

On 10/15/15 5:48 PM, Random D user wrote:
> Ah missed your post before replying to H.S. Teoh (I should refresh more
> often).
> Thanks for reply.

You're welcome!

>
> On Thursday, 15 October 2015 at 19:50:27 UTC, Steven Schveighoffer wrote:
>>
>> Without more context, I would say no. assumeSafeAppend is an
>> assumption, and therefore unsafe. If you don't know what is passed in,
>> you could potentially clobber data.
>>
>> In addition, assumeSafeAppend is a non-inlineable, runtime function
>> that can *potentially* be low-performing.
>
> Yeah I know that I want to overwrite the data, but still that's probably
> a lot of calls to assumeSafeAppend. So I agree.

Even if you know that you want to overwrite the data, you may not want 
to overwrite the data *after* the data :)

A difficult "correct" mechanism is to be handed a buffer to utilize 
first, and then if more buffer is needed, appending into the block or 
beyond. Because the array slice doesn't contain enough information, you 
need something else. Something with both capacity and length.

A typical mechanism may be to pass a stack-allocated buffer that is 
"good enough" for most cases, but then is reallocated onto the GC if you 
need more. This is not easy to make work automatically (would be a nice 
addition to phobos to have something like this). Generally, I end up 
with some inner functions which handle the details.

> What does marked for appending mean. How does it happen or how is it
> marked?

See Mike's post and my reply

> So assumeSafeAppend is only useful when I have array whose length is set
> to lower than it was originally and I want to grow it back (that is
> arr.length += 1 or arr ~= 1).

Right, an array *by default* supports appending in place when it knows 
that the data in the block beyond the current array is unused. The 
assumeSafeAppend is saying "yes, I know that appending will clobber data 
beyond the array if you append in place, but it's safe because I don't 
care about that data any more". The runtime doesn't know that data isn't 
important until you tell it.

> Related to my question above.
> How do you get a block not marked for appending? a view slice?

The terms are confusing. A slice is not a block, it's a view of a block. 
So the block itself is marked as appendable (Technically, this is 
because the "used" field of the block is stored within the block. If I 
didn't have the flag, I may be reading a field that is garbage as the 
"used" size, and potentially make bad decisions). The only way you 
should get a block with an APPENDABLE flag set is via an array 
allocation, which correctly initializes the block.

> Perhaps I should re-read the slice article. I believe it had something
> like capacity == 0 --> always allocates. Is it this?

capacity == 0 means any append to that slice will allocate. But this is 
not the flag :) It means either a) the slice does not point at an 
appendable GC block, or b) it does point at such a block, but there is 
used data after the slice.

the b) situation can be changed by calling assumeSafeAppend (only if you 
are sure the data after isn't valuable).

>> So when to call really sort of requires understanding what the runtime
>> does. Note it is always safe to just never use assumeSafeAppend, it is
>> an optimization. You can always append to anything (even non-GC array
>> slices) and it will work properly.
>
> Out of curiosity. How does this work? Does it always just reallocate
> with gc if it's allocated with something else?

Yes.

> Usually I just want to avoid throwing away memory that I already have.
> Which is slow if it's all over your codebase.
> Like re-reading or recomputing variables that you already have.
> One doesn't hurt but a hundred does.

Right, this is a very useful mechanism, and can save performance cost 
quite a bit.

-Steve