assumeSafeAppend inconsistent when multiple slices in use

simendsjo simen.endsjo at pandavre.com
Thu Apr 7 12:22:10 PDT 2011


On 05.04.2011 22:04, Steven Schveighoffer wrote:
> On Tue, 05 Apr 2011 15:24:46 -0400, simendsjo <simen.endsjo at pandavre.com>
> wrote:
>
>> I don't know if this is an actual problem, but I don't understand the
>> behavior.
>> When one slice calls assumeSafeAppend, both slices are "given control",
>> that is, both get the parent's capacity.
>
> You have to stop thinking of array slices as unique :)


Ok, done.


> If you look at the dissection of a slice, it is a pointer and length.
> That is all. So if b.ptr == c.ptr && b.length == c.length, then both
> slices not only are the same, but are identical. That is, the runtime
> cannot tell the difference.
>
> It might be easier to think about it this way. First, there are no
> arrays in the traditional definition, only slices. Every array slice
> which references the heap is backed by a memory block. The block has a
> fixed size (e.g. 16 bytes, 32 bytes, 64 bytes, etc.). However, encoded
> in the block itself is the 'used length', which is basically the number
> of bytes in the block which are actually being used. If you took a slice
> of the block from the beginning of the block and with length equivalent
> to the 'used length', we can call this the 'master slice'. The array
> appending code allows appending ONLY when the slice being appended ends
> at the same point that the master slice ends (and when there is space
> available).
>
> all assumeSafeAppend does is adjust that "used length" so it ends at the
> same point as the given slice. So it's not actually blessing anything
> about the specific slice you use to call it with, it's just adjusting
> the 'master slice' to make it so the array you give will be appendable.
>

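To check that I've understood the "used length" adjustment, here's a small sketch I would expect to pass (assuming capacity reports 0 whenever a slice doesn't end at the block's used length):

```d
void main()
{
    int[] a = [1, 2, 3, 4];
    int[] b = a[0 .. 2];

    assert(a.capacity > 0);  // a ends at the 'used length'
    assert(b.capacity == 0); // b ends before it

    // Move the block's 'used length' back so it ends at b's end.
    b.assumeSafeAppend();
    assert(b.capacity > 0);

    // a now extends past the used length, so it can no longer
    // append in place; a[2 .. $] is considered unused.
    assert(a.capacity == 0);
}
```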
(snip)

>
> If I have thoroughly confused you now, please ask more questions :)

A very thorough explanation, which changes my view of arrays. I thought 
slices, array stomping protection and reallocation had very complicated 
rules (and perhaps they do under the hood?).

I tried to look at assumeSafeAppend's code, and although I didn't 
understand everything, I think that together with your great explanation 
I got it, but let's see...
I'm having some trouble explaining it, as I'm not sure about the correct 
terminology (I should have drawn some pictures...), but I'll try.

The runtime doesn't track the "original" array and the slices (for this 
example at least - it obviously does for the sake of the GC).
Internally it keeps an address for the current endpoint in the chunk of 
memory, like array.ptr + array.length.
Any slice that ends at this address is allowed to append into the memory 
without reallocating, as it ends at the end of the used region. But 
whenever a slice that's allowed to append changes its length, the 
endpoint address for the chunk of memory changes, and so the other 
slices now end at an earlier element and cannot append.
Any slice not ending at the endpoint will return capacity == 0.
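If that's right, this should hold (a sketch; note that the runtime can't tell which slice is "the array", so every slice ending at the endpoint gets a capacity):

```d
void main()
{
    int[] a = [1, 2, 3];
    int[] b = a[0 .. 2];  // ends before the block's endpoint
    int[] c = a[1 .. $];  // ends exactly at the endpoint

    // b does not end at the 'used length', so it cannot
    // append in place.
    assert(b.capacity == 0);

    // a and c both end at the endpoint, so both report
    // a non-zero capacity.
    assert(a.capacity > 0);
    assert(c.capacity > 0);
}
```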

There is one thing I don't understand though..
 > However, a does not end on the master slice, so its capacity is 0. Note
 > that technically using a[1..$] is undefined at this point since the
 > runtime considers those elements 'unused'.

What do you mean by undefined in this context? Do you mean because it 
can be overwritten like this?

int[] a = [1,2];
int[] b = a[0..1];
b.assumeSafeAppend();
b.length += 1;
assert(a == [1,0]); // a[1] overwritten: int.init is assigned to the new element

Or is it because it's outside what the runtime considers part of the 
used array, and the runtime is free to do whatever it wants with it?
Or does the GC not track slices with an endpoint greater than the 
current endpoint, so it might return the memory to the OS?


More information about the Digitalmars-d-learn mailing list