assumeSafeAppend inconsistent when multiple slices in use

Tue Apr 5 13:04:22 PDT 2011

On Tue, 05 Apr 2011 15:24:46 -0400, simendsjo <simen.endsjo at pandavre.com>
wrote:

> I don't know if this is an actual problem, but I don't understand the
> behavior.
> When one slice calls assumeSafeAppend, both slices is "given control",
> that is, gets the parents capacity.

You have to stop thinking of array slices as unique :)

If you look at the dissection of a slice, it is a pointer and length.   
That is all.  So if b.ptr == c.ptr && b.length == c.length, then both  
slices not only are the same, but are identical.  That is, the runtime  
cannot tell the difference.

It might be easier to think about it this way.  First, there are no arrays  
in the traditional definition, only slices.  Every array slice which  
references the heap is backed by a memory block.  The block has a fixed  
size (e.g. 16 bytes, 32 bytes, 64 bytes, etc.).  However, encoded in the  
block itself is the 'used length', which is basically the number of bytes  
in the block which are  actually being used.  If you took a slice of the  
block from the beginning of the block and with length equivalent to the  
'used length', we can call this the 'master slice'.  The array appending  
code allows appending ONLY when the slice being appended ends at the same  
point that the master slice ends (and when there is space available).

all assumeSafeAppend does is adjust that "used length" so it ends at the  
same point as the given slice.  So it's not actually blessing anything  
about the specific slice you use to call it with, it's just adjusting the  
'master slice' to make it so the array you give will be appendable.

Keeping all this in mind, let's go through your example to see how  
assumeSafeAppend actually works

> 	int[] a;
> 	a.length = 4;

At this point, the array allocates a block large enough to hold 4  
integers.  You'd think that was 16 bytes, but we need to have a byte to  
store the 'used length', so this actually turns out to be 32 bytes (of  
which 31 can be used to store data).  So a.capacity should be 7.

So what we see here is, a points to the beginning of the block, it's  
length is 4, and the master slice's length is 16 (remember the master  
slice is always in bytes).  For simplicity's sake, we'll assume the master  
slice is an int[] slice, so we'll call the length 4.

>
> 	int[] b = a[0..1];
> 	int[] c = a[0..1];

each of these slices' lengths are 1, which mean they don't end at the same  
point as the master slice.  Therefore, they are not appendable in-place  
(capacity of 0).

>
> 	// appending to b or c reallocates
> 	assert(a.capacity >  0);
> 	assert(b.capacity == 0);
> 	assert(c.capacity == 0);

ok so far.

>
> 	// gives a's capacity to b AND c
> 	b.assumeSafeAppend();

This now sets the master slice to be the same length as b (which is 1).

> 	assert(a.capacity == 0);
> 	assert(b.capacity >  0);
> 	assert(c.capacity >  0); // why does c get capacity?

So now, even though you called assumeSafeAppend on b, both b and c end at  
the same place as the master slice.  So they are both appendable.  The  
runtime cannot tell the difference between b and c, so calling capacity  
with either should be absolutely equivalent.

However, a does not end on the master slice, so its capacity is 0.  Note  
that technically using a[1..$] is undefined at this point since the  
runtime considers those elements 'unused'.

>
> 	// .. until one of the slices modifies. Just changing
> assumeSafeAppend does not change this behavior
> 	b ~= 1;

This increments the master slice's length by 1 and sets the value of the  
new element to 1.

> 	// now b has full control
> 	assert(a.capacity == 0);
> 	assert(b.capacity >  0);
> 	assert(c.capacity == 0);

Since b's length still matches the master slice's length (2), it's still  
appendable.  However, a's length is still 4 and c's length is still 1, so  
neither is appendable.

>
> 	// c full control
> 	c.assumeSafeAppend();
> 	assert(a.capacity == 0);
> 	assert(b.capacity == 0);
> 	assert(c.capacity >  0);

Sets the master slice length to 1.  a's length is 4, b's length is 2, so  
neither are appendable.

>
> 	// a full control
> 	a.assumeSafeAppend();
> 	assert(a.capacity >  0);
> 	assert(b.capacity == 0);
> 	assert(c.capacity == 0);

Sets the master slice length to 4.  b's length is 2, c's length is 1, so  
neither are appendable.

>
> 	// but now ONLY b gets full control. Not both b and c as the
> first assumeSafeAppend
> 	b.assumeSafeAppend();
> 	assert(a.capacity == 0);
> 	assert(b.capacity >  0);
> 	assert(c.capacity == 0);

Right, because b and c are no longer identical.  The call to  
assumeSafeAppend sets the master slice's length to match b's length of 2.   
So now, a and c, which don't have a length of 2, are now not appendable.

Note, if you did b ~= 1; b ~=1; then the master slice's length would equal  
a's length, so a would then become appendable.

There is one more aspect that you have not covered, and that is slices  
which do not *start* at the block front.  Remember, I said any slice that  
*ends* at the same place that the master slice ends.  Slices do not  
necessarily need to begin at the beginning of the block to be appendable.   
All the runtime has to prove is that appending will only expand into  
unused data.

So this is also expected:

int[] a;
a.length = 4;
b = a[2..$];

assert(a.capacity > 0);
assert(b.capacity > 0);

If I have thoroughly confused you now, please ask more questions :)

-Steve