Enhanced array appending

Wed Dec 23 06:53:07 PST 2009

Steven Schveighoffer wrote:
> grauzone Wrote:
> 
>> Steven Schveighoffer wrote:
>>> All,
>>>
>>> I created a new patched druntime that prevents array stomping and at the 
>>> same time increases append performance for thread-local array appending.
>>>
>>> The patch is attached to bugzilla entry 3637. ( 
>>> http://d.puremagic.com/issues/show_bug.cgi?id=3637 )
>> Nobody cares about it? (Except someone commenting in bugzilla, but 
>> where's Andrei and Walter?)
> 
> I have already discussed this patch with Andrei and Walter (and the rest of the compiler/phobos crew), they are on board.  I am a little disappointed with the response here, but I guess it could be attributed to the barrier to experimentation (you have to apply the patch and compile the runtime) and the likelihood that many are on vacation right now (as I will be as of tonight).

Now Andrei replied and said everyone was "excited".

I'm expecting most people to wait until there's a dmd release with the 
patch included (I'll do that).

>> Also, what again were the arguments against adding a "capacity" field in 
>> the slice struct to deal with the append problem?
> 
> The problem is that adding a capacity field only works if the object is a reference type.  A value type will make either choose to make copies of the capacity field, resulting in the same stomping problem (and maybe worse), or will zero out one of the copy's capacity field, resulting in an artificial limitation for appending.  The only true way to make this work right is to have a capacity field shared among all the aliases to the same data, and the most logical place for that is in the allocated block.

I see... Go makes slices value types (which references the array data), 
but I'm not sure how or if they handle this situation.

Then, why use a cache to find the pointer to the capacity field inside 
the array memory block? You could just store a pointer to the capacity 
field in the slice.

Something like this, I guess:

struct slice {
     void* ptr;
     size_t length;
     array_info* array;
}

//can be located at the start of an array memory block
struct array_info {
     size_t max_length;
     size_t capacity;
}

void slice_set_length(ref slice s, size_t newlen) {
     if (newlen < s.length) {
         s.length = newlen;
         return;
     }

     //was not allocated from the GC?
     if (!s.array)
         goto new_array;

     //danger of stomping?
     if (s.length != s.array.max_length)
         goto new_array;

     //memory block too small?
     if (newlen > s.array.capacity)
         goto new_array;

     s.length = newlen;
     s.array.max_length = max(s.array.max_length, s.length);

     return;

new_array:
     ... allocate a memory block with array descriptor
     ... copy s[] into it
     ... set s to new array pointer, length, array descriptor
}

Would this work? (Except that I didn't include handling of array element 
default initialization.)

Actually, I think that's the same idea (as Andrei's original post on the 
array append cache thing), just much shorter and no indeterminism due to 
a volatile cache.

> -Steve