Enhanced array appending

Thu Dec 24 10:39:21 PST 2009

Steven Schveighoffer wrote:
>> I see no problem with adding an additional field to usual slices. 
>> There's also some benefit: clearer semantics (if you make guaranteed 
>> performance part of the semantics) and a much simpler runtime.
> 
> Simpler runtime is not any benefit IMO.  Nobody cares how array appending works, they just know the semantics.  The semantics wouldn't be any different with your type, you still have all the same semantics, so I don't really get that point.

Maybe...

>>> Your idea is essentially the idea behind my patch, except you can get the array info without having an additional pointer.  The cost of lookup for the block info is mitigated by the cache, making lookups cost very little.
>> But the cache is volatile, and a cache entry could be easily flushed by 
>> doing many array append operations.  A single string processing function 
>> could replace the entire cache with entries for string temporaries, and 
>> cause nasty performance regressions in unrelated parts of the program 
>> (e.g. in places where you really don't want each append operation to 
>> reallocate the full array).
> 
> I'm not sure you understand the semantics.  When an array's block info is in the cache, no lookup of the block info from the GC is required, so the lock isn't taken, and the lookup is as simple as a linear search through 8 elements.  But when it is not in the cache, it is looked up via the GC, requiring a lock and a look through all the memory pools (I'm unsure of the performance, but it clearly is slower).  However, a cache miss does not mean an automatic reallocation, it just means the lookup of the block info is slower.  It can *still* be extended into the block.  Basically, the algorithm performance of appending with a cache miss is the same as a cache hit, just with a larger constant.

Sorry about that, I didn't have a close look at the patch. I guess I was 
more thinking about Andrei's original proposal (and how I thought it may 
be implemented).

It seems you store the length field inside the array's memory block 
(instead of the cache, which just speeds up gc_query). That's awesome, 
but I'm going to complain again: now you have to keep a length field for 
*all* memory allocations, not only arrays! For most object allocations, 
this means 4 more bytes additional overhead. Also, if you use GC.malloc 
directly, and the user tries to append to slices to it, your code may 
break. GC.malloc doesn't seem to pad the memory block with a length field.

I must say that I find your argumentation strange: didn't you say adding 
an additional field to the slice struct is too much overhead?

Also, a solution which keeps the pointer to the array length in the 
slice struct would still be faster. The cache lookup cost is not zero.

Anyway, now that semantics and performance are somewhat sane, none of 
these remaining issues are too important. Thanks Steve and Andrei!

>> The worst is, this behavior will be 
>> relatively indeterministic. It's also hard to explain to "normal" 
>> programmers.
>>
>> Also, isn't it completely useless? If the programmer wants to append to 
>> an array, he'll use an append struct just to be sure, and if he doesn't 
>> want to append, inefficient array slice appending wouldn't be a problem.
> 
> Again, I think these points may be fostered by a misunderstanding of the patch.  The performance is much better than current performance, and people have had no problem dealing with the current runtime except in special circumstances.
> 
> -Steve