Array append proposal

Steven Schveighoffer schveiguy at yahoo.com
Fri Oct 10 11:42:53 PDT 2008


"Sergey Gromov" wrote
> Fri, 10 Oct 2008 13:30:14 -0400,
> Steven Schveighoffer wrote:
>> "Sergey Gromov" wrote
>> > Fri, 10 Oct 2008 11:11:40 -0400,
>> > Steven Schveighoffer wrote:
>> >> If this bit is set, then the array can be appended to in place if its
>> >> length
>> >> equals the length stored in the heap (and of course, the memory block
>> >> contains enough space).  If not, then a new array is allocated, and 
>> >> the
>> >> data
>> >> copied, the length stored in the heap is left alone.
>> >
>> > Then you'd better keep the memory block size there, too.  It's the
>> > memory block size check which kills performance right now.
>>
>> I'm not trying to fix performance.  I'm trying to fix slice corruption
>> issues (without adversely affecting performance).
>>
>> However, it would be nice to have this stored also.  It's definitely
>> possible, but the overhead might be a bit too much.  Imagine allocating a 
>> 4
>> byte array, and having 8 bytes of overhead.
>
> Garbage collector granularity is 16 so there is no overhead in your
> example.  A 9-byte array would take 2 memory blocks instead of one
> though.

Very true, I didn't think of it that way.  So probably having the capacity 
stored in the heap is acceptable.  It just changes the threshold at which 
the overhead is expensive, 8 bytes instead of 16 (don't forget the sentinel 
byte).

>
>> If it turns out everything needs to be 8-byte aligned (not sure, I'm not
>> really an expert about that kind of stuff), then yes, putting in a 
>> capacity
>> field would be a great addition.
>>
>> >
>> >> * There are some threading issues to look at, but I am nowhere near as
>> >> versed in this as some of you guys.  I'd guess you can do something
>> >> similar
>> >> to Bartosz' thin locks?  Except you would never expand the lock.  The
>> >> first
>> >> time you lose contention, you assume you can't append in place, and
>> >> allocate
>> >> a new memory space.  The only drawback is if the array which grabbed 
>> >> the
>> >> lock decides it will allocate a new array as well, and the second 
>> >> attempt
>> >> to
>> >> append could have done it in place, you lose some performance there. 
>> >> But
>> >> that's probably a rare corner case. The bit that is used for the 'is
>> >> array
>> >> head' can be used for lock contention.  So something like this is 
>> >> stored
>> >> in
>> >> the heap:
>> >>
>> >> bit [31]: set if nobody is trying to expand array, unset if it is
>> >> currently
>> >> being expanded.
>> >> bit [0:30]: the stored length
>> >>
>> >> So in order to expand an array, you compare and swap with your
>> >> struct-local
>> >> length+atBeginning, swapping it for 0.
>> >> *  success: check GC if block can accomodate new data;  if so, store 
>> >> the
>> >> new
>> >> length back into the heap-stored length, with the MSB set;  if not,
>> >> allocate
>> >> a new array, copying the data from the original.
>> >> *  failure: allocate a new array, copying the data from the original.
>> >
>> > I don't think a primitive type should concern itself with thread 
>> > safety.
>> > So you can always keep simple length in the array's global length 
>> > field.
>>
>> Except that this works today without threading issues.  Two threads can
>> append to the same array, one appending in place and the other allocating 
>> a
>> new array (because the size got too big).  However, there are cases where 
>> it
>> doesn't work.
>
> I think it's a coincidence mostly related to thread-safety of the GC.
> Also, atomic swaps are not free, and additional ifs are not free.  I'm
> not sure this feature is required.

I don't know much about the penalty from atomic swap.  I'm not a CPU guru, 
so I assumed they were quick, at least quick enough to be acceptable.  If 
they aren't, then the proposal is still valid without the thread handling. 
You just need to implement threading constructs around arrays, similar to 
today.

BTW, you still need to do an if to check if the length is equal, even if you 
don't do the compare and swap.

-Steve 





More information about the Digitalmars-d mailing list