Array append proposal

Fri Oct 10 10:49:25 PDT 2008

Fri, 10 Oct 2008 13:30:14 -0400,
Steven Schveighoffer wrote:
> "Sergey Gromov" wrote
> > Fri, 10 Oct 2008 11:11:40 -0400,
> > Steven Schveighoffer wrote:
> >> If this bit is set, then the array can be appended to in place if its 
> >> length
> >> equals the length stored in the heap (and of course, the memory block
> >> contains enough space).  If not, then a new array is allocated, and the 
> >> data
> >> copied, the length stored in the heap is left alone.
> >
> > Then you'd better keep the memory block size there, too.  It's the
> > memory block size check which kills performance right now.
> 
> I'm not trying to fix performance.  I'm trying to fix slice corruption 
> issues (without adversely affecting performance).
> 
> However, it would be nice to have this stored also.  It's definitely 
> possible, but the overhead might be a bit too much.  Imagine allocating a 4 
> byte array, and having 8 bytes of overhead.

Garbage collector granularity is 16 so there is no overhead in your 
example.  A 9-byte array would take 2 memory blocks instead of one 
though.

> If it turns out everything needs to be 8-byte aligned (not sure, I'm not 
> really an expert about that kind of stuff), then yes, putting in a capacity 
> field would be a great addition.
> 
> >
> >> * There are some threading issues to look at, but I am nowhere near as
> >> versed in this as some of you guys.  I'd guess you can do something 
> >> similar
> >> to Bartosz' thin locks?  Except you would never expand the lock.  The 
> >> first
> >> time you lose contention, you assume you can't append in place, and 
> >> allocate
> >> a new memory space.  The only drawback is if the array which grabbed the
> >> lock decides it will allocate a new array as well, and the second attempt 
> >> to
> >> append could have done it in place, you lose some performance there.  But
> >> that's probably a rare corner case. The bit that is used for the 'is 
> >> array
> >> head' can be used for lock contention.  So something like this is stored 
> >> in
> >> the heap:
> >>
> >> bit [31]: set if nobody is trying to expand array, unset if it is 
> >> currently
> >> being expanded.
> >> bit [0:30]: the stored length
> >>
> >> So in order to expand an array, you compare and swap with your 
> >> struct-local
> >> length+atBeginning, swapping it for 0.
> >> *  success: check GC if block can accomodate new data;  if so, store the 
> >> new
> >> length back into the heap-stored length, with the MSB set;  if not, 
> >> allocate
> >> a new array, copying the data from the original.
> >> *  failure: allocate a new array, copying the data from the original.
> >
> > I don't think a primitive type should concern itself with thread safety.
> > So you can always keep simple length in the array's global length field.
> 
> Except that this works today without threading issues.  Two threads can 
> append to the same array, one appending in place and the other allocating a 
> new array (because the size got too big).  However, there are cases where it 
> doesn't work.

I think it's a coincidence mostly related to thread-safety of the GC.  
Also, atomic swaps are not free, and additional ifs are not free.  I'm 
not sure this feature is required.