Array append proposal
Sergey Gromov
snake.scaly at gmail.com
Fri Oct 10 09:47:07 PDT 2008
Fri, 10 Oct 2008 11:11:40 -0400,
Steven Schveighoffer wrote:
> If this bit is set, then the array can be appended to in place if its length
> equals the length stored in the heap (and of course, the memory block
> contains enough space). If not, then a new array is allocated and the data
> copied; the length stored in the heap is left alone.
Then you'd better keep the memory block size there, too. It's the
memory block size check that kills performance right now.
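Something along these lines, say (all names hypothetical, just to show
the shape of it):

struct ArrayHeader
{
    size_t length;     // length of the "owning" array head
    size_t blockSize;  // cached block size, so no GC query on the hot path
}

// Append in place only if this slice is the array head and the block
// still has room (header assumed to sit at the start of the block).
// Otherwise the caller reallocates and copies.
bool canAppendInPlace(ArrayHeader* h, size_t sliceLen,
                      size_t extra, size_t elemSize)
{
    return sliceLen == h.length
        && ArrayHeader.sizeof + (h.length + extra) * elemSize
           <= h.blockSize;
}

With the size cached next to the length, the capacity check becomes one
comparison instead of a GC lookup.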
> * There are some threading issues to look at, but I am nowhere near as
> versed in this as some of you guys. I'd guess you can do something similar
> to Bartosz' thin locks? Except you would never expand the lock. The first
> time you lose contention, you assume you can't append in place, and allocate
> a new memory space. The only drawback is if the array which grabbed the
> lock decides it will allocate a new array as well, and the second attempt to
> append could have done it in place, you lose some performance there. But
> that's probably a rare corner case. The bit that is used for the 'is array
> head' can be used for lock contention. So something like this is stored in
> the heap:
>
> bit [31]: set if nobody is trying to expand the array, unset if it is currently
> being expanded.
> bit [0:30]: the stored length
>
> So in order to expand an array, you compare-and-swap the heap-stored
> word against your struct-local length+atBeginning, swapping in 0.
> * success: check with the GC whether the block can accommodate the new data; if so, store the new
> length back into the heap-stored length, with the MSB set; if not, allocate
> a new array, copying the data from the original.
> * failure: allocate a new array, copying the data from the original.
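If I read the scheme right, the expand attempt would look roughly like
this (a sketch with hypothetical names, using core.atomic for
concreteness):

import core.atomic;

enum uint HEAD_BIT = 1u << 31;  // set: array head, nobody expanding

// Succeeds only if we are the head and nobody else is expanding;
// swapping in 0 marks "expansion in progress".
bool tryAcquire(shared(uint)* stored, uint myLength)
{
    return cas(stored, myLength | HEAD_BIT, 0u);
}

// Publish the new length and set the head bit again.
void release(shared(uint)* stored, uint newLength)
{
    atomicStore(*stored, newLength | HEAD_BIT);
}

On a failed cas, or when the GC says the block is full, the appender
falls back to allocate-and-copy, exactly as described above. But: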
I don't think a primitive type should concern itself with thread safety,
so you can always keep a plain, non-atomic length in the array's
heap-stored length field.
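With the plain header from the first sketch the whole append path stays
trivial (again hypothetical, and deliberately not thread safe):

void append(T)(ref T[] arr, T value, ArrayHeader* h)
{
    if (canAppendInPlace(h, arr.length, 1, T.sizeof))
    {
        h.length = arr.length + 1;            // plain store, no CAS
        arr = arr.ptr[0 .. arr.length + 1];   // grow over owned memory
        arr[$ - 1] = value;
    }
    else
        arr ~= value;                         // reallocate and copy
}

If two threads race on this, that's the caller's bug, the same as with
any other plain store.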
> One final caveat. Currently there exists code like this for building a
> large array:
>
> int[] arr = new int[10000];
> arr.length = 0;
> for(int i = 0; i < 10000; i++)
>     arr ~= i;
>
> In the new scheme, should the code:
>
> arr.length = 0;
>
> cause the array to try to shrink the heap-stored length to 0? My
> preference is no, as this is going to be much less efficient than
> something like:
>
> int[] arr = new int[10000];
> for(int i = 0; i < 10000; i++)
>     arr[i] = i;
Imagine the code:

int[] arr = new int[10000];
int[] arr2 = arr;
arr.length = 0;
for (int i = 0; i < 10000; i++)
    arr ~= i;

arr2's contents are overwritten and you can do nothing about that. A
downsize should convert arr into an ordinary slice, so this sort of
optimization stops working. Well, it's a hack anyway, but then you need
a legitimate way of pre-allocating an array while leaving its length
zero.
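A reserve primitive would be enough, for instance (hypothetical, not
part of the proposal):

int[] arr;
arr.reserve(10000);   // allocate the block, visible length stays 0
for (int i = 0; i < 10000; i++)
    arr ~= i;         // appends in place, no reallocation

i.e. allocate the block up front while both the slice length and the
heap-stored length stay zero.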