GC.sizeOf(array.ptr)

Steven Schveighoffer via Digitalmars-d digitalmars-d at puremagic.com
Tue Sep 30 08:46:53 PDT 2014


On 9/30/14 10:24 AM, Dicebot wrote:
> On Tuesday, 30 September 2014 at 14:01:17 UTC, Steven Schveighoffer wrote:
>>> Assertion passes with D1/Tango runtime but fails with current D2
>>> runtime. This happens because `result.ptr` is not actually a pointer
>>> returned by gc_qalloc from array reallocation, but interior pointer 16
>>> bytes from the start of that block. Druntime stores some metadata
>>> (length/capacity I presume) in the very beginning.
>>
>> This is accurate, it stores the "used" size of the array. But it's
>> only the case for arrays, not general GC.malloc blocks.
>>
>> Alternative is to use result.capacity, which essentially looks up the
>> same thing (and should be more accurate). But it doesn't cover the
>> same inputs.
>
> Why is it stored in the beginning and not in the end of the block (like
> capacity)? I'd like to explore options of removing interior pointer
> completely before proceeding with adding more special cases to GC
> functions.

First, it is the capacity. It's just that the capacity lives at the 
beginning of larger blocks.
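
A minimal sketch of how one might observe this, assuming a druntime with the layout described in this thread. `interiorOffset` is a hypothetical helper, not part of druntime; it only uses `GC.addrOf`, `GC.sizeOf` and the array `.capacity` property. The printed values depend on the runtime version, so none are asserted here.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Hypothetical helper: how far into its GC block does the array data start?
size_t interiorOffset(T)(T[] arr)
{
    auto base = GC.addrOf(arr.ptr);    // base address of the containing block
    return cast(size_t) arr.ptr - cast(size_t) base;
}

void main()
{
    auto small = new ubyte[](8);       // fits in a small fixed-size portion of a page
    auto big   = new ubyte[](8192);    // spans whole pages

    writefln("small: offset=%s capacity=%s", interiorOffset(small), small.capacity);
    writefln("big:   offset=%s capacity=%s", interiorOffset(big), big.capacity);

    // GC.sizeOf is documented to return 0 for interior pointers, which is
    // why passing big.ptr directly is unreliable when the offset is nonzero.
    writefln("GC.sizeOf(big.ptr)=%s GC.sizeOf(base)=%s",
             GC.sizeOf(big.ptr), GC.sizeOf(GC.addrOf(big.ptr)));
}
```

If the offset printed for the large array is nonzero, `big.ptr` is an interior pointer, and `big.capacity` is the safer way to ask how much room the array has.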

The reason is the ability to extend pages.

With smaller blocks (2048 bytes or less), the page is divided into equal 
portions, and those can NEVER be extended. Any attempt to extend results 
in a realloc into another block. Putting the capacity at the end of 
these blocks makes sense for two reasons: (1) one byte is already 
reserved to prevent cross-block pointers, and (2) it doesn't cause 
alignment issues, since we can't very well offset a 16 byte block by 
16 bytes. But importantly, the capacity field does not move.

However, for blocks of page size and above (4096+ bytes), the original 
(D1 and early D2) runtime would attempt to extend into the next page, 
without moving the data. Thus we avoid copying the data into a new 
block; we just set some bits and we're done.

But this poses a problem for when the capacity field is stored at the 
end -- especially since we are caching the block info. The block info 
can change with a call to GC.extend (whereas for a fixed-size block, the 
block info CANNOT change). Depending on what "version" of the block info 
you have, the "end" can be different, and you may end up corrupting 
data. This is especially important for shared or immutable array blocks, 
where multiple threads could be appending at the same time.
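
To make the hazard concrete, here is a small sketch with plain integers standing in for addresses. `BlockInfo`, `fieldAtEnd` and `fieldAtStart` are hypothetical names for illustration only and do not mirror the actual druntime types; the assumption is simply that an end-of-block field is located from the block's base and size.

```d
/// Hypothetical stand-in for a cached copy of a block's info.
struct BlockInfo
{
    size_t base;   // block start address
    size_t size;   // block size in bytes
}

/// Where a capacity field stored at the end of the block would live.
size_t fieldAtEnd(BlockInfo b) pure nothrow @nogc @safe
{
    return b.base + b.size - size_t.sizeof;
}

/// Where a capacity field stored at the start of the block would live.
size_t fieldAtStart(BlockInfo b) pure nothrow @nogc @safe
{
    return b.base;
}

void main()
{
    auto cached  = BlockInfo(0x10000, 4096);  // snapshot taken before an extend
    auto current = BlockInfo(0x10000, 8192);  // same block after growing by a page

    // A stale snapshot computes a different "end" than the current info...
    assert(fieldAtEnd(cached) != fieldAtEnd(current));
    // ...while the start of the block is the same for both.
    assert(fieldAtStart(cached) == fieldAtStart(current));
}
```

With the stale snapshot, a write to the "end" field would land in the middle of what is now array data.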

So I made the call to put it at the beginning of the block, which 
obviously doesn't change, and offset everything by 16 bytes to maintain 
alignment.
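
The resulting layout rule can be sketched as a pair of CTFE-able functions. This is only an illustration using the numbers quoted in this post (4096-byte pages, a 16-byte offset); `pageSize`, `largePrefix`, `dataOffset` and `capacityAtEnd` are hypothetical names, not the ones used in the runtime.

```d
enum size_t pageSize    = 4096; // page size quoted in the post (assumed)
enum size_t largePrefix = 16;   // offset quoted in the post (assumed)

/// Offset of the array data from the start of its block.
size_t dataOffset(size_t blockSize) pure nothrow @nogc @safe
{
    return blockSize >= pageSize ? largePrefix : 0;
}

/// True if the capacity field sits at the end of the block.
bool capacityAtEnd(size_t blockSize) pure nothrow @nogc @safe
{
    return blockSize < pageSize;
}

// Evaluated at compile time.
static assert(dataOffset(16) == 0 && capacityAtEnd(16));
static assert(dataOffset(2048) == 0 && capacityAtEnd(2048));
static assert(dataOffset(4096) == 16 && !capacityAtEnd(4096));
static assert(dataOffset(8192) % 16 == 0); // data stays 16-byte aligned

void main() {}
```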

It may very well be that we can put it at the end of the block instead, 
and you can probably do so without much effort in the runtime 
(everything uses CTFE functions to calculate padding and location of the 
capacity). It has been such a long time since I did that, I'm not very 
sure of all the reasons not to do it. A look through the mailing list 
archives might be useful.

-Steve

