GC.sizeOf(array.ptr)
Steven Schveighoffer via Digitalmars-d
digitalmars-d at puremagic.com
Tue Sep 30 08:46:53 PDT 2014
On 9/30/14 10:24 AM, Dicebot wrote:
> On Tuesday, 30 September 2014 at 14:01:17 UTC, Steven Schveighoffer wrote:
>>> Assertion passes with D1/Tango runtime but fails with current D2
>>> runtime. This happens because `result.ptr` is not actually a pointer
>>> returned by gc_qalloc from array reallocation, but interior pointer 16
>>> bytes from the start of that block. Druntime stores some metadata
>>> (length/capacity I presume) in the very beginning.
>>
>> This is accurate, it stores the "used" size of the array. But it's
>> only the case for arrays, not general GC.malloc blocks.
>>
>> Alternative is to use result.capacity, which essentially looks up the
>> same thing (and should be more accurate). But it doesn't cover the
>> same inputs.
>
> Why is it stored in the beginning and not in the end of the block (like
> capacity)? I'd like to explore options of removing interior pointer
> completely before proceeding with adding more special cases to GC
> functions.
First, it is the capacity. It's just that the capacity lives at the
beginning of larger blocks.
The reason is the ability to extend page-sized blocks in place.
With smaller blocks (2048 bytes or less), the page is divided into equal
portions, and those can NEVER be extended. Any attempt to extend results
in a realloc into another block. Putting the capacity at the end makes
sense there for 2 reasons: 1. one byte is already reserved at the end to
prevent cross-block pointers, and 2. it doesn't cause alignment issues
(we can't very well offset a 16-byte block by 16 bytes). But
importantly, the capacity field does not move.
However, for blocks of page size and above (4096+ bytes), the original
(D1 and early D2) runtime would attempt to extend into the next page,
without moving the data. Thus we avoid copying the data into a new
block; we just set some bits and we're done.
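In-place extension can be tried directly with GC.extend from
core.memory. This is only a rough sketch: the block sizes are arbitrary,
and whether the extension succeeds depends on whether the following
pages happen to be free, so no output is shown.

```d
import core.memory : GC;
import std.stdio : writefln;

void main()
{
    // Allocate a two-page block; NO_SCAN because it holds no pointers.
    void* p = GC.malloc(4096 * 2, GC.BlkAttr.NO_SCAN);
    size_t before = GC.sizeOf(p);

    // Ask for at least one more page, preferably four.
    // GC.extend returns the new total block size, or 0 if it could not
    // extend in place. The pointer p itself is never moved.
    size_t after = GC.extend(p, 4096, 4096 * 4);

    writefln("size before=%s, GC.extend returned=%s", before, after);
}
```

A return value of 0 means the caller has to fall back to allocating a
new block and copying.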
But this poses a problem when the capacity field is stored at the end,
especially since we are caching the block info. The block info can
change with a call to GC.extend (whereas for a fixed-size block, the
block info CANNOT change). Depending on what "version" of the block info
you have, the "end" can be different, and you may end up corrupting
data. This is especially important for shared or immutable array blocks,
where multiple threads could be appending at the same time.
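The stale-info problem can be illustrated with GC.query. The sketch
below is not how druntime computes anything: the "field at the end"
address is a hypothetical layout made up for illustration, and the
variable names are my own. It only shows that a cached BlkInfo and a
fresh one can disagree about where the block ends, while agreeing on
where it starts.

```d
import core.memory : GC;
import std.stdio : writefln;

void main()
{
    void* p = GC.malloc(4096 * 2, GC.BlkAttr.NO_SCAN);

    // Pretend this is the cached copy of the block info.
    auto cached = GC.query(p);

    // Another appender extends the block in place (may fail, returning 0).
    size_t extended = GC.extend(p, 4096, 4096);

    // Block info as it really is now.
    auto fresh = GC.query(p);

    // Hypothetical layout: a size_t field in the last bytes of the block.
    size_t endFromCached = cast(size_t) cached.base + cached.size - size_t.sizeof;
    size_t endFromFresh  = cast(size_t) fresh.base + fresh.size - size_t.sizeof;

    writefln("extend returned %s", extended);
    writefln("same base: %s", cached.base is fresh.base);
    writefln("same end-of-block field address: %s", endFromCached == endFromFresh);
}
```

If the extension succeeds, the two "end" addresses differ, so code
working from the cached info would read or write the wrong location.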
So I made the call to put it at the beginning of the block, which
obviously doesn't change, and offset everything by 16 bytes to maintain
alignment.
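The resulting offset can be observed by comparing array.ptr with the
block start reported by GC.addrOf. A minimal sketch, with arbitrary
array sizes; the numbers printed depend on the druntime version, so
none are listed here.

```d
import core.memory : GC;
import std.stdio : writefln;

void main()
{
    auto small = new ubyte[](100);   // should land in a fixed-size bin
    auto large = new ubyte[](8192);  // page-sized or bigger

    foreach (arr; [small, large])
    {
        // Start of the GC block that contains the array data.
        void* base = GC.addrOf(arr.ptr);
        // Distance from the block start to the first array element.
        size_t offset = cast(size_t) arr.ptr - cast(size_t) base;

        writefln("length=%s offset=%s sizeOf(ptr)=%s sizeOf(base)=%s capacity=%s",
                 arr.length, offset,
                 GC.sizeOf(arr.ptr), GC.sizeOf(base), arr.capacity);
    }
}
```

With the layout described here, the large array should show a non-zero
offset, and GC.sizeOf(arr.ptr) should return 0 for it because the
pointer is interior to the block; arr.capacity works in both cases.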
It may very well be that we can put it at the end of the block instead,
and you can probably do so without much effort in the runtime
(everything uses CTFE functions to calculate padding and location of the
capacity). It has been such a long time since I did that, I'm no longer
sure of all the reasons not to do it. A look through the mailing list
archives might be useful.
-Steve