GC BlkAttr clarification. Programming in D pages 671, 672. About GC
Steven Schveighoffer
schveiguy at gmail.com
Thu Sep 4 03:05:38 UTC 2025
On Wednesday, 3 September 2025 at 23:05:45 UTC, H. S. Teoh wrote:
> On Wed, Sep 03, 2025 at 07:56:03PM +0000, Brother Bill via
> Digitalmars-d-learn wrote:
> [...]
>> C, C++ and D can play shenanigans with pointers, such as
>> casting them to size_t, which hides them from the GC.
>
> D's current GC is conservative, meaning that any value it sees
> that looks like it might be a pointer value, will be regarded
> as a pointer value.
>
> There is an optional precise GC that has been implemented, that
> can be turned on with compiled-in options or command line
> options, which uses a slightly less conservative scheme.
The recommendation is to avoid storing the *only* reference to an
allocated block in a `size_t`.
Even without the precise collector, the GC has pointer-containing
blocks and no-pointer blocks. This means that it's quite easy to
accidentally store the only copy of a pointer in a `size_t` that
will not be scanned, even with the conservative GC.
You should only store pointers as `size_t` "if you know what you
are doing". Otherwise do not do this.
It is fine to make a temporary *copy* of a pointer in a `size_t`,
for example to examine the bits inside, as long as the original
pointer is left alone.
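A minimal sketch of the difference, assuming the default druntime GC. The variable names are made up for illustration:

```d
import core.memory : GC;

void main()
{
    int* p = new int;
    *p = 42;

    // Fine: a temporary integer copy, used only to look at the bits.
    // The real pointer `p` stays where the GC can see it.
    size_t bits = cast(size_t) p;
    assert(bits % int.alignof == 0);
    assert(*p == 42);

    // Risky: a size_t[] has no pointer-typed elements, so its block is
    // allocated as NO_SCAN. An address stored in it is invisible to the GC.
    size_t[] hidden = new size_t[](4);
    assert((GC.query(hidden.ptr).attr & GC.BlkAttr.NO_SCAN) != 0);
    hidden[0] = bits; // does NOT keep *p alive once `p` itself is gone
}
```

Here `hidden[0]` is harmless only because `p` is still live. If `hidden` held the sole copy of the address, the block could be collected.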
> [...]
>> GC.calloc can allocate memory for a slice of MyClass
>> instances. The developer may run GC.free to free the
>> allocated memory. But GC may perform its own garbage
>> collection of GC allocated memory blocks.
`GC.free` is going to free the memory. It will NOT run
finalizers. It will not collect it again later. I want to make
that clear.
If you do not explicitly free the memory, and it becomes garbage,
then the GC will collect it.
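As a small sketch of that behavior (the `Tracked` struct and the two function names are hypothetical, used only to count destructor calls):

```d
import core.memory : GC;

// Hypothetical struct that counts how often its destructor runs.
struct Tracked
{
    static int destroyed;
    ~this() { ++destroyed; }
}

// Frees a block without finalizing it; returns the destructor count.
int freeOnly()
{
    Tracked.destroyed = 0;
    auto a = new Tracked;
    GC.free(a);      // memory is released, the destructor is NOT run
    return Tracked.destroyed;
}

// Runs the destructor explicitly, then frees; returns the destructor count.
int destroyThenFree()
{
    Tracked.destroyed = 0;
    auto b = new Tracked;
    destroy(*b);     // finalize by hand first
    GC.free(b);      // then release the memory
    return Tracked.destroyed;
}

void main()
{
    assert(freeOnly() == 0);
    assert(destroyThenFree() == 1);
}
```

If you want finalization together with an explicit free, call `destroy` before `GC.free`.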
As far as a slice of `MyClass` instances, if you mean a slice of
data that contains the fields of an array of classes, you should
be very cautious of this. The GC is not equipped to call
finalizers on such a structure, and so you likely will run into
lifetime issues.
For classes, I'd just stick with `new`.
For structs, you can quite easily allocate an array of structs,
and the GC can support finalization of that. I'd also recommend
just using `new` there.
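A short sketch of what "just use `new`" looks like for both cases. The types `MyClass` and `Point` here are placeholders, not taken from the book:

```d
// Placeholder class with a string member.
class MyClass
{
    string name;
    this(string name) { this.name = name; }
}

// Placeholder plain struct.
struct Point
{
    int x, y;
}

void main()
{
    // A slice of class references; each instance is created with `new`,
    // so the GC knows the real type of every object.
    auto objs = new MyClass[](3);
    foreach (ref o; objs)
        o = new MyClass("obj");

    // An array of structs, allocated and tracked by the GC.
    auto pts = new Point[](3);
    pts[1] = Point(1, 2);

    assert(objs.length == 3 && objs[2].name == "obj");
    assert(pts[0] == Point.init && pts[1].x == 1);
}
```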
>> Let's look at each attribute: (confirm if my analysis is
>> right, otherwise correct)
>>
>> FINALIZE - just before GC reclaims the memory, such as with
>> GC.free, call destructors, aka finalizers.
>
> This bit is probably best left untouched by user code, and left
> to the runtime to figure out when/how to use it.
In the latest compiler (2.111), this has been changed to a bit
that requests finalization at allocation time. The GC uses this
bit and the typeinfo passed in to determine the correct action.
Before, the bit was an implementation detail, and you had to know
exactly what you were asking for.
I do agree that you should basically leave this alone. But for
sure the new treatment of the bit is more robust than before.
Note: changing bits after allocation *does not* take this into
account. At that point you are modifying implementation details.
I really would like to get rid of these bits completely and use a
more reliable API (having a set of implementation bits as an
option is quite dangerous).
>> NO_SCAN - There may be false positives regarding byte values
>> that look like 'new' allocated pointers. This can result in
>> 'garbage' memory not being collected. If we are CERTAIN that
>> this memory block doesn't contain any pointers to 'new'
>> SomeClass allocated memory, then mark as NO_SCAN.
>
> Correct. Though if you're writing idiomatic D code, you'll
> almost never need to worry about this. Whenever you allocate
> an array whose elements are PODs (without any pointers), the
> allocator will automatically mark the memory NO_SCAN so that
> the GC doesn't waste time scanning such blocks. So things like
> implicit string allocations will be marked NO_SCAN, etc. If
> you're allocating an array or object that contains
> indirections, then NO_SCAN will not be set, so the GC will scan
> the interior of such blocks for pointers to other live objects.
I will add that the concern of scanning non-pointers is pretty
much obsolete with 64-bit addressing. It's still important to use
`NO_SCAN`, as it's quite common to allocate large blocks of data
that are just bytes (e.g. load a file). You don't want to waste
time scanning that, even if there are no false-positives to be
found in there.
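For illustration, a sketch of both ways to end up with a no-scan byte buffer. The helper `isNoScan` is a made-up name, and the 1 MiB size is arbitrary:

```d
import core.memory : GC;

// Hypothetical helper: true if the GC block containing p is marked NO_SCAN.
bool isNoScan(const void* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    enum size = 1 << 20; // 1 MiB of raw bytes, e.g. file contents

    // Explicit: ask the GC directly for a block it should never scan.
    void* raw = GC.malloc(size, GC.BlkAttr.NO_SCAN);
    assert(raw !is null && isNoScan(raw));

    // Idiomatic: `new` sees that ubyte has no pointers and sets
    // NO_SCAN for you.
    ubyte[] buf = new ubyte[](size);
    assert(isNoScan(buf.ptr));
}
```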
>> Question 1: if GC-calloc has allocated MyClass that has a
>> string 'name' member, which may expand in size, would be
>> still properly apply NO_SCAN.
I would say no, `NO_SCAN` is not appropriate here. A string
contains a pointer (it is a length plus a pointer), so the block
should be scanned.
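One way to make that decision mechanically is to ask the type whether it has indirections. This is only a sketch: `Student`, `Pixel`, and `attrFor` are hypothetical names, while `hasIndirections` comes from `std.traits`:

```d
import core.memory : GC;
import std.traits : hasIndirections;

struct Student { string name; int grade; } // hypothetical: holds a string
struct Pixel   { ubyte r, g, b; }          // hypothetical: plain bytes only

// Hypothetical helper: attributes for a raw block that will hold Ts.
uint attrFor(T)()
{
    return hasIndirections!T ? GC.BlkAttr.NONE : GC.BlkAttr.NO_SCAN;
}

void main()
{
    static assert(hasIndirections!string);  // a string carries a pointer
    static assert(hasIndirections!Student); // so does anything holding one
    static assert(!hasIndirections!Pixel);

    assert(attrFor!Student == GC.BlkAttr.NONE);
    assert(attrFor!Pixel == GC.BlkAttr.NO_SCAN);

    // `new` makes the same decision on its own.
    auto s = new Student("Ann", 3);
    assert((GC.query(s).attr & GC.BlkAttr.NO_SCAN) == 0);
}
```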
>> Question 2: if GC-calloc has allocated MyClass, which may
>> allocate new MyStudent(...), would that mean 'don't apply
>> NO_SCAN'?
>
> It's very simple. If a memory block may contain pointers, then
> it should not be NO_SCAN. If a memory block never contains any
> pointers, then it can (should) be marked NO_SCAN.
100% correct.
> Normal D code does not need to fiddle with GC flags.
Great advice!
>> NO_MOVE - For GC.realloc, if increasing memory allocated, and
>> it's not available, throw 'MEMORY_NOT_AVAILABLE' exception.
>
> Correct. You might want to use this flag if you have non-D code
> that might be holding pointers to this memory block, e.g., if
> you passed a pointer to some D array to C code which retains it
> in some C-managed pointer, and the C code expects the array to
> still be there later.
>
> It's not very often that such situations come up, though. When
> passing GC-allocated data to C code, it's generally a good idea
> to keep a reference to it inside D code so that the GC can find
> the reference anyway. Since D doesn't have a moving GC, this
> is really all you need to do. Again, unless you're doing
> something unusual, you probably don't need to touch the NO_MOVE
> flag.
No, this is not correct. `NO_MOVE` is supposed to mean that a
moving GC cannot move this block (and fix up pointers to it).
Given that we have a conservative GC, which scans the stack
conservatively *including C stacks*, and we will always have one,
I would say this bit should just be deprecated.
Indeed, it is completely ignored in the current GC.
>> APPENDABLE - For D internal runtime use. Don't mark this
>> yourself.
>
> Yes.
Also improved with D 2.111. The `APPENDABLE` bit is now an input
to malloc that tells the GC this is an array (including adjusting
the size to deal with padding space). The GC now handles array
runtime features directly, and so it understands what this means.
So in fact, this is a bit you can set, and there are currently
unexposed GC interface functions that can be used to manage the
array. They have not yet been exposed in `core.memory`, because
we are not sure if these are the final interfaces we want.
However, *allocating* an array with this bit will do exactly what
you expect (and managing the resulting array with the normal
array management functions such as appending or `capacity` will
work).
I do still recommend using `new`.
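For reference, a sketch of the normal array management functions on an array allocated with `new`. Nothing here touches the bits directly, and the sizes are arbitrary:

```d
void main()
{
    int[] a = new int[](4);

    // Appending goes through the runtime's array support.
    a ~= 5;
    assert(a.length == 5 && a[4] == 5);

    // `capacity` reports how many elements fit before a reallocation.
    assert(a.capacity >= a.length);

    // `reserve` asks the runtime to make room ahead of time.
    a.reserve(100);
    assert(a.capacity >= 100);
    assert(a[4] == 5); // existing contents are preserved
}
```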
>> NO_INTERIOR - This says that only the base address of the
>> block may be a target address of other GC allocated pointers.
>> All other possible pointers are 'false' pointers.
Yes, though I would say it like:
"only pointers found while scanning that point to the exact
target address may be considered pointers to the block."
Again, this is really only of great use in 32-bit addressing.
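A cautious sketch of how such a block might be allocated and used. The size and variable names are arbitrary, and the important part is that the base pointer is kept for as long as the block is needed:

```d
import core.memory : GC;

void main()
{
    enum size = 64 * 1024;

    // Raw bytes with no pointers inside (NO_SCAN), and only pointers
    // to the very start of the block count as references (NO_INTERIOR).
    ubyte* base = cast(ubyte*) GC.malloc(
        size, GC.BlkAttr.NO_SCAN | GC.BlkAttr.NO_INTERIOR);
    assert(base !is null);
    assert(GC.addrOf(base) == cast(void*) base);

    // Slicing from the base is fine while `base` itself stays alive.
    ubyte[] tail = base[size / 2 .. size];
    tail[] = 0xFF;
    assert(base[size - 1] == 0xFF);

    // Caution: `tail.ptr` points into the middle of the block. If it were
    // the only remaining reference, the GC would be free to collect the block.
}
```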
>> Perhaps I am missing the fundamentals of various D garbage
>> collectors.
> [...]
>
> The various GC flags are simply hints that let you influence
> the scanning process to some extent. The NO_SCAN bit means that
> upon reaching this block, don't bother scanning its contents to
> find more pointers (because there are none). The NO_INTERIOR
> bit means that if the GC finds a pointer-like value that looks
> like it points to the inside of this block, ignore it as a
> non-pointer, because pointers to this block only ever point to
> its head (the supposed pointer is actually not a real pointer,
> but an integer value that happens to have a pointer-like value).
>
> The other flags have very specific uses that, if you don't know
> what they actually do, you probably don't need them and
> shouldn't touch them.
Flags you should be able to use:
* `NO_SCAN`
* `FINALIZE`
* `APPENDABLE`
* `NO_INTERIOR` (very cautiously)
Do not use any other bits directly. A future version of D likely
will migrate these into function parameters instead of providing
bits.
-Steve