GC BlkAttr clarification. Programming in D pages 671, 672. About GC

Steven Schveighoffer schveiguy at gmail.com
Thu Sep 4 03:05:38 UTC 2025


On Wednesday, 3 September 2025 at 23:05:45 UTC, H. S. Teoh wrote:
> On Wed, Sep 03, 2025 at 07:56:03PM +0000, Brother Bill via 
> Digitalmars-d-learn wrote:
> [...]
>> C, C++ and D can play shenanigans with pointers, such as 
>> casting them to size_t, which hides them from the GC.
>
> D's current GC is conservative, meaning that any value it sees 
> that looks like it might be a pointer value, will be regarded 
> as a pointer value.
>
> There is an optional precise GC that has been implemented, that 
> can be turned on with compiled-in options or command line 
> options, which uses a slightly less conservative scheme.

The recommendation is to avoid making a `size_t` the *only* place 
that holds the address of an allocated block.

Even without the precise collector, the GC has pointer-containing 
blocks and no-pointer blocks. This means that it's quite easy to 
accidentally store the only copy of a pointer in a `size_t` that 
will not be scanned, even with the conservative GC.

You should only store pointers as `size_t` "if you know what you 
are doing". Otherwise do not do this.

It is fine to make a temporary *copy* of a pointer in a `size_t`, 
for example to examine the bits inside, as long as the original 
pointer is left alone.
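
A minimal sketch of the "temporary copy" case, assuming a 
hypothetical helper `isAligned16` (not part of druntime). The 
slice `data` stays the real reference that the GC can see; the 
`size_t` is only a short-lived copy used to look at the bits.

```d
import core.memory : GC;

// Hypothetical helper: reports whether an address is 16-byte aligned.
bool isAligned16(const(void)* p)
{
    // Temporary integer copy, used only to inspect the bits.
    immutable size_t bits = cast(size_t) p;
    return (bits & 0xF) == 0;
}

void main()
{
    int[] data = new int[](8);    // `data` remains the real, scannable reference
    immutable aligned = isAligned16(data.ptr);
    data[0] = 42;                 // the block is still reachable through `data`
    assert(data[0] == 42);

    // Risky pattern (not done here): keeping the address *only* as a size_t,
    // for example inside a NO_SCAN block, where the GC never looks for it.
}
```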

> [...]
>> GC.calloc can allocate memory for a slice of MyClass 
>> instances.  The developer may run GC.free to free the 
>> allocated memory.  But GC may perform its own garbage 
>> collection of GC allocated memory blocks.

`GC.free` is going to free the memory. It will NOT run 
finalizers. It will not collect it again later. I want to make 
that clear.

If you do not explicitly free the memory, and it becomes garbage, 
then the GC will collect it.
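
As a hedged illustration of the point about finalizers: if you 
want the destructor to run before an explicit `GC.free`, you have 
to run it yourself, for example with `destroy`. The `Resource` 
struct and its counter below are made up for this sketch.

```d
import core.memory : GC;

struct Resource
{
    static int destroyed;         // counts destructor runs (illustration only)
    int id;
    ~this() { ++destroyed; }
}

// Returns how many times the destructor ran for one allocate/free cycle.
int destroyThenFree()
{
    Resource.destroyed = 0;
    auto r = new Resource;
    r.id = 1;
    destroy(*r);                  // run the destructor explicitly
    GC.free(r);                   // releases the memory; runs no finalizer
    return Resource.destroyed;
}

void main()
{
    assert(destroyThenFree() == 1);
}
```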

As far as a slice of `MyClass` instances, if you mean a slice of 
data that contains the fields of an array of classes, you should 
be very cautious of this. The GC is not equipped to call 
finalizers on such a structure, and so you likely will run into 
lifetime issues.

For classes, I'd just stick with `new`.

For structs, you can quite easily allocate an array of structs, 
and the GC can support finalization of that. I also recommend just 
using `new` here.
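
A small sketch of what "just use `new`" looks like for both cases. 
The `MyClass` and `Point` types and the `makeObjects` helper are 
placeholders for illustration.

```d
class MyClass
{
    string name;
    this(string name) { this.name = name; }
}

struct Point { int x, y; }

// A slice of class references; each element is allocated with `new`.
MyClass[] makeObjects(size_t n)
{
    auto objects = new MyClass[](n);
    foreach (i, ref o; objects)
        o = new MyClass(i == 0 ? "first" : "other");
    return objects;
}

void main()
{
    auto objects = makeObjects(3);

    // A slice of structs, stored inline in a single GC block.
    auto points = new Point[](4);
    points[0] = Point(1, 2);

    assert(objects[0].name == "first");
    assert(points.length == 4 && points[3] == Point.init);
}
```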

>> Let's look at each attribute:  (confirm if my analysis is 
>> right,
>> otherwise correct)
>> 
>> FINALIZE - just before GC reclaims the memory, such as with 
>> GC.free,
>>            call destructors, aka finalizers.
>
> This bit is probably best left untouched by user code, and left 
> to the runtime to figure out when/how to use it.

In the latest compiler (2.111), this has been changed to a bit 
that requests finalization upon allocation. The GC uses this bit 
and the typeinfo passed in to determine the correct action. 
Previously the bit was an implementation detail, and you had to 
know exactly what you were asking for when setting it.

I do agree that you should basically leave this alone. But for 
sure the new treatment of the bit is more robust than before.

Note: changing bits after allocation *does not* take this into 
account; at that point you are modifying implementation details. 
I really would like to get rid of these bits completely and use a 
more reliable API (exposing a set of implementation bits as an 
option is quite dangerous).

>> NO_SCAN - There may be false positives regarding byte values 
>> that look like 'new' allocated pointers.  This can result in 
>> 'garbage' memory not being collected.  If we are CERTAIN that 
>> this memory block doesn't contain any pointers to 'new' 
>> SomeClass allocated memory, then mark as NO_SCAN.
>
> Correct.  Though if you're writing idiomatic D code, you'll 
> almost never need to worry about this.  Whenever you allocate 
> an array whose elements are PODs (without any pointers), the 
> allocator will automatically mark the memory NO_SCAN so that 
> the GC doesn't waste time scanning such blocks.  So things like 
> implicit string allocations will be marked NO_SCAN, etc.  If 
> you're allocating an array or object that contains 
> indirections, then NO_SCAN will not be set, so the GC will scan 
> the interior of such blocks for pointers to other live objects.

I will add that the concern of scanning non-pointers is pretty 
much obsolete with 64-bit addressing. It's still important to use 
`NO_SCAN`, as it's quite common to allocate large blocks of data 
that are just bytes (e.g. load a file). You don't want to waste 
time scanning that, even if there are no false-positives to be 
found in there.
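
A sketch of the "large block of bytes" case, assuming a 
hypothetical `isNoScan` helper built on `GC.query`. It shows an 
explicit `GC.malloc` with `NO_SCAN`, and that a plain `new ubyte[]` 
gets the same marking from the runtime without any flag handling.

```d
import core.memory : GC;

// Hypothetical helper: true if the GC block containing `p` is marked NO_SCAN.
bool isNoScan(void* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    enum size = 64 * 1024;

    // Raw buffer, e.g. for file contents: no pointers inside, so request NO_SCAN.
    auto raw = cast(ubyte*) GC.malloc(size, GC.BlkAttr.NO_SCAN);
    ubyte[] buffer = raw[0 .. size];
    buffer[] = 0;
    assert(isNoScan(raw));

    // The idiomatic route: the runtime marks pointer-free element types itself.
    auto bytes = new ubyte[](64);
    assert(isNoScan(bytes.ptr));
}
```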

>>           Question 1: if GC-calloc has allocated MyClass that 
>> has a
>>           string 'name' member, which may expand in size, 
>> would we
>>           still properly apply NO_SCAN?

I would say no. A `string` contains a pointer (it is a slice: a 
pointer plus a length), so the block should be scanned.

>>           Question 2: if GC-calloc has allocated MyClass, 
>> which may
>>           allocate new MyStudent(...), would that mean 'don't 
>> apply
>>           NO_SCAN'?
>
> It's very simple.  If a memory block may contain pointers, then 
> it should not be NO_SCAN.  If a memory block never contains any 
> pointers, then it can (should) be marked NO_SCAN.

100% correct.
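
One way to apply this rule mechanically is to let the compiler 
answer "may this type contain pointers?" with 
`std.traits.hasIndirections`. The sketch below is only an 
illustration: `blockAttrFor`, `Point` and `Student` are made-up 
names, and idiomatic code would use `new` and let the runtime 
pick the flag.

```d
import core.memory : GC;
import std.traits : hasIndirections;

struct Point   { int x, y; }             // no pointers
struct Student { string name; int age; } // `name` is a pointer plus a length

static assert(!hasIndirections!Point);
static assert( hasIndirections!Student);

// Hypothetical helper: chooses the scan attribute for a block of T values.
uint blockAttrFor(T)()
{
    static if (hasIndirections!T)
        return GC.BlkAttr.NONE;          // may contain pointers: must be scanned
    else
        return GC.BlkAttr.NO_SCAN;       // never contains pointers
}

void main()
{
    // Manual allocation of four Student values; the block must stay scannable.
    auto students = cast(Student*) GC.calloc(Student.sizeof * 4,
                                             blockAttrFor!Student);
    students[0].name = "Ann";
    assert(students[0].name == "Ann");
    assert(blockAttrFor!Point == GC.BlkAttr.NO_SCAN);
}
```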

> Normal D code does not need to fiddle with GC flags.

Great advice!

>> NO_MOVE - For GC.realloc, if increasing memory allocated, and 
>> it's not available, throw 'MEMORY_NOT_AVAILABLE' exception.
>
> Correct. You might want to use this flag if you have non-D code 
> that might be holding pointers to this memory block, e.g., if 
> you passed a pointer to some D array to C code which retains it 
> in some C-managed pointer, and the C code expects the array to 
> still be there later.
>
> It's not very often that such situations come up, though.  When 
> passing GC-allocated data to C code, it's generally a good idea 
> to keep a reference to it inside D code so that the GC can find 
> the reference anyway.  Since D doesn't have a moving GC, this 
> is really all you need to do.  Again, unless you're doing 
> something unusual, you probably don't need to touch the NO_MOVE 
> flag.

No, this is not correct. `NO_MOVE` is supposed to mean that a 
moving GC cannot move this block (and fix up pointers to it).

Given that we have a conservative GC (and will always have one), 
which scans the stack conservatively *including C stacks*, I would 
say this bit should just be deprecated.

Indeed, it is completely ignored in the current GC.
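
Since `NO_MOVE` does nothing, what matters when C code holds on to 
a pointer is that the GC can still find a reference to the block. 
Besides keeping a reference in D code, `core.memory` offers 
`GC.addRoot` and `GC.removeRoot` for this. The sketch below is an 
assumption-laden illustration: the `malloc`ed slot stands in for 
storage owned by a C library, which the GC does not scan, and 
`readThroughCSlot` is a made-up name.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Stores a GC pointer in C-owned memory and reads it back after a collection.
int readThroughCSlot()
{
    auto data = new int[](16);
    data[0] = 7;

    // Stand-in for C-side storage: malloc'ed memory is not scanned by the GC.
    auto slot = cast(int**) malloc((int*).sizeof);
    *slot = data.ptr;

    GC.addRoot(data.ptr);         // keep the block alive while only C refers to it
    data = null;
    GC.collect();
    immutable value = (*slot)[0]; // still valid: the root pins the block

    GC.removeRoot(*slot);         // let the GC reclaim it again
    free(slot);
    return value;
}

void main()
{
    assert(readThroughCSlot() == 7);
}
```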

>> APPENDABLE - For D internal runtime use.  Don't mark this 
>> yourself.
>
> Yes.

Also improved with D 2.111. The `APPENDABLE` bit is now an input 
to malloc that tells the GC this is an array (including adjusting 
the size to deal with padding space). The GC now handles array 
runtime features directly, and so it understands what this means.

So in fact, this is a bit you can set, and there are currently 
unexposed GC interface functions that can be used to manage the 
array. They have not yet been exposed in `core.memory`, because 
we are not sure if these are the final interfaces we want.

However, *allocating* an array with this bit will do exactly what 
you expect (and managing the resulting array with the normal 
array management functions such as appending or `capacity` will 
work).

I do still recommend using `new`.

>> NO_INTERIOR - This says that only the base address of the 
>> block may be a target address of other GC allocated pointers.  
>> All other possible pointers are 'false' pointers.

Yes, though I would say it like:

"only pointers found while scanning that point to the exact 
target address may be considered pointers to the block."

Again, this is really only of great use in 32-bit addressing.
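
A cautious sketch of allocating with `NO_INTERIOR`, assuming a 
large pointer-free buffer and a made-up helper `allocBigBuffer`. 
The contract you take on is that a pointer to the start of the 
block stays alive for as long as the buffer is in use; an 
interior pointer or a slice that starts past the base is not 
enough to keep it.

```d
import core.memory : GC;

// Hypothetical helper: allocates a pointer-free block that only its base
// address can keep alive.
ubyte* allocBigBuffer(size_t size)
{
    return cast(ubyte*) GC.malloc(size,
        GC.BlkAttr.NO_SCAN | GC.BlkAttr.NO_INTERIOR);
}

void main()
{
    enum size = 4 * 1024 * 1024;
    auto base = allocBigBuffer(size);

    base[0] = 1;
    base[size - 1] = 2;

    // Keep `base` itself around. Do not keep only `base + offset` or a slice
    // such as base[16 .. size]; those are interior pointers.
    assert(base[0] == 1 && base[size - 1] == 2);
}
```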

>> Perhaps I am missing the fundamentals of various D garbage 
>> collectors.
> [...]
>
> The various GC flags are simply hints that let you influence 
> the scanning process to some extent. The NO_SCAN bit means that 
> upon reaching this block, don't bother scanning its contents to 
> find more pointers (because there are none). The NO_INTERIOR 
> bit means that if the GC finds a pointer-like value that looks 
> like it points to the inside of this block, ignore it as a 
> non-pointer, because pointers to this block only ever point to 
> its head (the supposed pointer is actually not a real pointer, 
> but an integer value that happens to have a pointer-like value).
>
> The other flags have very specific uses that, if you don't know 
> what they actually do, you probably don't need them and 
> shouldn't touch them.

Flags you should be able to use:

* `NO_SCAN`
* `FINALIZE`
* `APPENDABLE`
* `NO_INTERIOR` (very cautiously)

Do not use any other bits directly. A future version of D likely 
will migrate these into function parameters instead of providing 
bits.

-Steve

