Destroying structs (literally)

Sat Aug 30 08:35:57 PDT 2014

On Saturday, 30 August 2014 at 15:18:52 UTC, Orvid King wrote:
> On 8/30/2014 4:22 AM, "Marc Schütz" <schuetzm at gmx.net> wrote:
>> On Saturday, 30 August 2014 at 03:54:41 UTC, Orvid King wrote:
>>> On 8/29/2014 2:52 PM, "Marc Schütz" <schuetzm at gmx.net> wrote:
>>>> On Friday, 29 August 2014 at 19:01:51 UTC, Andrei 
>>>> Alexandrescu wrote:
>>>>> On 8/29/14, 3:53 AM, "Marc Schütz" <schuetzm at gmx.net> wrote:
>>>>>> Jacob Carlborg just recently brought this up in another 
>>>>>> thread.
>>>>>> Isn't it
>>>>>> kind of consensus that calling a destructor from the GC is 
>>>>>> not a good
>>>>>> idea because of the restrictions that apply in this 
>>>>>> context? Andrei
>>>>>> even
>>>>>> wanted to deprecate destructors for classes because of 
>>>>>> this. Maybe a
>>>>>> better direction would be to separate the concepts of 
>>>>>> destruction and
>>>>>> finalization, and introduce two kinds of "destructors" for 
>>>>>> them.
>>>>>
>>>>> I think we need to stay with what we have. Adding a 
>>>>> distinct kind of
>>>>> destructor might be interesting. -- Andrei
>>>>
>>>> Our idea was that an additional destructor (let's call it a 
>>>> finalizer)
>>>> would be helpful because it is backward compatible. The 
>>>> compiler could
>>>> make some validity checks on it, at the least make it 
>>>> nothrow, maybe
>>>> @nogc (but I believe we can relax this restriction), pure 
>>>> (?).
>>>> Disallowing access to references (because they could point 
>>>> to already
>>>> destroyed objects) is unfortunately not feasible, because we 
>>>> can't
>>>> distinguish GC pointers from other ones. To avoid the need 
>>>> for code
>>>> duplication, finalizers could always be called implicitly by 
>>>> destructors
>>>> (assuming everything that is allowed in finalizers is also 
>>>> permitted in
>>>> destructors).
>>>>
>>>> Calling destructors from the GC could later be phased out. 
>>>> It is
>>>> technically not a breaking change, because there never was a 
>>>> guarantee
>>>> that they'd be called anyway.
>>>
>>> I would say that all of those restrictions, except for 
>>> nothrow, are
>>> dependent on the current GC implementation. It is possible to 
>>> write
>>> the GC in such a way that you can do GC allocations in a 
>>> destructor,
>>> as well as access any GC references you want. The only thing 
>>> with the
>>> GC references is that there's no way to guarantee that the 
>>> referenced
>>> objects won't have already had their destructor called when 
>>> the
>>> current destructor is being called.
>>
>> Hmmm... could the GC zero those references that it already 
>> destroyed,
>> before calling the finalizer? Don't know how this would affect
>> performance, but it would only be necessary if a finalizer 
>> exists (could
>> even be restricted to those references that are accessible from
>> non-trivial finalizers, i.e. if a struct has GCed pointers and 
>> an
>> embedded struct member with a finalizer, but no finalizer of 
>> its own,
>> the compiler would probably generate one that only calls the 
>> member's
>> finalizer, but this would have no access to its parent's 
>> pointers).
>>
>> You're right that many of the restrictions are only necessary 
>> because of
>> the current GC implementation. Even the fact that garbage 
>> collection can
>> happen in any thread could theoretically be changed. Even more
>> complicated: I can imagine that with the upcoming allocator 
>> work there
>> could be several different GC implementations, even used in 
>> parallel in
>> the same program, each with different capabilities and 
>> restrictions.
>> It's clear that this requires coordination.
>
> The references issue can be gotten around by marking an 
> allocation that needs finalization as if it were alive. It does 
> mean however that finalizable allocations will live through 
> more than one collection. I believe this is how .Net currently 
> handles them, as I don't remember anything in the spec about 
> restrictions on what's referenced in destructors, nor have I 
> had issues referencing otherwise dead allocations in them.

The problem is not only dereferencing those pointers (though 
depending on how the GC works even this might be racy, i.e. the 
memory location could have been reused already), but that a 
destructor/finalizer has already run on the referenced object. It 
is thus potentially in an invalid state. Even copying it is 
dangerous, because you're then creating a live object from that 
invalid state, which will itself be destroyed again at some point. 
This might lead to double frees of dependent manually allocated 
objects, or potentially other problems. Then there's the 
possibility of "resurrecting" such an object by storing a 
reference to it somewhere else during finalization.
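
To make the copying hazard concrete, here is a toy model in Python (not D, and not how any real GC is implemented). All names (ManualBuffer, Owner, Holder, demo) are made up for illustration. It assumes the collector is free to finalize unreachable objects in any order and that copying an object shares its manually allocated buffer.

```python
class ManualBuffer:
    """Hypothetical stand-in for manually allocated memory."""
    def __init__(self):
        self.free_count = 0

    def free(self):
        self.free_count += 1
        if self.free_count > 1:
            raise RuntimeError("double free")


class Owner:
    """Object whose finalizer releases a manually allocated buffer."""
    def __init__(self, buf):
        self.buf = buf

    def finalize(self):
        self.buf.free()

    def copy(self):
        # Shallow copy: the new Owner shares the same buffer.
        return Owner(self.buf)


class Holder:
    """Object whose finalizer touches another, possibly dead, object."""
    def __init__(self, other):
        self.other = other
        self.saved = None

    def finalize(self):
        # Copies an object that may already have been finalized.
        self.saved = self.other.copy()


def demo():
    buf = ManualBuffer()
    owner = Owner(buf)
    holder = Holder(owner)
    # Assumed finalization order: owner first, then holder.
    owner.finalize()
    holder.finalize()
    # The copy is a live object built from invalid state; destroying
    # it later frees the buffer a second time.
    try:
        holder.saved.finalize()
    except RuntimeError as err:
        return str(err)
    return "ok"
```

With the order assumed above, demo() reports the double free. If the holder is finalized before the owner, the same code frees the buffer only once, which is what makes bugs of this kind hard to reproduce.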

Doing the scanning and finalization separately like this is 
_probably_ safe:

1. Marking phase as usual.
2. Select the objects that have a finalizer, and clear all 
references in them that point to objects that are now 
unreachable. (This requires a precise GC.)
3. Call the finalizers.

Not sure what to do about things that may or may not be 
references.
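
The three steps can be sketched as a toy simulation, again in Python rather than D. Obj, mark and collect are hypothetical names, and the model assumes every reference field is known precisely, so it sidesteps the question of ambiguous words entirely. It also assumes that "objects that have a finalizer" means the unreachable ones that are about to be finalized.

```python
class Obj:
    def __init__(self, name, finalizer=None):
        self.name = name
        self.refs = {}              # field name -> Obj or None (precise layout)
        self.finalizer = finalizer  # callable taking the object, or None


def mark(roots):
    """Step 1: return the ids of all objects reachable from the roots."""
    reachable, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in reachable:
            continue
        reachable.add(id(obj))
        stack.extend(r for r in obj.refs.values() if r is not None)
    return reachable


def collect(heap, roots):
    reachable = mark(roots)
    dead = [o for o in heap if id(o) not in reachable]

    # Step 2: in every dead object that has a finalizer, clear the
    # references that point to objects that are now unreachable.
    for obj in dead:
        if obj.finalizer is not None:
            for field, target in list(obj.refs.items()):
                if target is not None and id(target) not in reachable:
                    obj.refs[field] = None

    # Step 3: run the finalizers; they can only see live objects.
    for obj in dead:
        if obj.finalizer is not None:
            obj.finalizer(obj)

    # The survivors are what remains of the heap.
    return [o for o in heap if id(o) in reachable]
```

Because step 2 finishes before any finalizer runs, the result does not depend on the order of step 3. A finalizer sees either a still-live object or a null reference, never an already finalized one, and it has nothing dead to resurrect.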