auto classes and finalizers

Sun Apr 9 13:06:07 PDT 2006

Bruno Medeiros wrote:
> Sean Kelly wrote:
> 
>> Jarrett Billingsley wrote:
>>
>>> "Sean Kelly" <sean at f4.ca> wrote in message 
>>> news:e10pk7$2khb$1 at digitaldaemon.com...
>>>
>>>>     - a type can have a destructor and/or a finalizer
>>>>     - the destructor is called upon a) explicit delete or b) at end 
>>>> of scope for auto objects
>>>>     - the finalizer is called if allocated on the gc heap and the
>>>>       destructor has not been called
>>>
>>>
>>> Would you mind explaining why exactly there needs to be a difference 
>>> between destructors and finalizers?  I've been following all the 
>>> arguments about this heap vs. auto classes and dtors vs. finalizers, 
>>> and I still can't figure out why destructors _can't be the 
>>> finalizers_.  Do finalizers do something fundamentally different from 
>>> destructors? 
>>
>>
>> Since finalizers are called when the GC destroys an object, they are 
>> very limited in what they can do.  They can't assume any GC managed 
>> object they have a reference to is valid, etc.  By contrast, 
>> destructors can make this assumption, because the object is being 
>> destroyed deterministically.  I think having both may be too confusing 
>> to be worthwhile, but it would allow for things like this:
>>
>>     class LinkedList {
>>         ~this() { // called deterministically
>>             for( Node n = top; n; ) {
>>                 Node t = n.next;
>>                 delete n;
>>                 n = t;
>>             }
>>             finalize();
>>          }
>>
>>          void finalize() { // called by GC
>>              // nodes may have already been destroyed
>>              // so leave them alone, but special
>>              // resources could be reclaimed
>>          }
>>     }
>>
>> The argument against finalizers, as Mike mentioned, is that you 
>> typically want to reclaim such special resources deterministically, so 
>> letting the GC take care of this 'someday' is of questionable utility.
>>
>>
>> Sean
> 
> 
> Ok, I think we can tackle this problem in a better way. So far, people 
> have been thinking about the fact that when destructors are called in a 
> GC cycle, they are called with finalizer semantics (i.e., you don't know 
> if the member references are valid or not, thus you can't use them).
> 
> This is a problem when in a destructor, one would like to destroy 
> component objects (as the Nodes of the LinkedList example).
> 
> 
> Some ideas were discussed here, but I didn't think any were fruitful. 
> Like:
>  *Forcing all classes with destructors to be auto classes -> doesn't add 
> any usefulness, instead just nuisances.
>  *Making the GC destroy objects in an order that keeps member 
> references valid -> has a high performance cost and/or is probably just 
> not possible (circular references?).
> 
> 
> Perhaps another way would be to have the following behavior:
> - When a destructor is called during a GC cycle (i.e., "as a finalizer") 
> for an object, the member references are not valid and cannot be 
> dereferenced, *but they can be deleted*. A member is deleted iff it has 
> not been deleted already.
> I think this can be done without significant overhead. At the end of a 
> GC cycle, the GC already has a list of all objects that are to be 
> deleted. Thus, in the release phase, each of those objects could carry a 
> flag indicating whether it was already deleted or not. Then, when 
> LinkedList deletes a Node, the delete is only performed if the Node has 
> not already been deleted.
> 
> 
> Still, while the previous idea might be good, it's not optimal, 
> because we are not clearly perceiving the problem at hand. What 
> we *really* want is to directly couple the lifecycle of a component 
> (member) object with its composite (owner) object. A Node of a 
> LinkedList has the same lifecycle as its LinkedList, so a Node shouldn't 
> even be an independent garbage-collected element.
> 
> What we want is an allocator that allocates memory that is not to be 
> reclaimed by the GC (but which is to be scanned by the GC). Its behavior 
> is exactly like the allocator of 
> http://www.digitalmars.com/d/memory.html#newdelete but it should come 
> with the language and be available for all types. With usage like:
> 
>   class LinkedList {
>     ...
>     void Add(Object obj) {
>       Node node = mnew Node(blabla);
>       ...
>     }
> 
> Thus, when the destructor is called upon a LinkedList, either 
> explicitly or by the GC, the Node references will always be valid. One 
> has to be careful now, as mnew'ed objects are effectively under manual 
> memory management, and so every mnew must have a corresponding delete, 
> lest there be dangling pointers or memory leaks. Nonetheless, it seems to 
> be the only sane solution to this problem.
> 
> 
> Another interesting addition is to extend the concept of auto to class 
> members. Just as auto currently couples the lifecycle of a variable to 
> the enclosing function, an auto class member would couple the lifecycle 
> of the member to its owner object. It would get deleted implicitly when 
> the owner object got deleted. Here is another (made up) example:
> 
>   class SomeUIWidget {
>     auto Color fgcolor;
>     auto Color bgcolor;
>     auto Size size;
>     auto Image image;
>     ...
> 
> The auto members would then have to be initialized in a constructor or 
> something similar (the exact restrictions might vary, such as being 
> final or not).
> 
> 

Regardless of how it's implemented, what's needed is a bit of 
consistency. Currently, dtors are invoked with two entirely different 
world-states: with valid state, and with unspecified state. What makes 
this generally unworkable is the fact that (a) the difference in state 
is often critical to the operation of the dtor, and (b) there's no clean 
way to tell the difference.

I use a bit of a hack to distinguish between the two: a common module 
has a global variable set to true when the enclosing module-dtor is 
invoked. This obviously depends upon module-dtors being first (which 
they currently are, but that is not in the spec). Most of you will 
probably be going "eww" at this point, but it's the only way I found to 
make dtors consistent and thus usable. Further, this is only workable if 
the dtor itself can be abandoned when invoked with unspecified state, 
which prohibits the use of dtors for a whole class of cleanup concerns 
and forces one to defer to the dispose() or close() pattern ~~ some say 
anti-pattern.
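
For anyone who hasn't seen the hack, here is a minimal sketch of the 
idea. The module name `shutdown`, the flag `terminating` and the class 
`Connection` are made up for illustration, and the whole thing leans on 
the unspecified ordering noted above (module dtors running before the 
final GC sweep):

```d
module shutdown;   // hypothetical "common module"

// True once program teardown has begun.
bool terminating = false;

static ~this()
{
    // Assumption: module dtors run before the GC's final sweep.
    // That is current behavior, not something the spec promises.
    terminating = true;
}

class Connection   // hypothetical class that owns something
{
    private Object handle;   // GC-managed member

    ~this()
    {
        if (terminating)
            return;   // unspecified state: handle may already be gone

        // Otherwise assume valid state: members may be used and
        // released here.
    }
}
```

Note that the flag only separates the shutdown sweep from everything 
else. A dtor run by a collection in the middle of the program would 
still see `terminating` as false, so this is a partial workaround 
at best.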

As I understand it, the two states correspond to (1) an explicit 
'delete' of the object, which includes "auto" usage; and (2) implicit 
cleanup via the GC.

The suggestion to restrict dtors to 'auto' classes is a means to limit 
dtors to #1 above; thus at least making them consistent. However, there 
are common cases that #1 does not allow for (I'm thinking specifically 
of object lifetimes that are not related to scope ~ such as time-based). 
That would need to be addressed somehow?

Turning to your suggestions ~ the 'marking' of references such that they 
can be "deleted" multiple times is perhaps questionable, partly because 
it appears to be specific to the GC implementation? I imagine an 
incremental collector would have problems with this approach, even if it 
were workable with a "stop the world" collector? I don't know for sure, 
but suspect there'd be issues there somewhere.

Whatever the resolution, consistency should be the order of the day.

- Kris