auto classes and finalizers

Sun Apr 9 16:12:50 PDT 2006

Bruno Medeiros wrote:
> Sean Kelly wrote:
> 
>> Jarrett Billingsley wrote:
>>
>>> "Sean Kelly" <sean at f4.ca> wrote in message 
>>> news:e10pk7$2khb$1 at digitaldaemon.com...
>>>
>>>>     - a type can have a destructor and/or a finalizer
>>>>     - the destructor is called upon a) explicit delete or b) at end 
>>>> of scope for auto objects
>>>>     - the finalizer is called if allocated on the gc heap and the
>>>>       destructor has not been called
>>>
>>>
>>> Would you mind explaining why exactly there needs to be a difference 
>>> between destructors and finalizers?  I've been following all the 
>>> arguments about this heap vs. auto classes and dtors vs. finalizers, 
>>> and I still can't figure out why destructors _can't be the 
>>> finalizers_.  Do finalizers do something fundamentally different from 
>>> destructors? 
>>
>>
>> Since finalizers are called when the GC destroys an object, they are 
>> very limited in what they can do.  They can't assume any GC managed 
>> object they have a reference to is valid, etc.  By contrast, 
>> destructors can make this assumption, because the object is being 
>> destroyed deterministically.  I think having both may be too confusing 
>> to be worthwhile, but it would allow for things like this:
>>
>>     class LinkedList {
>>         ~this() { // called deterministically
>>             for( Node n = top; n; ) {
>>                 Node t = n.next;
>>                 delete n;
>>                 n = t;
>>             }
>>             finalize();
>>          }
>>
>>          void finalize() { // called by GC
>>              // nodes may have already been destroyed
>>              // so leave them alone, but special
>>              // resources could be reclaimed
>>          }
>>     }
>>
>> The argument against finalizers, as Mike mentioned, is that you 
>> typically want to reclaim such special resources deterministically, so 
>> letting the GC take care of this 'someday' is of questionable utility.
>>
>>
>> Sean
> 
> 
> Ok, I think we can tackle this problem in a better way. So far, people 
> have been thinking about the fact that when destructors are called in a 
> GC cycle, they are called with finalizer semantics (i.e., you don't know 
> if the member references are valid or not, thus you can't use them).
> 
> This is a problem when, in a destructor, one would like to destroy 
> component objects (such as the Nodes of the LinkedList example).
> 
> 
> Some ideas were discussed here, but I didn't think any were fruitful. 
> Like:
>  *Forcing all classes with destructors to be auto classes -> doesn't add 
> any usefulness, instead just nuisances.
>  *Making the GC destroy objects in an order that keeps member 
> references valid -> has a high performance cost and/or is probably just 
> not possible (circular references?).
> 
> 
> Perhaps another way would be to have the following behavior:
> - When a destructor is called during a GC (i.e., "as a finalizer") for 
> an object, then the member references are not valid and cannot be 
> referenced, *but they can be deleted*. A member will be deleted iff it 
> has not been deleted already.
> I think this can be done without significant overhead. At the end of a 
> GC cycle, the GC already has a list of all objects that are to be 
> deleted. Thus, in the release phase, that list could carry a flag 
> indicating whether each object has already been deleted. Then, when 
> LinkedList deletes a Node, the delete is only performed if the Node has 
> not already been deleted.

If an instance is deleted by the GC, the pointers that it may have to 
other instances (of the same class or of other classes) vanish. All 
of those other instances may or may not have other pointers pointing to 
them. So deleting (or destructing) a particular instance should not in 
any way "cascade" to those other instances.

On the next run, the GC _may_ notice that those other instances are not 
pointed-to by anything anymore, and then it may delete/destruct them.

---

So much for "regular" instance deletion. Then, we have the case where 
the instance "owns" some scarce resource (a file handle, a port, or some 
such). Such instances should be destructed in a _timely_ fashion _only_, 
right?

In other words, instances that need explicit destruction should be 
destructed _at_the_moment_ they become obsolete -- and not "mañana".

It is conceivable that the "regular" instances do not have explicit 
destructors (after all, their memory footprint would just be released to 
the free pool), whereas the "resource owning" instances really do need 
an explicit destructor.

Thus, the existence of an explicit destructor should be a sign that 
makes [us, Walter, the compiler, anybody] understand that such an 
instance _needs_ to be destructed _right_away_.

This makes one think of "auto". Now, there have been several comments 
like /auto can't work/ because we don't know the scope of the instance. 
That is just BS. Every piece of source code should be written 
"hierarchically" (that is, not the entire program as "one function"). 
When one refactors the goings-on in the program into short procedures, 
then all of a sudden it is not too difficult to use "auto" to manage the 
lifetime of instances.
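
A minimal sketch of what I mean, with the caveat that it is written 
against a later D compiler, where "auto" only means type inference. So 
instead of the "auto" discussed in this thread it uses scope (exit) 
together with destroy to get the same end-of-scope destruction. The names 
Resource, liveHandles and useResource are made up for illustration, and 
the counter merely stands in for a real file handle or port.

```d
import std.stdio;

// Hypothetical resource-owning class; the static counter stands in
// for a scarce resource such as a file handle.
class Resource
{
    static int liveHandles = 0;

    this()  { ++liveHandles; }   // acquire
    ~this() { --liveHandles; }   // release
}

// A short procedure: the lifetime of the instance is the procedure body.
void useResource()
{
    auto r = new Resource();     // "auto" here is only type inference
    scope (exit) destroy(r);     // destructor runs when the scope is left

    assert(Resource.liveHandles == 1);
    // ... work with r ...
}

void main()
{
    useResource();
    // The resource was released at the end of useResource,
    // not at some later GC cycle.
    assert(Resource.liveHandles == 0);
    writeln("live handles after call: ", Resource.liveHandles);
}
```

Because the procedure is short, it is obvious where the instance stops 
being needed, and the destructor can safely touch whatever it likes since 
no GC cycle is involved.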

> Still, while the previous idea might be good, it's not optimal, 
> because we are not clearly apperceiving the problem/issue at hand. What 
> we *really* want is to directly couple the lifecycle of a component 
> (member) object with its composite (owner) object. A Node of a 
> LinkedList has the same lifecycle as its LinkedList, so Node shouldn't 
> even be an independent Garbage Collection managed element.
> 
> What we want is an allocator that allocates memory that is not to be 
> claimed by the GC (but which is to be scanned by the GC). Its behavior 
> is exactly like the allocator of 
> http://www.digitalmars.com/d/memory.html#newdelete but it should come 
> with the language and be available for all types. With usage like:
> 
>   class LinkedList {
>     ...
>     void Add(Object obj) {
>       Node node = mnew Node(blabla);
>       ...
>     }
> 
> Thus, when the destructor is called upon a LinkedList, either 
> explicitly, or by the GC, the Node references will always be valid. One 
> has to be careful now, as mnew'ed objects are effectively under manual 
> memory management, and so every mnew must have a corresponding delete, 
> lest there be dangling pointers or memory leaks. Nonetheless it seems to 
> be the only sane solution to this problem.
> 
> 
> Another interesting addition is to extend the concept of auto to class 
> members. Just as auto currently couples the lifecycle of a variable to 
> the enclosing function, an auto class member would couple the lifecycle 
> of the member to its owner object. It would get deleted implicitly when 
> the owner object got deleted. Here is another (made up) example:
> 
>   class SomeUIWidget {
>     auto Color fgcolor;
>     auto Color bgcolor;
>     auto Size size;
>     auto Image image;
>     ...
> 
> The auto members would then have to be initialized in a constructor or 
> something (the exact restrictions might vary, such as being final or not).
> 
>