More precise GC

William T. Fnk wtfmail at somewheredomain.net
Sun Mar 28 07:43:54 PDT 2010


bearophile Wrote:

> This is just a small invention.
> Enums can decrease the precision of the GC, because fields can be pointers or values, an example:
> 
> 
> enum NodeType { value, sum, product }
> 
> struct Node {
>     NodeType type;
> 
>     union {
>         double val;
>         struct {
>             Node* left, right;
>         }
>     }
> }
> 
> (Node can be used to build a simple expression tree, that can contain things like: 2.5 * 3.0 + 2.0).
> 
> In this case the GC has to use the safe strategy of always following the left and right pointers, even when they are a double.
> 
> At runtime the program usually knows what's inside the enum, if the enum truly contains pointers. This information can be inside a tag as in this example, inside a tagged pointer that points to the struct/enum, or at worst in the code that uses the struct/enum. But generally the compiler is not able to use this information, because a compiler usually doesn't understand the program semantics.
> 
> So I can think of a new optional standard class/struct/enum method, for example gcfollow() that the garbage collector recognizes when it is present and useful. It gets called with the attribute name and returns true at runtime if the the GC has to follow a pointer/reference:
> 
> 
> struct Node {
>     NodeType type;
> 
>     union {
>         double val;
>         struct {
>             Node* left, right;
>         }
>     }
> 
>     bool gcfollow(string field)() {
>         return type != NodeType.value;
>     }
> }
> 
> 
> Note that the GC will call gcfollow() (here a template for more efficiency) only with the strings of "left" or "right", because "type" and "val" attributes can't contain pointers/references.
> 
> So this gcfollow() will be instantiated only with "left" or "right", and in both cases it returns true if the node isn't a value, so the GC will not mistake the "val" double in the leaves of this expression tree as a couple of pointers. The GC will follow the left and right pointers only for the internal nodes of the tree. This slows down the GC a little but increases its precision (if gcfollow() has no bugs).
> 
> If gcfollow() is not present the GC acts as it does now, assuming the pointers are always present.
> 
> If a struct contains no pointers/references the gcfollow() is never used. If a struct contains another struct, the GC will call gcfollow() of the inner struct too if it contains pointers/references, etc.
> 
> Bye,
> bearophile

This is a rather ridiculous way of emulating algebraic data types and value-oriented programming. But then again, that kind of features might fit perfectly to D or at least provide some food for thought for further bikeshedding.



More information about the Digitalmars-d mailing list