Idea: "Explicit" Data Types

Craig Black craigblack2 at cox.net
Tue Apr 1 10:10:22 PDT 2008


Before I get into my proposal, I want to vote for stack maps to be added to 
D.  IMO, stack maps are the next logical step to making the GC faster.  They 
don't require a fundamental shift in the library like a moving GC would. 
Once stack maps are added, then perhaps the following proposal should be 
considered to glean additional GC performance.

I'm not stuck on terminology here, so if you don't like the term "explicit" 
because it's too overloaded, that's fine with me.  Pick another term.  The 
concept is what's important.  This proposal is about getting GC and explicit 
memory management to play well together.  The idea is to give the compiler 
information that allows the GC to scan less data, and hence perform better. 
Let's start with a class that uses explicit memory management.

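import std.c.stdlib;

class Foo
{
public:
    // Class allocator: take the object's storage from the C heap, not the GC heap.
    new(size_t sz) { return std.c.stdlib.malloc(sz); }
    // Class deallocator: hand the storage back to the C heap.
    delete(void* p) { std.c.stdlib.free(p); }
}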

This works fine, but doesn't tell the compiler whether data referenced by 
Foo is allocated on the GC heap or not.  If we preceded the class with some 
kind of qualifier, like "explicit", this would indicate to the compiler that 
data referenced by Foo is not allocated on the GC heap.  Note: this constraint 
can't be enforced by the compiler, but could be enforced via run-time debug 
assertions.

// "explicit" is the proposed qualifier; it is not existing D syntax.
explicit class Foo
{
public:
    new(size_t sz) { return std.c.stdlib.malloc(sz); }
    delete(void* p) { std.c.stdlib.free(p); }
}
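
To make the "run-time debug assertions" idea a little more concrete, here is 
a rough sketch of what such a check might look like.  It assumes the 
core.memory module of the current D runtime; isOutsideGCHeap is a 
hypothetical helper name, not part of any library, and nothing here is the 
proposed qualifier itself.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: true when p does not point into a GC-managed block.
bool isOutsideGCHeap(const(void)* p)
{
    // GC.addrOf returns null for memory the collector did not allocate.
    return p is null || GC.addrOf(cast(void*) p) is null;
}

void main()
{
    int* onMalloc = cast(int*) malloc(int.sizeof);
    assert(onMalloc !is null, "malloc failed");
    scope (exit) free(onMalloc);

    int* onGC = new int;

    // An "explicit" reference would have to pass this check ...
    assert(isOutsideGCHeap(onMalloc));
    // ... while a reference into the GC heap would fail it.
    assert(!isOutsideGCHeap(onGC));
}
```

A debug build could run a check like this whenever a value is stored 
through an explicit reference, and a release build would drop it.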

A problem here arises because even though Foo is allocated on the malloc 
heap, it could contain references, pointers, or arrays that touch the GC 
heap.  Thus, making Foo "explicit" also denotes that any reference, pointer 
or array contained by Foo is also explicit, and therefore does not refer to 
data on the GC heap.  Interestingly, this means that "explicit" would have 
to be transitive, like D's const.
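
For comparison, here is roughly what a malloc'd object has to do when it 
does hold references into the GC heap.  This is only a sketch, assuming the 
core.memory module of the current D runtime; Node, makeNode and freeNode are 
made-up names for illustration.  The block has to be registered with the 
collector so that it gets scanned, which is exactly the work a transitive 
"explicit" guarantee would make unnecessary.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical type: lives on the C heap but may refer to GC memory.
struct Node
{
    int[] payload;
}

Node* makeNode()
{
    auto n = cast(Node*) malloc(Node.sizeof);
    assert(n !is null, "malloc failed");
    *n = Node.init;
    // Ask the collector to scan this malloc'd block for GC pointers.
    GC.addRange(n, Node.sizeof);
    return n;
}

void freeNode(Node* n)
{
    // Unregister before freeing so the collector never scans freed memory.
    GC.removeRange(n);
    free(n);
}

void main()
{
    auto n = makeNode();
    n.payload = new int[](4); // GC-heap array referenced from the registered block
    n.payload[0] = 42;
    GC.collect();
    assert(n.payload[0] == 42);
    freeNode(n);
}
```

If Node were explicit in the proposed sense, payload could not refer to the 
GC heap at all, so neither the registration nor the scan would be needed.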

Thus, for the explicit qualifier to be useful, it must be applicable to a 
struct, class, pointer, reference, or array type.  However, it doesn't make 
sense to apply it to primitive or POD types.  If you follow my logic, you 
can see what explicit types do: they inform the compiler that no GC heap 
data will be referenced, so that the compiler can exclude explicit types 
from GC scanning.  Further, the use of explicit can be enforced via run-time 
debug assertions.  Note that there are a few implementation details that I'm 
ignoring for now for simplicity's sake.
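
As a loose analogy for "excluding from GC scanning", the current D runtime 
already lets a single GC block be marked as containing no pointers.  The 
sketch below assumes core.memory from the current D runtime; isNoScan is a 
hypothetical helper name.  It works per block rather than per type, and it 
is not transitive, so it only illustrates the kind of information the 
collector can use, not the proposal itself.

```d
import core.memory : GC;

// Hypothetical helper: does the collector skip this block when scanning?
// p must be the base address returned by a GC allocation.
bool isNoScan(void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    // Block the collector will not scan for pointers.
    void* skipped = GC.malloc(64, GC.BlkAttr.NO_SCAN);
    assert(isNoScan(skipped));

    // Default allocation: no NO_SCAN attribute, so its contents are scanned.
    void* scanned = GC.malloc(64);
    assert(!isNoScan(scanned));
}
```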

-Craig






