draft proposal for ref counting in D

Wed Oct 9 18:41:43 PDT 2013

Rainer Schuetze wrote:

On 25.06.2013 23:00, Walter Bright wrote:
 > Updated incorporating Steven's suggestion, and some comments about
 > shared/const/mutable/purity.
 >
 > -------------------------------------------------------------
 >
 >      Adding Reference Counting to D

Cool. I didn't expect this to be tackled so soon.

 >
[...]
 >
 > 3. A ref counted object is inherently a reference type, not a value type.

As Michel also said, the reference count does not have to be inside the 
object itself, so we might want to allow reference counting on other types as well.

 >
 > 4. The compiler needs to know about ref counted types.

I imagine a few (constrained) templated functions for the different operations 
defined in the library could also do the job, though it might hurt compilation 
speed. Also, getting help from the optimizer to remove redundant calls will need 
some back doors.

 >
 >
 > ==Proposal==
 >
 > If a class contains the following methods, in either itself or a base
 > class, it is an RC class:
 >
 >
 >      T AddRef();
 >      T Release();

Is T typeof(this) here?

I don't think we should force linking this functionality with COM; the 
programmer can do this with a simple wrapper.

 >
 > An RC class is like a regular D class with these additional semantics:
 >
 > 1. In @safe code, casting (implicit or explicit) to a base class that
 > does not have both AddRef() and Release() is an error.
 >
 > 2. Initialization of a class reference causes a call to AddRef().
 >
 > 3. Assignment to a class reference causes a call to AddRef() on the new
 > value followed by a call to Release() on its original value.

It might be common knowledge, but I want to point out that the usual COM 
implementation (atomic increment/decrement and free when the refcount drops to 
0) is not thread-safe for shared pointers. That means you either have to guard 
all reads and writes with a lock to make the full assignment atomic, or you have 
to implement reference counting very differently (e.g. deferred reference counting).
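
To make the race concrete, here is a minimal single-threaded replay of one bad 
interleaving. Obj, addRef, release and the freed flag are made up for this 
illustration; the flag merely stands in for the memory being handed back to the 
allocator, so the sketch is safe to run.

```d
import core.atomic;

// Hypothetical minimal RC object; names are illustrative only.
struct Obj
{
    shared int refs = 1;
    bool freed;          // stands in for "memory returned to the allocator"
}

void addRef(Obj* o)  { atomicOp!"+="(o.refs, 1); }
void release(Obj* o) { if (atomicOp!"-="(o.refs, 1) == 0) o.freed = true; }

void main()
{
    auto a = new Obj;    // refs == 1, held by the pointer g
    Obj* g = a;          // imagine g is visible to two threads

    // Thread 1 wants a local copy of g:  tmp = g; addRef(tmp);
    Obj* tmp = g;        // the load has happened, addRef has not run yet

    // Thread 2 runs in between and overwrites g:
    auto b = new Obj;
    Obj* old = g;
    g = b;
    release(old);        // count drops to 0, the object is "freed"

    // Thread 1 resumes:
    assert(tmp.freed);   // tmp already points at a dead object
    addRef(tmp);         // too late: increments a count in freed memory
}
```

Each increment and decrement is atomic on its own; what is not atomic is the 
pair "load the pointer, then AddRef it".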

 >
 > 4. Null checks are done before calling any AddRef() or Release().
 >
 > 5. Upon scope exit of all RC variables or temporaries, a call to
 > Release() is performed, analogously to the destruction of struct
 > variables and temporaries.
 >
 > 6. If a class or struct contains RC fields, calls to Release() for those
 > fields will be added to the destructor, and a destructor will be created
 > if one doesn't exist already.
 >
 > 7. If a closure is created that contains RC fields, either a compile
 > time error will be generated or a destructor will be created for it.
 >
 > 8. Explicit calls to AddRef/Release will not be allowed in @safe code.
 >
 > 9. A call to AddRef() will be added to any argument passed as a parameter.
 >
 > 10. Function returns have an AddRef() already done to the return value.
 >
 > 11. The compiler can elide any AddRef()/Release() calls it can prove are
 > redundant.
 >
 > 12. AddRef() is not called when passed as the implicit 'this' reference.
 >

Isn't this unsafe if a member function is called through the last existing 
reference and this reference is then cleared during execution of this member 
function or from another thread?
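
A minimal sketch of the single-threaded case I have in mind. Node, clearGlobal 
and the destroyed flag are hypothetical; the AddRef()/Release() calls that the 
compiler would insert under the proposal are written out by hand, and the flag 
stands in for the object actually being freed.

```d
// Hypothetical RC class; in the proposal the compiler would generate the
// AddRef()/Release() calls that are spelled out manually below.
class Node
{
    int refs = 1;
    bool destroyed;      // stands in for the object being freed
    int payload = 42;

    Node AddRef()  { ++refs; return this; }
    Node Release() { if (--refs == 0) destroyed = true; return this; }

    void run()
    {
        // Rule 12: no AddRef() was done for `this` on entry.
        clearGlobal();       // drops the last reference to this object
        assert(destroyed);   // `this` is already dead here...
        payload = 0;         // ...so this write would touch freed memory
    }
}

Node global;             // the only reference to the object

void clearGlobal()
{
    // global = null; expands to a Release() on the old value (rule 3).
    auto old = global;
    global = null;
    if (old !is null) old.Release();
}

void main()
{
    global = new Node;
    global.run();
}
```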

 > 13. Taking the address of, or passing by reference, any fields of an RC
 > object is not allowed in @safe code. Passing by reference an RC field is
 > allowed.

Please note that this includes slices to fixed size arrays.
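
For example (Buffer and leak are made-up names, and Buffer is only imagined to 
be an RC class here), slicing a fixed size array field hands out a pointer into 
the object just like taking the address of a field does:

```d
// Hypothetical class with a fixed size array field.
class Buffer
{
    int[4] data;
}

// Returns a slice that points directly into the object's memory.
int[] leak(Buffer b)
{
    return b.data[];
}

void main()
{
    auto b = new Buffer;
    int[] s = leak(b);
    assert(s.ptr is b.data.ptr);   // the slice aliases the field
    assert(s.length == 4);
}
```

If the last counted reference to b goes away, s would keep pointing into the 
released object.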

 >
 > 14. RC objects will still be allocated on the GC heap - this means that
 > a normal GC run will reap RC objects that are in a cycle, and RC objects
 > will get automatically scanned for heap references with no additional
 > action required by the user.
 >
 > 15. The class implementor will be responsible for deciding whether or
 > not to support sharing. Casting to shared is already disallowed in @safe
 > code, so this is only viable in system code.
 >
 > 16. RC objects cannot be const or immutable.

This is a bit of a downer. If the reference count is not stored within the 
object, const and immutable RC objects can be supported.

 >
 > 17. Can RC objects be arguments to pure functions?
 >
 > ==Existing Code==
 >
 > D COM objects already have AddRef() and Release(). This proposal should
 > not break that code, it'll just mean that there will be extra
 > AddRef()/Release calls made. Calling AddRef()/Release() should never have
 > been allowed in @safe code anyway.
 >
 > Any other existing uses of AddRef()/Release() will break.
 >
 > ==Arrays==
 >
 > Built-in arrays have no place to put a reference count. Ref counted
 > arrays would hence become a library type, based on a ref counted class
 > with overloaded operators for the array operations.
 >
 > ==Results==
 >
 > This is a very flexible approach, allowing for support of general RC
 > objects, as well as specific support for COM objects and Objective-C ARC.
 > AddRef()/Release()'s implementation is entirely up to the user or library
 > writer.
 >
 > @safe code can be guaranteed to be memory safe, as long as
 > AddRef()/Release() are correctly implemented.
 >

I feel I'm hijacking this proposal, but the step to library-defined read/write 
barriers seems pretty small. Make AddRef, Release and assignment free template 
functions, e.g.

void ptrConstruct(T, bool stackOrHeap)(T* adr, T p);
void ptrAssign(T, bool stackOrHeap)(T* adr, T p);
void ptrRelease(T, bool stackOrHeap)(T* adr);

and we are able to experiment with all kinds of sophisticated GC algorithms, 
including RC. Eliding redundant addref/release pairs would need some extra 
support, though. I read that LLVM does something like this, but I don't know how.
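
As a rough sketch of what a plain RC flavour of these hooks could look like: 
the Counted class is hypothetical, stackOrHeap is simply ignored, and nulling 
the slot in ptrRelease is my assumption rather than part of the signatures 
above.

```d
// Hypothetical RC class used only for this illustration.
class Counted
{
    int refs;
    void AddRef()  { ++refs; }
    void Release() { --refs; }   // a real version would free at zero
}

// Initialization of a reference: AddRef the new value (rule 2).
void ptrConstruct(T, bool stackOrHeap)(T* adr, T p)
{
    if (p !is null) p.AddRef();
    *adr = p;
}

// Assignment: AddRef the new value, then Release the old one (rule 3).
void ptrAssign(T, bool stackOrHeap)(T* adr, T p)
{
    if (p !is null) p.AddRef();
    T old = *adr;
    *adr = p;
    if (old !is null) old.Release();
}

// Scope exit or destruction of the slot (rule 5).
void ptrRelease(T, bool stackOrHeap)(T* adr)
{
    if (*adr !is null) (*adr).Release();
    *adr = null;                 // assumption: leave the slot empty
}

void main()
{
    auto a = new Counted;
    auto b = new Counted;
    Counted r;
    ptrConstruct!(Counted, true)(&r, a);   // Counted r = a;
    ptrAssign!(Counted, true)(&r, b);      // r = b;
    assert(a.refs == 0 && b.refs == 1);
    ptrRelease!(Counted, true)(&r);        // r goes out of scope
    assert(b.refs == 0 && r is null);
}
```

A different collector would supply other bodies for the same three functions, 
for instance a write barrier in ptrAssign instead of the count updates.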