-nogc

Thu Apr 23 04:49:22 PDT 2009

On 2009-04-23 06:58:38 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail at erdani.org> said:

> I've discussed something with Walter today and thought I'd share it here.
> 
> The possibility of using D without a garbage collector was always
> looming and has been used to placate naysayers ("you can call malloc if
> you want" etc.) but that opportunity has not been realized in a seamless
> manner. As soon as you concatenate arrays, add to a hash, or create an
> object, you will call into the GC.

Very true. It's pretty easy to call the GC without noticing in D.
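
To make the point concrete, here is a small check, assuming a later D compiler that has the `@nogc` function attribute (it did not exist when this message was written). Each lambda is marked `@nogc`, and `__traits(compiles, ...)` reports whether the compiler accepts it. The enum names are made up for this illustration.

```d
import core.stdc.stdlib : malloc, free;

// The three operations named in the quoted message: concatenating arrays,
// adding to a hash, and creating an object. All of them allocate from the
// GC heap, so the compiler rejects them inside a @nogc function.
enum concatOk = __traits(compiles, (int[] a, int[] b) @nogc { return a ~ b; });
enum hashOk   = __traits(compiles, (int[string] h) @nogc { h["key"] = 1; });
enum newOk    = __traits(compiles, () @nogc { return new Object; });

// Control case: the C heap is usable without touching the GC.
enum mallocOk = __traits(compiles, () @nogc { void* p = malloc(16); free(p); });

static assert(!concatOk && !hashOk && !newOk);
static assert(mallocOk);
```

None of the three rejected lines contains anything that looks like an allocation call, which is exactly why they are easy to miss.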

> So I'm thinking there should be a flag -nogc that enables a different
> model of memory allocation. Here's the steps we need to take:
> 
> 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
> ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
> superdan suggested that when he wasn't busy cursing :o).

That makes sense.
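
A minimal sketch of what such a library-side array could look like in a no-GC object.d, assuming storage from the C heap. Only the name `Array` and the rewritten form `.Array!(typeof(a))(a, b, c)` come from the proposal; the members, the growth policy, and the decision to forbid copying are assumptions made for this example, and it only handles plain value types such as `int`.

```d
import core.stdc.stdlib : realloc, free;

struct Array(T)
{
    private T* ptr;
    private size_t len, cap;

    // Copying is disabled so two Arrays can never free the same block.
    @disable this(this);

    // Typesafe variadic constructor: accepts Array!int(1, 2, 3).
    this(T[] items...)
    {
        foreach (item; items)
            this ~= item;
    }

    ~this()
    {
        free(ptr);
        ptr = null;
        len = cap = 0;
    }

    // arr ~= x, growing the block with realloc instead of the GC.
    void opOpAssign(string op : "~")(T item)
    {
        if (len == cap)
        {
            immutable newCap = cap ? cap * 2 : 4;
            auto p = cast(T*) realloc(ptr, newCap * T.sizeof);
            assert(p !is null, "out of memory");
            ptr = p;
            cap = newCap;
        }
        ptr[len++] = item;
    }

    ref inout(T) opIndex(size_t i) inout
    {
        assert(i < len, "index out of bounds");
        return ptr[i];
    }

    size_t length() const { return len; }
}

int sumOfLoweredLiteral()
{
    // What "[1, 2, 3]" would turn into under the proposed rewrite.
    auto arr = .Array!(typeof(1))(1, 2, 3);
    arr ~= 4;
    int total = 0;
    foreach (i; 0 .. arr.length)
        total += arr[i];
    return total;
}
```

The memory is released when `arr` goes out of scope, so nothing here depends on a collector.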

> 2. Do the similar thing for associative arrays.
> 
> 3. Have two object.d at hand: one is "normal" and uses garbage
> collection, the other (call it object_nogc.d) has an entirely different
> definition for arrays, hashes, and Object.

Couldn't that just be a version switch, such as `version (D_NO_GC)` and
`version (D_GC)`? Then other modules could also be implemented
differently depending on whether or not there is a GC.
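
For illustration, an ordinary module could branch on such a switch like this. `D_NO_GC` is only the identifier suggested above; no compiler defines it, so as written this code always takes the `else` branch. `makeBuffer`, `releaseBuffer` and `hasGC` are names invented for the example.

```d
version (D_NO_GC)
{
    import core.stdc.stdlib : calloc, free;

    enum bool hasGC = false;

    // Zero-filled buffer from the C heap; the caller must release it.
    int[] makeBuffer(size_t n)
    {
        auto p = cast(int*) calloc(n, int.sizeof);
        assert(p !is null || n == 0, "out of memory");
        return p[0 .. n];
    }

    void releaseBuffer(int[] buf) { free(buf.ptr); }
}
else
{
    enum bool hasGC = true;

    // Zero-filled buffer from the GC heap.
    int[] makeBuffer(size_t n) { return new int[n]; }

    // Nothing to do: the collector reclaims the block eventually.
    void releaseBuffer(int[] buf) { }
}
```

Callers use the same two functions in both configurations, which is the appeal of a version switch over a second object.d.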

> 4. The definition of Object in object_nogc.d includes a reference count
> member for intrusive refcounting.
> 
> 5. Define a Ref!(T) struct in object_nogc.d that does intrusive
> reference counting against T using ctors and dtor.
> 
> 6. At this point we already have a usable, credible no-gc offering: just
> use object_nogc.d instead of object.d and instead of "new Widget(args)"
> use "Ref!(Widget)(args)".

How's that going to work with scope classes?

	scope Widget w = new Widget;
	scope Widget w = Ref!(Widget)();
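
One way to look at the question is through a rough sketch of steps 4 to 6. This is an assumption-laden illustration, not the proposed implementation: `Counted` stands in for the `Object` of object_nogc.d, and `make` is a made-up factory used in place of the `Ref!(Widget)(args)` syntax. Because `Ref` is a struct, its destructor already runs at scope exit, so the last owner leaving scope behaves much like `scope` does.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Hypothetical stand-in for the no-GC Object: it carries the intrusive count.
class Counted
{
    size_t refCount;
}

struct Ref(T) if (is(T : Counted))
{
    private T obj;

    // Allocate with malloc and construct in place, bypassing the GC heap.
    static Ref make(Args...)(auto ref Args args)
    {
        enum size = __traits(classInstanceSize, T);
        void* mem = malloc(size);
        assert(mem !is null, "out of memory");
        Ref r;
        r.obj = emplace!T(mem[0 .. size], args);
        r.obj.refCount = 1;
        return r;
    }

    // Copying a Ref adds an owner.
    this(this)
    {
        if (obj !is null)
            ++obj.refCount;
    }

    // The last owner runs the class destructor and frees the memory.
    ~this()
    {
        if (obj !is null && --obj.refCount == 0)
        {
            destroy(obj);
            free(cast(void*) obj);
        }
        obj = null;
    }

    inout(T) get() inout { return obj; }
    alias get this;
}

class Widget : Counted
{
    static int destroyed; // counts destructor runs, for the demonstration only
    int id;
    this(int id) { this.id = id; }
    ~this() { ++destroyed; }
}

int destroyedAfterScope()
{
    Widget.destroyed = 0;
    {
        auto a = Ref!Widget.make(7);
        auto b = a; // second owner
        assert(a.refCount == 2 && b.id == 7);
    } // b, then a, go out of scope; the second one destroys the Widget
    return Widget.destroyed;
}
```

What the sketch does not answer is what `scope` should mean when other owners still exist at scope exit, which is the open part of the question.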

> 7. Add a -nogc option to the compiler. In that mode, the compiler
> replaces automatically "T" -> "Ref!(T)" and "new T(args)" ->
> "Ref!(T)(args)" for all classes T except inside
> object_nogc.d. The exception, as Walter pointed out, is to avoid
> infinite regression (how do you implement Ref if the reference you hold
> inside will also be wrapped in Ref???)

I'm just wondering, why wouldn't the compiler always use Ref!(T)? In 
the GC mode it'd simply resolve to a T, but if you wanted to experiment 
with another kind of GC -- say one which would require calling a 
notification function when writing a new value, such as the one in 
Objective-C 2.0 -- then you could.
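
As a sketch of that idea, assuming the hypothetical `D_NO_GC` identifier from earlier in this message: in GC mode `Ref` can be a plain alias, so it costs nothing. The struct in the other branch is only a placeholder so the branch parses; it is not a working reference-counted wrapper.

```d
class Widget
{
    int id = 42;
}

version (D_NO_GC)
{
    // Placeholder: a real no-GC build would put the counting wrapper here.
    struct Ref(T)
    {
        T payload;
        this(T p) { payload = p; }
        alias payload this;
    }
}
else
{
    // With a collector present, Ref!(T) is exactly T.
    alias Ref(T) = T;

    static assert(is(Ref!Widget == Widget));
}

int widgetId()
{
    Ref!Widget w = new Widget; // a plain Widget reference in GC mode
    return w.id;
}
```

A different collector, such as one needing a write notification, could supply its own `Ref` in a third branch without changing `widgetId`.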

Hmm, also, how would it work for pointers into memory blocks that are
normally managed by the GC? Would those increment the reference count
of the memory block?

> 8. Well with this all a very solid offering of D without garbage
> collection would be available at a low cost!
> 
> One cool thing is that you can compile the same application with and
> without GC and test the differences easily. That's bound to show a
> number of interesting things!

Indeed.

> A disadvantage is that -nogc must be global - you can't link a program
> that's partially built with gc and partially without. This was a major
> counter-argument to adding optional gc to C++.

Another disadvantage is that you change the reference semantics and 
capabilities. With a GC, you can create circular pointer references and 
it won't leak memory once you stop referencing them. Do that with 
reference counting and you'll have memory leaks all around.

So with no GC, you have to have weak references if you're going to 
build tree structures where branches know about their parents (which is 
a pretty common thing). I'd suggest that weak references be put in the 
language so the compiler can replace them with WeakRef!(T) in no-GC 
mode and do something else in GC mode. Being in the language would just 
be some syntactic sugar that would make them more bearable in normal 
code.
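
Here is a rough sketch of the parent/child case. Note that it swaps the intrusive count for a separate control block, since a weak reference needs somewhere to look after the object is gone. Everything except the names `Ref` and `WeakRef` is an assumption made for the example (`Control`, `make`, `get`, `Node`).

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Bookkeeping shared by every strong and weak reference to one object.
struct Control(T)
{
    T obj;         // null once the object has been destroyed
    size_t strong; // owning references
    size_t weak;   // non-owning references
}

// Owning reference.
struct Ref(T) if (is(T == class))
{
    Control!T* c;

    static Ref make(Args...)(auto ref Args args)
    {
        enum size = __traits(classInstanceSize, T);
        void* mem = malloc(size);
        auto ctl = cast(Control!T*) malloc(Control!T.sizeof);
        assert(mem !is null && ctl !is null, "out of memory");
        *ctl = Control!T(emplace!T(mem[0 .. size], args), 1, 0);
        Ref r;
        r.c = ctl;
        return r;
    }

    this(this) { if (c !is null) ++c.strong; }

    ~this()
    {
        auto ctl = c;
        c = null;
        if (ctl is null || --ctl.strong > 0) return;
        T dying = ctl.obj;
        ctl.obj = null; // weak references now see the object as gone
        ++ctl.weak;     // pin the block while the destructor runs
        destroy(dying);
        free(cast(void*) dying);
        if (--ctl.weak == 0) free(ctl);
    }

    inout(T) get() inout { return c is null ? null : c.obj; }
}

// Non-owning reference: it never keeps the object alive.
struct WeakRef(T) if (is(T == class))
{
    Control!T* c;

    this(ref Ref!T owner) { c = owner.c; if (c !is null) ++c.weak; }
    this(this) { if (c !is null) ++c.weak; }

    ~this()
    {
        auto ctl = c;
        c = null;
        if (ctl !is null && --ctl.weak == 0 && ctl.strong == 0) free(ctl);
    }

    inout(T) get() inout { return c is null ? null : c.obj; }
}

class Node
{
    static int alive;    // live Node count, for the demonstration only
    Ref!Node child;      // the parent owns its child
    WeakRef!Node parent; // the child only observes its parent
    this() { ++alive; }
    ~this() { --alive; }
}

int liveNodesAfterDroppingTree()
{
    {
        auto root = Ref!Node.make();
        root.get().child = Ref!Node.make();
        root.get().child.get().parent = WeakRef!Node(root);
        assert(Node.alive == 2);
    } // root goes out of scope; both nodes are destroyed
    return Node.alive;
}
```

If `parent` were a `Ref!Node` instead, the two nodes would hold each other's count above zero and neither destructor would ever run.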

As for compatibility, it may be worth looking at how it has been done 
in Objective-C 2.0. Objective-C has always used reference counting. 
Version 2.0 brought a GC. When you build a library, you have to specify 
whether the resulting binary expects a GC, reference counting, or can 
work in both modes. Something that works in both modes incurs a slight
overhead, but sometimes the binary compatibility is just worth it.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/