Reddit: SafeD - The Safe Subset of D

Kevin Bealer kevinbealer at gmail.com
Sun Mar 30 20:13:50 PDT 2008


== Quote from Chris Miller (lordSaurontheGreat at gmail.com)'s article
...
> > assumed for any object being collected'.  The result being that no
> > references to other collectable memory can be referenced from a
> > destructor.  Because of the collector, you don't need to either.
> > Destructors only need to manage non-collectable entities.
>
> Ah, so you're saying that reference counting is run according to scope
> relative to the sequential process line as it propagates down from
> main().   That makes sense, since those objects are no longer referenced
> by anything from the running scope, they're all garbage collectible.
> So yes, then the only solution would be to introduce a new rule to the
> garbage collector stipulating that objects with the least number of
> references from other objects in the collectable scope should be deleted
> first.  This would inherently slow down the garbage collector, thus
> presenting you with a tradeoff.
>
> What about something like scope(exit) object.doThis(); ?  Sort of like
> Tango's FileConduit?  Would that work to be able to manually force a
> ordered deletion?  It wouldn't be automatic by the garbage collector.
> In my experience a scope statement is a lot easier than the C++ way of
> carefully discovering when something is finally out of scope!  Just a
> thought.  I don't know, but it's a very interesting problem to think
> about!

Garbage collection can be done (to a degree) with reference counting, but
systems which use GC more often use an algorithm called "mark / sweep"
(or a variation of it.)  This has various advantages over reference
counting.  First, if you have circular pointers, such as blocks A and B
pointing to each other, they would never be collected in a reference
counting system; secondly, every time a reference counting system needs
to copy or overwrite a pointer, a count has to be adjusted somewhere,
and this means that a lot of algorithms run more slowly -- for example,
copying an array of pointers requires all the pointed-to blocks to be
"touched" in order to increase their refcounts.

The way mark/sweep collection works is that the program stack is scanned,
along with all global variables, and the stacks of any threads and thread
local storage, plus the actual machine registers.  This scan looks for
pointers or (in a "conservative" design) things which *might* be pointers.

Any of the blocks of (dynamically allocated) memory that is pointed to by
one of these (stack, global, or register) pointers is then considered to
also be "live".  This block is then scanned as well for pointers, and so
on.  Basically, if there is a series of pointers from a live area to a
specific block, that block is also live, recursively.  Any other area is
considered "garbage", because if you don't have a pointer to it anywhere
in the live set, you can't still be using it.

Mark/sweep doesn't have the problem of circular reference counts, but on
the other hand, there is no way to figure out which blocks were parents
of which others, so that destruction order is essentially random.  There
is no way to fix this reliably, especially since the circular links can
mean that the objects are both parents of each other -- so what order do
they get destroyed in?

In languages like D and C++, the garbage collection is conservative, and
this means that any pointer-sized block will be considered a pointer if
it contains a value that is an address of any area that might still be
live.

This means that a few "garbage" blocks can be kept around because they
are in a memory area which is pointed to by some random integer. This
also means that in practice, even for sets of memory blocks that are
not circular and might have an obvious destruction order could not be
guaranteed to be destroyed in the right order, because a random integer
in any of them might make them seem circular, so relying on any policy
that tried to detect circularity would be unreliable at best.

There are some ways to work around this though -- If object A needs
to call B.close(), then a reference to object B can be stored in a
global (or static) variable as well as in object A.  After object A's
destructor calls B.close(), then it should remove B from the global
table, thus making B garbage (B will not actually get freed until the
next GC cycle.)  (Make sure the table doesn't need to synchronize for
the "remove" step since that could cause deadlock, so an associative
array is probably a bad idea.)

Kevin


More information about the Digitalmars-d-announce mailing list