Is a moving GC really needed?
Chad J
"gamerChad\" at spamIsBad gmail.com
Mon Oct 2 14:31:28 PDT 2006
xs0 wrote:
>
> While I'm no expert, I doubt a moving GC is even possible in a systems
> language like D.
>
> First, if you move things around, you obviously need to be precise when
> updating pointers, lest all hell breaks loose. But how do you update
>
> union {
> int a;
> void* b;
> }
>
> ? While you could forbid overlap of pointers and non-pointer data, what
> about custom allocators, assembler, C libraries (including OS runtime!),
> etc.?
For the union, I might suggest a more acceptable tradeoff - mandate that
some data be inserted before every union to tell the gc which member is
selected at any moment during program execution. Whenever an assignment
is done to the union, code is inserted to update the union's status. So
your union would look more like this:
enum StructStatus
{
    a,
    b,
}
struct
{
    StructStatus status; // or size_t, whichever works
    union
    {
        int a;
        void* b;
    }
}
Now the GC can be precise with unions. Notice also the enum, which
would be nice to make available to userland - AFAIK many unions are
coded in a struct like that, so this will not be a loss in memory usage
for those cases, provided D exposes the implicit union information. At
any rate, unions seem pretty rare, so no one would notice the extra mem
usage.
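To make the idea concrete, here is a hand-written sketch of what the compiler-inserted bookkeeping might amount to. All names here (Tag, Tagged, setA, setB, holdsPointer) are made up for illustration and are not part of D or its runtime; the setters stand in for code the compiler would generate on assignment.

```d
// Hypothetical sketch: a union with a tag that is kept in sync on every write.
enum Tag { a, b }

struct Tagged
{
    Tag tag;            // which union member is currently live
    union
    {
        int   a;
        void* b;
    }

    // Each setter writes the member and records which one was written.
    void setA(int value)   { a = value; tag = Tag.a; }
    void setB(void* value) { b = value; tag = Tag.b; }

    // What a precise collector would ask before treating the
    // storage as a pointer that may need updating.
    bool holdsPointer() const { return tag == Tag.b; }
}

void main()
{
    Tagged t;
    t.setA(42);
    assert(!t.holdsPointer());

    int x;
    t.setB(&x);
    assert(t.holdsPointer());
}
```

The cost is one tag per union plus one extra store per assignment, which is the tradeoff described above.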
I'm not sure how custom allocators would mess up the GC; I thought they
were on their own anyway. If a pointer refers to something outside the
GC heap, the GC doesn't update it, collect its target, or move
anything.
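As a rough illustration of "outside the GC heap", the sketch below uses core.memory from today's druntime, which postdates this message, so treat it as an assumption about a later API rather than what was available at the time. GC.addrOf returns null for memory the collector did not allocate.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

void main()
{
    // Memory from the C heap: the collector knows nothing about it.
    void* c = malloc(64);
    scope (exit) free(c);

    // Memory from the GC heap.
    void* g = GC.malloc(64);

    assert(GC.addrOf(c) is null); // not a GC block, so nothing to move
    assert(GC.addrOf(g) is g);    // base address of the GC block
}
```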
Assembler is a bit tricky, maybe someone smarter than I can handle it
better, but here's a shot with some pseudo-asm:
struct Foo
{
    int member1;
    int member2;
}
Foo bar;
...
Foo* foo = &bar;
int extracted;
// foo spotted in the assembly block, never mind the context
// as such, foo gets pinned.
asm
{
    mov EAX, foo;       // EAX = foo;
    add EAX, 4;         // EAX += 4; (offset of member2)
    mov EAX, [EAX];     // EAX = *EAX; x86 has no memory-to-memory mov
    mov extracted, EAX; // extracted = foo.member2;
}
// foo is unpinned here
As for C libraries, it seems like the same thing as custom allocators.
The C heap is outside of the GC's jurisdiction and won't be moved or
manipulated in any way. C code that handles D objects will have to be
careful, and the calling D code will have to pin the objects before they
go out into the unknown.
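A minimal sketch of the "pin before handing to C" pattern, again assuming the core.memory module from later druntime versions (not something available when this was written). GC.addRoot keeps the block alive while C holds the only reference, and BlkAttr.NO_MOVE asks that the block not be relocated; whether a given collector honors or needs NO_MOVE is an assumption here. strlen just stands in for an arbitrary C function.

```d
import core.memory : GC;
import core.stdc.string : memcpy, strlen;

void main()
{
    // Allocate from the GC heap: no pointers inside, and not to be moved.
    auto p = cast(char*) GC.malloc(6, GC.BlkAttr.NO_SCAN | GC.BlkAttr.NO_MOVE);
    memcpy(p, "hello\0".ptr, 6);

    // Register the block as a root for the duration of the C call,
    // and unregister it once the C side is done with it.
    GC.addRoot(p);
    scope (exit) GC.removeRoot(p);

    // The C function sees a stable address.
    assert(strlen(p) == 5);
}
```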
>
> On the bright side, I believe there's considerably less need to
> heap-allocate in D than, say, in Java, and even when used, one can
> overcome a bad(slow) GC in many cases (with stuff like malloc/free,
> delete, etc.), so the performance of GC is not as critical.
structs are teh rulez.
I'm still not comfortable with manual memory management in D though,
mostly because the standard lib (phobos) is built with GC in mind and
will probably leak the hell out of my program if I trust it too far.
Either that or I have to roll my own functions, which sucks, or I have
to be stuck with std.c which also sucks because it's not nearly as nice
as phobos IMO.
Mostly I agree with this though.
Also, I wonder, if I were to make a tool that does escape analysis on
your program, then finds that a number of classes can either be stack
allocated or safely deleted after they reach a certain point in the
code, then would this change the effectiveness of a generational GC?
Perhaps part of why young objects die so often is because they are
temporary things that can often be safely deleted at the end of scope or
some such.
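For reference, this is roughly what such a tool would be rewriting code into, done by hand. It is a sketch using D's scope storage class for class instances; Temp and its live counter are hypothetical names used only to observe when the destructor runs.

```d
class Temp
{
    static int live;      // number of instances currently alive
    this()  { ++live; }
    ~this() { --live; }
}

void work()
{
    // A scope class instance is destroyed when the enclosing scope
    // ends, so the collector never has to trace or reclaim it.
    scope t = new Temp;
    assert(Temp.live == 1);
}

void main()
{
    work();
    assert(Temp.live == 0); // destructor already ran at the end of work()
}
```

If many short-lived objects were handled like this, fewer of them would ever reach the young generation in the first place.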
>
> If the compiler/GC were improved to differentiate between atomic and
> non-atomic data (the latter contains pointers to other data, the first
> doesn't), so memory areas that can't contain pointers don't get scanned
> at all*, I think I'd already be quite happy with the state of things..
>
>
> xs0
>
> *) that may already be the case, but last time I checked it wasn't :)
I'd love this optimization. It doesn't seem too horribly hard to do
either. The GC needs a new heap and a new allocation function and the
compiler needs to be trained to use the new allocation function.
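A sketch of what the "new allocation function" could look like from the user's side, assuming the GC.malloc and BlkAttr.NO_SCAN names from later versions of druntime (core.memory); they are not something I can point to in the runtime discussed here. The buffer holds only bytes, so the collector is told it never needs to scan it for pointers.

```d
import core.memory : GC;

void main()
{
    enum n = 1024;

    // Pointer-free ("atomic") data: allocate it with NO_SCAN so the
    // collector skips its contents during marking.
    auto raw = cast(ubyte*) GC.malloc(n, GC.BlkAttr.NO_SCAN);
    ubyte[] pixels = raw[0 .. n];
    pixels[] = 0;

    // The attribute is recorded on the block itself.
    assert((GC.getAttr(raw) & GC.BlkAttr.NO_SCAN) != 0);
}
```

The compiler side of the optimization would then be choosing this path automatically whenever the allocated type contains no pointers.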