Is a moving GC really needed?

Wed Oct 4 02:47:06 PDT 2006

Chad J > wrote:
> xs0 wrote:
>>
>> While I'm no expert, I doubt a moving GC is even possible in a systems 
>> language like D.
>>
>> First, if you move things around, you obviously need to be precise 
>> when updating pointers, lest all hell breaks loose. But how do you update
>>
>> union {
>>     int a;
>>     void* b;
>> }
>>
>> ? While you could forbid overlap of pointers and non-pointer data, 
>> what about custom allocators, assembler, C libraries (including OS 
>> runtime!), etc.?
> 
> For the union, I might suggest a more acceptable tradeoff - mandate that 
> some data be inserted before every union to tell the gc which member is 
> selected at any moment during program execution.  Whenever an assignment 
> is done to the union, code is inserted to update the union's status.  So 
> your union would look more like this:
> 
> enum StructStatus
> {
>   a,
>   b,
> }
> 
> struct
> {
>   StructStatus status; //or size_t, whichever works
> 
>   union
>   {
>     int a;
>     void* b;
>   }
> }

Well, you can't simply mandate that, I think.. It can be the case that 
the exact structure of a struct/union is mandated by requirements 
external to the application..

And, even if it was an option, it doesn't really solve the precision 
issue. To be precise, the GC must know the internal structure of 
anything on its heap, which I don't think is possible unless the 
language was severely restricted or it was required that the programmer 
supplies that information manually whenever the purpose of a part of 
memory changes from pointer to non-pointer and vice-versa. Both options 
suck quite a lot.

> Not sure how custom allocators mess up the GC, I thought these were just 
> on their own anyways.  If a pointer to something is outside of the GC 
> heap, the GC doesn't bother changing it or collecting it or moving 
> anything.

That's true, but a custom allocator can use the GC heap if it wants. For 
example, if I know I'll be allocating a million linked list nodes, and 
will then release them all at once, it can be considerably faster to 
allocate a large byte[] and use chunks of it for my nodes. But how can 
the GC tell which of those bytes are pointers and which aren't? At the 
beginning none are, but later some may be..

> Assembler is a bit tricky, maybe someone smarter than I can handle it 
> better, but here's a shot with some psuedoasm:
> 
> struct Foo
> {
>   int member1;
>   int member2;
> }
> Foo bar;
> 
> ...
> 
> Foo* foo = &bar;
> int extracted;
> // foo spotted in the assembly block, never mind the context
> // as such, foo gets pinned.
> asm
> {
>   mov EAX, foo;         // EAX = foo;
>   add EAX, 4;           // EAX += 4;
>   mov extracted, [EAX]; // extracted = *EAX; or extracted = foo.member2;
> }
> // foo is unpinned here

I'm sure simple cases can be detected, but am also fairly certain it's 
impossible to generally determine which registers hold pointers and 
which don't, and/or what will get referenced and what won't.

> As for C libraries, it seems like the same thing as custom allocators. 
> The C heap is outside of the GC's jurisdiction and won't be moved or 
> manipulated in any way.  C code that handles D objects will have to be 
> careful, and the callee D code will have to pin the objects before the 
> go out into the unknown.

I meant precision again, not C's heap. Even if compiled D code somehow 
notifies the GC what's on the stack, the C code definitely won't. Then, 
how can the GC tell which parts of the stack point to its own heap, and 
which are just integers that happen to look like they could point to the 
heap?

> Also, I wonder, if I were to make a tool that does escape analysis on 
> your program, then finds that a number of classes can either be stack 
> allocated or safely deleted after they reach a certain point in the 
> code, then would this change the effectiveness of a generational GC? 
> Perhaps part of why young objects die so often is because they are 
> temporary things that can often be safely deleted at the end of scope or 
> some such.

Well, I think it's quite hard to guess whether the result would be 
better or worse, speed-wise. Obviously, allocating those objects on the 
stack would speed things up, while using generations in the GC would be 
less effective, because the GC would see less short-lived objects. I've 
no idea, however, how those two effects would combine.. Most probably it 
depends very much on the actual application (as do most things ;)

xs0