Is a moving GC really needed?
xs0
xs0 at xs0.com
Mon Oct 2 04:44:31 PDT 2006
Lionello Lunesu wrote:
> I've noticed that some design decisions are made with the possibility of
> a moving GC in mind. Will D indeed end up with a moving GC? If so, why?
> Is a moving GC really worth the extra trouble?
>
> Being able to move memory blocks reduces memory fragmentation, am I
> correct? Is this the only reason? (For the remainder of this post, I'm
> assuming it is.)
>
> I've experienced the problems of memory fragmentation first hand. In the
> project I'm working on (3D visualization software) I've had to track
> out-of-memory errors, which turned out to be because of virtual memory
> fragmentation. At some point, even a malloc/VirtualAlloc (the MS CRT's
> malloc directly calls VirtualAlloc for big memory blocks) for 80MB
> failed. Our problems were resolved by reserving a huge block (~1GB) of
> virtual memory at application start-up, to prevent third-party DLLs from
> fragmenting the virtual address space.
>
> One of the reasons we ran into problems with memory fragmentation was
> that Windows is actually only using 2GB of virtual address space. Using
> Windows Address Extension (a flag passed to the linker), however, it is
> possible to get the full 4GB of virtual address space available. That's
> an extra 2GB of continuous virtual address space! In the (near) future
> we'll have 2^64 bytes of virtual address space, which "should be enough
> for anyone".
>
> Is the extra complexity and run-time overhead of a moving GC worth the
> trouble, at this point in time?
While I'm no expert, I doubt a moving GC is even possible in a systems
language like D.
First, if you move things around, you obviously need to be precise when
updating pointers, lest all hell break loose. But how do you update
    union {
        int a;
        void* b;
    }
? While you could forbid overlap of pointers and non-pointer data, what
about custom allocators, assembler, C libraries (including OS runtime!),
etc.?
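To make the union problem concrete, here is a minimal sketch. Word, looksLikeHeapPointer and the 64-byte block are names and values made up for illustration, and size_t stands in for the int above (an assumption) so that both members are pointer-sized on 64-bit targets too. It only shows that a pointer and an integer with the same bits are indistinguishable to a scanner that has no type information.

```d
import std.stdio;

// Variant of the union from the post. size_t replaces int (an assumption
// made here) so both members are pointer-sized on 32- and 64-bit targets.
union Word
{
    size_t a;
    void* b;
}

// Hypothetical check a collector without type information could apply:
// does the raw value fall inside the address range of a heap block?
bool looksLikeHeapPointer(size_t word, size_t heapLo, size_t heapHi)
{
    return word >= heapLo && word < heapHi;
}

void main()
{
    auto block = new ubyte[](64);
    immutable lo = cast(size_t) block.ptr;
    immutable hi = lo + block.length;

    Word realPointer;
    realPointer.b = block.ptr + 16;   // a genuine interior pointer

    Word plainInteger;
    plainInteger.a = lo + 16;         // an integer that merely has the same bits

    // The two words are bit-for-bit identical, so the range check
    // gives the same answer for both.
    writeln(realPointer.a == plainInteger.a);
    writeln(looksLikeHeapPointer(realPointer.a, lo, hi));
    writeln(looksLikeHeapPointer(plainInteger.a, lo, hi));
}
```

If the collector moved the block and rewrote both words, it would silently change the integer; if it rewrote neither, the real pointer would dangle.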
And second, for the generational case, you need an efficient way to
track references from older objects to newer objects, otherwise you need
to scan them all, defeating the point of having generations in the first
place. While a JIT-compiled language/runtime can relatively easily (and
efficiently) do this by injecting appropriate code into older objects, I
think it's practically impossible to do so with native code.
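For reference, the tracking in question is usually done with a write barrier that feeds a "remembered set". The following is a toy sketch of that idea only, not of any actual D runtime mechanism: Obj, storeField and remembered are all hypothetical names, and the "old" flag stands in for whatever a real collector would use to tell generations apart.

```d
import std.stdio;

// Toy object model; every name here is hypothetical.
class Obj
{
    bool old;     // true once the object counts as "old generation"
    Obj field;    // a single reference slot
}

// Remembered set: old objects that may refer to young ones.
bool[Obj] remembered;

// Write barrier: a reference store that also records old-to-young edges.
void storeField(Obj holder, Obj value)
{
    holder.field = value;
    if (holder.old && value !is null && !value.old)
        remembered[holder] = true;
}

void main()
{
    auto oldObj = new Obj;
    oldObj.old = true;
    auto young = new Obj;

    storeField(oldObj, young);      // recorded by the barrier
    writeln(remembered.length);

    auto other = new Obj;
    oldObj.field = other;           // a plain store bypasses the barrier
    writeln(remembered.length);
}
```

A young-generation collection then only needs to scan the objects in remembered instead of every old object. The catch is visible in main: any store that does not go through storeField, for instance from a C library or inline assembler, is never recorded, and every store that does go through it pays for the extra test.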
I've no idea how to overcome those without involving the end-user
(programmer) and/or losing quite a lot of speed during normal operation,
which I'm quite sure are not acceptable trade-offs.
On the bright side, I believe there's considerably less need to
heap-allocate in D than, say, in Java, and even when the heap is used,
one can work around a bad (slow) GC in many cases (with stuff like
malloc/free, delete, etc.), so the performance of the GC is not as critical.
If the compiler/GC were improved to differentiate between atomic and
non-atomic data (the latter contains pointers to other data, the former
doesn't), so memory areas that can't contain pointers don't get scanned
at all*, I think I'd already be quite happy with the state of things.
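A minimal sketch of that atomic/non-atomic distinction, assuming a druntime that provides the core.memory module (which postdates this post). GC.malloc, GC.getAttr and GC.BlkAttr.NO_SCAN are part of that module; isScanned is a hypothetical helper added here for illustration.

```d
import core.memory : GC;
import std.stdio;

// Hypothetical helper: true if the collector will scan the block's
// contents for pointers, i.e. the NO_SCAN attribute is not set.
bool isScanned(void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) == 0;
}

void main()
{
    // "Atomic" block: raw bytes only, flagged so its contents are skipped.
    void* pixels = GC.malloc(1024, GC.BlkAttr.NO_SCAN);

    // "Non-atomic" block: may hold pointers, so it keeps the default
    // behaviour and is scanned.
    void* table = GC.malloc(1024);

    writeln(isScanned(pixels));
    writeln(isScanned(table));
}
```

Large buffers of plain data (images, audio, vertex arrays) are the obvious candidates for the first kind of block, since scanning them costs time and can turn stray bit patterns into false references.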
xs0
*) that may already be the case, but last time I checked it wasn't :)