D vs Java

Sean Kelly sean at f4.ca
Wed Mar 22 11:42:13 PST 2006


Matthias Spycher wrote:
> Sean Kelly wrote:
>> Matthias Spycher wrote:
> ...
>>> True, but accurate garbage collection is a requirement if you're 
>>> going to scale to support large, long-running applications. C-pointer 
>>> functionality eliminates the potential. The D community might (in the 
>>> future) consider the introduction of a managed D subset that would 
>>> make accurate GC possible.
>>
>> The D standard doesn't have any language that prevents this.  I think 
>> it would be quite possible to implement an incremental GC in D if one 
>> had control over code generation.
> 
> Are you suggesting whole-program analysis? Would one have to compile a 
> certain way ahead-of-time to support a particular kind of GC? If yes, do 
> you think that's practical? How would you deal with shared libraries 
> that may be dynamically loaded -- fall back to a conservative GC?

Incremental garbage collection requires compiler support, and it does 
basically involve whole-program analysis.  For those unfamiliar with 
incremental GC, the way it works (even in Java) is for the compiler to 
inject code around all pointer modifications which signals the GC that 
the pointer has changed.  This allows the GC to know which memory blocks 
to rescan later.  The problem with incremental GCs is that it's 
difficult to guarantee that the GC will make progress, as frequent 
pointer changes could force the GC to start over repeatedly (I read one 
paper that did describe an incremental GC with a progress guarantee, but 
such implementations are definitely not the norm).  Also, the code 
injection makes pointer modifications far more expensive--from 10 to 100 
instructions 
depending on the situation (if my memory serves me).  However, this 
slower average performance is counterbalanced by the elimination of 
"stop the world" collections.  I can see this being a very reasonable 
approach for some realtime applications, but perhaps not as a general 
purpose strategy.
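
As a rough illustration of the injected code described above, here is a
minimal sketch in C of one common form of write barrier, card marking.
The names (heap, dirty, write_barrier, take_dirty_cards) and the sizes
are made up for the example and are not taken from any real D or Java
runtime; a real barrier would be emitted by the compiler rather than
called by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: a small heap of pointer slots split into cards. */
#define HEAP_WORDS 1024
#define CARD_WORDS 64
#define NUM_CARDS  (HEAP_WORDS / CARD_WORDS)

static void *heap[HEAP_WORDS];          /* each slot may hold a pointer  */
static unsigned char dirty[NUM_CARDS];  /* 1 = card must be rescanned    */

/* What the compiler would emit in place of a plain `*slot = value`. */
static void write_barrier(void **slot, void *value)
{
    size_t index = (size_t)(slot - heap);
    assert(index < HEAP_WORDS);         /* slot must lie inside the heap */
    *slot = value;                      /* the store itself              */
    dirty[index / CARD_WORDS] = 1;      /* tell the GC this card changed */
}

/* Collector side: clear the dirty flags and report how many cards
   would have to be rescanned in the next increment. */
static size_t take_dirty_cards(void)
{
    size_t count = 0;
    for (size_t i = 0; i < NUM_CARDS; i++) {
        if (dirty[i]) {
            count++;
            dirty[i] = 0;
        }
    }
    return count;
}
```

Every pointer store pays for the extra index computation and flag write,
which is where the added cost per modification comes from.  If the
program keeps dirtying cards faster than the collector can rescan them,
take_dirty_cards never reaches zero, which is the progress problem
mentioned above.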

There are a few odd alternatives out there, such as Boehm's "mostly 
parallel" collector, which marks memory blocks as read-only using VMM 
calls during 
its initial scanning phase (which doesn't stop the world), and then has 
a much shorter "stop the world" phase later on.  However, these memory 
page manipulations are very expensive, so this approach is probably not 
optimal in most situations.

I think an incremental GC could be done for a subset of D without too 
much trouble (i.e. any code that uses plain old reference manipulation), 
but being able to pass pointers and arrays of pointers to C functions is 
a stumbling block.  An incremental GC may still be possible, but it 
would need to be over-conservative in how pointer changes are tracked. 
Basically, any time pointer data is passed to an opaque function the 
compiler would have to assume that it could be modified and signal 
the GC appropriately.
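
A minimal sketch in C of what that over-conservative tracking might look
like: since the compiler cannot see inside the opaque function, it marks
the whole range it handed over as needing a rescan.  The names (managed,
needs_rescan, mark_range_dirty, sort_pointers) are hypothetical, qsort
merely stands in for an arbitrary C function, and the sketch assumes the
collector only runs between mutator steps, so signalling after the call
returns is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOTS      256
#define CARD_SLOTS 32
#define CARDS      (SLOTS / CARD_SLOTS)

static void *managed[SLOTS];            /* GC-managed pointer slots    */
static unsigned char needs_rescan[CARDS];

/* Mark every card overlapping slots [first, first + count). */
static void mark_range_dirty(void **first, size_t count)
{
    if (count == 0)
        return;
    size_t begin = (size_t)(first - managed);
    assert(begin < SLOTS && count <= SLOTS - begin);
    size_t last = begin + count - 1;
    for (size_t c = begin / CARD_SLOTS; c <= last / CARD_SLOTS; c++)
        needs_rescan[c] = 1;
}

/* Comparison callback: order pointers by address. */
static int by_address(const void *a, const void *b)
{
    uintptr_t x = (uintptr_t)*(void *const *)a;
    uintptr_t y = (uintptr_t)*(void *const *)b;
    return (x > y) - (x < y);
}

/* What the compiler might emit around a call that passes pointer data
   to a function it cannot analyze. */
static void sort_pointers(void **first, size_t count)
{
    qsort(first, count, sizeof *first, by_address); /* opaque to the GC */
    mark_range_dirty(first, count);     /* assume every slot changed   */
}
```

The cost is that the whole range is rescanned even if the callee changed
nothing, which is the price of not knowing what the C side did.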

The more likely situation is that D will always use a conservative GC 
"with hints," so type information is used whenever possible to focus the 
scan phase, but untyped memory allocations will be scanned by default, 
though the user could indicate that this is not necessary.  In Ares, I'm 
going to eventually add parameters to malloc/calloc/realloc for this purpose.
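
To make the idea of an allocation hint concrete, here is a small sketch
in C.  It is not the Ares interface; gc_malloc_hinted, GC_SCAN,
GC_NO_SCAN and bytes_to_scan are invented for the example.  Each block
carries a flag, and the scan phase skips blocks the caller has declared
pointer-free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical hint values passed at allocation time. */
enum { GC_SCAN = 0, GC_NO_SCAN = 1 };

/* Header stored in front of every block.  The payload that follows is
   aligned for pointer-sized data only; a real allocator would pad it. */
typedef struct block {
    struct block *next;
    size_t size;
    unsigned flags;
} block;

static block *all_blocks;               /* list of every live block    */

/* Allocate `size` bytes and remember whether they may hold pointers. */
static void *gc_malloc_hinted(size_t size, unsigned flags)
{
    if (size > SIZE_MAX - sizeof(block))
        return NULL;                    /* size + header would overflow */
    block *b = malloc(sizeof *b + size);
    if (b == NULL)
        return NULL;
    b->next = all_blocks;
    b->size = size;
    b->flags = flags;
    all_blocks = b;
    return b + 1;                       /* payload starts after header */
}

/* How many payload bytes a conservative scan would have to visit. */
static size_t bytes_to_scan(void)
{
    size_t total = 0;
    for (block *b = all_blocks; b != NULL; b = b->next) {
        if (!(b->flags & GC_NO_SCAN))
            total += b->size;
    }
    return total;
}
```

With this arrangement a large buffer of raw bytes (image data, say)
allocated with GC_NO_SCAN adds nothing to the scan, while untyped
allocations default to being scanned.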


Sean


