Quantifying the Performance of Garbage Collection vs. Explicit

Dan murpsoft at hotmail.com
Mon Mar 17 16:59:22 PDT 2008


BCS Wrote:

> Craig Black wrote:
> > 
> > If I understand the article correctly, I think you are missing the main 
> > point of it.  It is not merely saying ,"We ran conclusive test: GC loses 
> > to explicit memory management!"  They went much further than that, and 
> > identified the primary bottleneck in modern GC implementations.  Namely, 
> > that GC doesn't cooperate well with virtual memory.  GC typically scans 
> > way more memory than the rest of the application, and consequently 
> > causes the most cache misses/page faults.  This article is not bashing 
> > GC.  I think it is indicating to us how we can best improve it.  Perhaps 
> > some GC algorithm could be developed that is optimized specifically to 
> > produce the fewer page faults and cache misses.

I typically use GC, but when I don't it gets linked anyways.  Bloat.

As for preventing cache/page misses...

Some blocks will be in cache (a), some pages will be in memory (b), and some will be on swap (c).
Set (b) and set (c) can be differentiated by OS accounting.  
(a) is a subset of (b) does not overlap (c).
Set (a) and set (b) cannot be differentiated by any explicit accounting.

- Any system which allowed you to not analyze memory stored on swap will definitely improve performance.

- Performing a cache miss causes the requested data not to be available for a guestimated 100 cycles.  Prefetching batches of memory that you want to analyze, and analyzing it as it actually arrives means you could dramatically reduce cache miss overhead.

- Not allowing pointers to be implicitly cast from other types might allow you to store pointers in the GC to all pointers, and store simple patterns for arrays of structs containing pointers and arrays of pointers.  This could let you skip any pages that clearly don't have pointer information.

This would become sensible if pointers were treated as a typedef'd fully qualified integer type (with shift, add etc working on them), and an alias of '*'.  As we found with C, allowing one type to be used to mean two different things is bad, so I would recommend a p32 and a p64.

In fact, causing all pointers and array {ptr,length} structs to be colocated where possible in a program would *dramatically* improve GC performance as only a few pages would ever need to be brought into cache during a GC phase - the same that would probably already be in cache during normal program operation.  

This is probably already true; but the effect would be more dramatic if the GC didn't need to analyze the contents of a pointer because it knew the contents themselves didn't contain another pointer?

Best Regards,
Dan



More information about the Digitalmars-d mailing list