The problem with the D GC

Sean Kelly sean at f4.ca
Tue Jan 9 08:06:21 PST 2007


Bill Baxter wrote:
> Here's a slightly less contrived version of Oskar's gc test.
> 
> import std.math;
> import std.random;
> import std.stdio;
> 
> void main() {
>     // The real memory use, ~40 mb
>     double[] data;
>     data.length = 5_000_000;
>     foreach(i, inout x; data) {
>         x = sin(cast(double)i/data.length);
>         //x = 1;
>     }
>     int count = 0;
>     int gcount = 0;
>     while(1) {
>         // simulate reading a few kb of data
>         double[] incoming;
>         incoming.length = 1000 + rand() % 5000;
>         foreach(i, inout x; incoming) {
>             x = sin(cast(double)i/incoming.length);
>             //x = 5;
>         }
>         // do something with the data...
> 
>         // print status message every so often
>         count += incoming.length;
>         if (count > 1_000_000) {
>             count = 0;
>             gcount++;
>             writefln("%s processed", gcount);
>         }
>     }
> }
> 
> 
> 
> This one uses doubles instead of uints and the data is the sin of some 
> number.  These are _very_ realistic values for numeric data to have. The 
> same effect can be seen.  Instead of hovering around 40MB, the memory 
> use grows and grows and performance slows and slows.
> 
> This seems to be a very big issue.  The GC seems to be pretty much 
> useless right now if you're going to have a lot of floating point data 
> in your app.

For what it's worth, I ran the test above with the modified GC in Tango, 
for 10000 iterations of the "while(1)" loop.  The default behavior 
roughly matched Phobos: an 89 second run time, with over 340MB of 
memory consumed and still growing steadily.  Then I told the GC not to 
scan the arrays using the following calls:

     gc.setAttr( data.ptr, GC.BlkAttr.NO_SCAN );
     gc.setAttr( incoming.ptr, GC.BlkAttr.NO_SCAN );

With these changes in place, the run time dropped to 7 seconds, with 
memory use holding steady at 43MB.
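
For readers who want to try the same idea today, here is a minimal sketch 
against present-day druntime rather than the 2007 Tango GC used above.  It 
assumes core.memory's GC.malloc, GC.getAttr and GC.BlkAttr.NO_SCAN; the 
helper name allocNoScan is hypothetical and not part of either library.  
The block is allocated with the flag already set, instead of setting it 
afterwards on an existing array.

```d
import core.memory : GC;
import std.math : sin;
import std.stdio : writeln;

// Hypothetical helper (not from the post): allocate n doubles in a GC block
// flagged NO_SCAN, so the collector never treats the contents as pointers.
double[] allocNoScan(size_t n)
{
    auto p = cast(double*) GC.malloc(n * double.sizeof, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

void main()
{
    // Same shape as the test's large buffer, about 40MB of doubles.
    auto data = allocNoScan(5_000_000);
    foreach (i, ref x; data)
        x = sin(cast(double) i / data.length);

    // Report whether the NO_SCAN bit is set on the block.
    writeln((GC.getAttr(data.ptr) & GC.BlkAttr.NO_SCAN) != 0);
}
```

Memory returned by GC.malloc is not initialized, so the sketch fills every 
element before reading any of them.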

I grant that this isn't quite as nice as if D just figured out whether 
to scan the block using TypeInfo, but at least it gives the programmer 
a way to customize GC behavior somewhat to tune application performance. 
The only caveat with the current implementation is that block 
attributes are not preserved when an array is resized.  This isn't 
terribly difficult to fix, but I haven't done so yet.
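
One way to sidestep the resize caveat is to pass the attribute again on 
every resize rather than rely on it carrying over.  The sketch below 
assumes present-day druntime (core.memory's GC.realloc with an explicit 
attribute argument), not the Tango GC described above, and the helper name 
resizeNoScan is hypothetical.

```d
import core.memory : GC;

// Hypothetical helper: resize a block and pass NO_SCAN again explicitly
// instead of relying on the old block's attributes carrying over.
// Assumes arr is null or starts at the head of a block obtained from
// GC.malloc or GC.realloc (not an arbitrary slice).
double[] resizeNoScan(double[] arr, size_t n)
{
    auto p = cast(double*) GC.realloc(arr.ptr, n * double.sizeof,
                                      GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

void main()
{
    // A null input makes GC.realloc behave like a fresh allocation.
    double[] incoming = resizeNoScan(null, 1000);
    incoming[] = 0.0;

    // Grow the buffer; the first 1000 values are kept, the rest is
    // uninitialized until written.
    incoming = resizeNoScan(incoming, 6000);
    incoming[1000 .. $] = 0.0;
    assert(incoming.length == 6000);
}
```

In the test loop this would replace the plain "incoming.length = ..." 
assignment, at the cost of managing the buffer by hand.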


Sean


