O(N) Garbage collection?

Ulrik Mikaelsson ulrik.mikaelsson at gmail.com
Sat Feb 19 09:50:12 PST 2011


Just a thought; I guess the references to the non-GC-scanned strings
are held in GC-scanned memory, right? Are the number of such
references also increased linearly?

I'm not a GC-expert, but if so, wouldn't that pretty much force the GC
to do at least one follow-up of every reference, before realizing it's
pointing to non-GC memory? That COULD explain the linear increase.

That said, I too feel some improvements in the border between GC and
other resource-managements methods are needed. The prospect of good
GC/non-GC combinations was what drew me here in the first place. I
would welcome some clear language and runtime-support for
non-GC-memory, such as frameworks for ref-counting and
tree-allocation, and well-defined semantics of object lifetime both in
GC and non-GC mode. I've mostly sorted out the kinks of the myself (I
think), but it's proving to be _very_ difficult, mostly undocumented,
and often appears to be an afterthought rather than by-design.

Regards
/ Ulrik

2011/2/19 dsimcha <dsimcha at yahoo.com>:
> I've been trying out D's new 64-bit compiler and a serious barrier to using
> it effectively seems to be abysmal garbage collection performance with large
> heaps. It seems like the time for a garbage collection to run scales
> linearly with the size of the heap *even if most of the heap is marked as
> NO_SCAN*.  I'm running a program with a heap size of ~6GB, almost all of
> which is strings (DNA sequences), which are not scanned by the GC.  It's
> spending most of its time in GC, based on pausing it every once in a while
> in GDB and seeing what's at the top of the stack.
>
> Here's a test program and the results for a few runs.
>
> import std.stdio, std.datetime, core.memory, std.conv;
>
> void main(string[] args) {
>    if(args.length < 2) {
>        stderr.writeln("Need size.");
>        return;
>    }
>
>    immutable mul = to!size_t(args[1]);
>    auto ptr = GC.malloc(mul * 1_048_576, GC.BlkAttr.NO_SCAN);
>
>    auto sw = StopWatch(autoStart);
>    GC.collect();
>    immutable msec = sw.peek.msecs;
>    writefln("Collected a %s megabyte heap in %s milliseconds.",
>        mul, msec);
> }
>
> Outputs for various sizes:
>
> Collected a 10 megabyte heap in 1 milliseconds.
> Collected a 50 megabyte heap in 4 milliseconds.
> Collected a 200 megabyte heap in 16 milliseconds.
> Collected a 500 megabyte heap in 41 milliseconds.
> Collected a 1000 megabyte heap in 80 milliseconds.
> Collected a 5000 megabyte heap in 397 milliseconds.
> Collected a 10000 megabyte heap in 801 milliseconds.
> Collected a 30000 megabyte heap in 2454 milliseconds.
> Collected a 50000 megabyte heap in 4096 milliseconds.
>
> Note that these tests were run on a server with over 100 GB of physical RAM,
> so a shortage of physical memory isn't the problem.  Shouldn't GC be O(1)
> with respect to the size of the unscanned portion of the heap?
>


More information about the Digitalmars-d mailing list