The problem with the D GC
Oskar Linde
oskar.lindeREM at OVEgmail.com
Tue Jan 9 11:09:15 PST 2007
Oskar Linde wrote:
> After having fought a while with D programs with runaway memory leaks,
> I've unfortunately had to come to the conclusion that the D GC is not
> ready for production use. The problem is what I'd call "spurious
> pointers". That is random data (strings, numbers, image data, audio or
> whatever) appearing to the GC to be full of pointers to all over the
> memory space.
I feel kind of bad for making it sound like this is a problem related
specifically to the D garbage collector. It is rather a general and
apparently well-known problem of all conservative garbage collectors.
The D garbage collector is still excellent for a large domain of problems.
Lots of people seem to be having similar problems though, so a better
understanding of the conditions under which a conservative garbage
collector will and will not work seems to be called for.
I'd guess that a lot of the problems we have with the conservative GC are
because we are at the end of the lifetime of the 32-bit address space.
As time has passed, our programs have used a larger and larger portion
of the available memory space. As one anonymous poster said, one solution
would be "increasing the amount of memory needed for pointers by the
factor two", which I interpret as moving to a 64-bit memory space.
I did some calculations to better understand what is going on. I've
probably made several errors, though.
<theoretical rambling>
The probability of a single uniformly random spurious pointer
preventing the collection of an object of size s is
p = s/B,
with B being the size of the address space (2^b for b = 32 or 64 bits).
On a 64-bit architecture, the probability is obviously drastically reduced.
If g is the amount (in bytes) of independent random data an application
needs, the risk of an object of size s not being collected is
P(g) = 1 - (1 - s/B) ^ (g/(b/8))
As one can see, the risk increases considerably as objects get bigger.
Increasing the amount of random data the application contains reduces
the size of objects the GC will be able to handle satisfactorily.
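To get a feel for the numbers, here is a small D sketch (mine, not part
of the original test program) that evaluates P(g) on a 32-bit machine;
the 20 MB of random data matches the verification run further down:

import std.math;   // pow
import std.stdio;  // writefln

// P(g) = 1 - (1 - s/B)^(g/4) on a 32-bit machine: the probability that
// at least one of the g/4 pointer-sized words of random data happens
// to point into an object of size s and thereby pins it.
real pinProbability(real s, real g)
{
    real B = 4294967296.0;        // B = 2^32 bytes of address space
    return 1.0 - pow(1.0 - s / B, g / 4);
}

void main()
{
    real g = 20.0 * 1024 * 1024;  // 20 MB of spurious-pointer candidates
    writefln("s = 2000: P = %s", pinProbability(2000, g));
    writefln("s = 3000: P = %s", pinProbability(3000, g));
    writefln("s = 4000: P = %s", pinProbability(4000, g));
}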
My example was kind of nasty in that it not only accumulated additional
random garbage in each iteration, but also caused heap fragmentation. A
kinder example would always create objects of the same size. This simpler
example would be (a rough D sketch follows the list):
* start with a static set of g bytes of random data
* in each iteration, create n objects of size s (for simplicity, let
  each object contain s bytes of random data)
* after each iteration, none of the n objects are used anymore. This is
  a good time to call std.gc.fullCollect();
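A minimal D sketch of that loop (my reconstruction, not the original
benchmark; the constants are just example values and the little LCG
merely stands in for "random data"):

import std.gc;  // fullCollect() -- the D1-era Phobos GC interface

uint seed = 123456789;

// Tiny linear congruential generator: just a cheap source of "random"
// bytes that the conservative GC may mistake for pointers.
uint nextRand()
{
    seed = seed * 1664525 + 1013904223;
    return seed;
}

void fillRandom(ubyte[] buf)
{
    for (size_t i = 0; i < buf.length; i++)
        buf[i] = cast(ubyte) nextRand();
}

void main()
{
    const uint g = 20 * 1024 * 1024;  // static random data, in bytes
    const uint n = 10;                // objects allocated per iteration
    const uint s = 2000;              // object size, in bytes

    ubyte[] statics = new ubyte[g];   // stays live for the whole run
    fillRandom(statics);

    for (int iter = 0; iter < 1000; iter++)
    {
        // Allocate n short-lived objects full of random data.
        ubyte[][] objs;
        objs.length = n;
        for (size_t j = 0; j < n; j++)
        {
            objs[j] = new ubyte[s];
            fillRandom(objs[j]);
        }

        // Drop all references and collect; anything that survives was
        // pinned by a spurious pointer somewhere in the scanned memory.
        objs = null;
        fullCollect();
    }
}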
After each such iteration, i, P_i*n_i objects will remain uncollected,
and will be added to the pool of random spurious pointers. I disregard
the small portion of objects that are uncollected because of spurious
pointers appearing in other objects left uncollected in the same
iteration. If this portion were significant, you'd really have a
problem anyway, and it would show up in later iterations.
g_{i+1} ~ g_i + P_i*n_i*s.
I generously assume that the safe memory areas of the collected objects
are reused by the allocator in the next iteration. A few of those will
become unsafe because of the spurious pointers recently added by
uncollected objects, but for the most part those memory areas are now
safe, and the objects allocated in (and collected from) them can be
disregarded. The number of objects that need to be allocated in unsafe
memory areas for the next iteration becomes:
n_{i+1} = P_i*n_i + P(P_i*n_i*s)*(n_0 - n_i)
This will of course not perfectly predict the real behavior, as it
relaxes the problem away from discrete units, but it should still model
the sample application adequately. Using this, I tried to find the
breaking point, i.e. the object size above which objects need to be
explicitly freed instead of being left to the GC.
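For reference, a rough D sketch of how the recurrence can be iterated to
look for that breaking point (my reconstruction of the model above, not
the exact script behind the tables below):

import std.math;   // pow
import std.stdio;  // writefln

// P(g) for a fixed object size s on a 32-bit machine, as defined above.
real P(real s, real g)
{
    real B = 4294967296.0;              // 2^32
    return 1.0 - pow(1.0 - s / B, g / 4);
}

void main()
{
    real s  = 2000;                     // object size in bytes
    real n0 = 10;                       // objects allocated per iteration
    real g0 = 20.0 * 1024 * 1024;       // initial random data in bytes

    real g = g0;                        // current pool of random data
    real n = n0;                        // objects sitting in unsafe memory
    for (int i = 1; i <= 1000; i++)
    {
        real leaked = P(s, g) * n;      // P_i*n_i objects stay uncollected
        g += leaked * s;                // g_{i+1} = g_i + P_i*n_i*s
        // n_{i+1} = P_i*n_i + P(P_i*n_i*s)*(n_0 - n_i)
        n = leaked + P(s, leaked * s) * (n0 - n);
        if (i % 200 == 0)
            writefln("iteration %s: overhead ~%s MB", i,
                     (g - g0) / (1024.0 * 1024.0));
    }
}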
"Verification":
Rewriting my original example to do a std.gc.fullCollect() every 10th
iteration, and using the following parameters:
g = 20 MB
n = 10
the model suggests that allocating objects of size
s = 2000 would result in almost no space overhead, stable at 20 MB,
s = 3000 would result in a slight but stable space overhead, and
s = 4000 would cause runaway memory usage.
Running the program results in:
s = 2000: memory usage stable at 20 MB
s = 3000: a small, apparently unbounded memory leak
s = 4000: an unbounded memory leak
The model appears to be at least in the correct ballpark for this sample.
Theoretical results:
The tables below show the maximum object size the GC will be able to
handle with different static amounts of "random" data. By "being able to
handle", I mean that the application doesn't use more than 2x the
required amount of memory. The exact breaking point is very unstable and
going above it rapidly results in uncontrolled memory consumption.
32-bit arch, 100 objects (random data -> max object size):
  1 MB    -> 8000 bytes
  5 MB    -> 4500 bytes
  10 MB   -> 3000 bytes
  20 MB   -> 2000 bytes
  100 MB  -> 700 bytes
  500 MB  -> 200 bytes
  1000 MB -> 100 bytes

32-bit arch, 1000 objects (random data -> max object size):
  1 MB    -> 3400 bytes
  5 MB    -> 2200 bytes
  10 MB   -> 1700 bytes
  20 MB   -> 1200 bytes
  100 MB  -> 500 bytes
  500 MB  -> 150 bytes
  1000 MB -> 100 bytes

32-bit arch, 10000 objects (random data -> max object size):
  1 MB    -> 1300 bytes
  5 MB    -> 1000 bytes
  10 MB   -> 800 bytes
  20 MB   -> 600 bytes
  100 MB  -> 300 bytes
  500 MB  -> 100 bytes
  1000 MB -> 75 bytes
These figures need to be taken with a couple of grains of salt, but they
should give an indication of the object size at which one needs to
manage object lifetimes manually.
As a comparison -- the 64-bit haven:

64-bit arch, 100 objects (random data -> max object size):
  2 GB   -> 1.5 GB
  100 GB -> 600 MB
  1 TB   -> 200 MB

64-bit arch, 1000 objects (random data -> max object size):
  2 GB   -> 350 MB
  100 GB -> 250 MB
  1 TB   -> 150 MB

64-bit arch, 10000 objects (random data -> max object size):
  2 GB   -> 100 MB
  100 GB -> 75 MB
  1 TB   -> 50 MB
</theoretical rambling>
Summary:
As far as I can see, what I have to do to avoid memory leaks with a
conservative GC is one of the following:
1. move to a 64-bit architecture
2. manually handle all objects larger than a few hundred bytes (see above)
3. hide all non-pointer data from the GC (a sketch of one way to do
   this follows the list)
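For option 3, one way (certainly not the only one) to hide bulk
non-pointer data from the collector is to put it on the C heap, which
the D GC neither scans nor manages. A rough sketch, assuming plain
malloc/free from std.c.stdlib; rawAlloc/rawFree are just illustrative
names:

import std.c.stdlib;  // malloc, free: C heap, invisible to the GC

// Allocate size bytes outside the GC heap. The collector never scans
// this memory, so random bytes stored here cannot act as spurious
// pointers, and the buffer itself cannot be kept alive by one either.
ubyte[] rawAlloc(size_t size)
{
    ubyte* p = cast(ubyte*) malloc(size);
    if (p is null)
        throw new Exception("out of memory");
    return p[0 .. size];      // a slice over non-GC memory
}

void rawFree(ubyte[] buf)
{
    free(buf.ptr);            // lifetime is now managed manually, as in C
}

void main()
{
    // e.g. image or audio data: pure bytes that never contain pointers
    ubyte[] pixels = rawAlloc(20 * 1024 * 1024);

    // ... fill and use pixels like any other ubyte[] ...

    rawFree(pixels);          // must be freed explicitly
}

The data still has to be freed by hand, but it no longer contributes
spurious pointers to every collection.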
It is a shame that #2 is such a pain and that D doesn't offer any help
such as automatic ref-counting. Not having automatic ref-counting also
prevents neat solutions such as transparent CoW and automatic handling
of scarce resources.
If I didn't have a strong belief that automatic ref-counting would be
addressed soon, I'd definitely consider going back to C++. Luckily,
before I'd give up waiting, 64-bit architectures will probably be in
the great majority. ;)
/Oskar