The problem with the D GC

Johan Granberg lijat.meREM at OVE.gmail.com
Tue Jan 9 14:28:20 PST 2007


Kyle Furlong wrote:

> Oskar Linde wrote:
>> Oskar Linde wrote:
>>> After having fought a while with D programs with runaway memory leaks,
>>> I've unfortunately had to come to the conclusion that the D GC is not
>>> ready for production use. The problem is what I'd call "spurious
>>> pointers". That is, random data (strings, numbers, image data, audio or
>>> whatever) that appears to the GC to be full of pointers all over the
>>> memory space.
>> 
>> I feel kind of bad for making it sound like this is a problem related
>> specifically to the D garbage collector. It is rather a general and
>> apparently well known problem of all conservative garbage collectors.
>> The D garbage collector is still excellent for a large domain of
>> problems.
>> 
>> Lots of people seem to be having similar problems, though, so a better
>> understanding of the conditions under which a conservative garbage
>> collector will and will not work seems to be called for.
>> 
>> I'd guess that a lot of the problems we have with the conservative GC are
>> because we are at the end of the lifetime of the 32-bit address space.
>> As time has passed, our programs have used a larger and larger portion
>> of the available memory space. As one anonymous poster said, one solution
>> would be "increasing the amount of memory needed for pointers by the
>> factor two", which I interpret as moving to a 64-bit memory space.
>> 
>> I did some calculations to better understand what is going on. I've
>> probably made several errors, though.
>> 
>> <theoretical rambling>
>> The probability of one single uniformly random spurious pointer
>> preventing the collection of an object of size s is
>> 
>> p = s/B,
>> 
>> with B being the size of the address space (2^b for b = 32 or 64 bits).
>> On a 64 bit architecture, the probability is obviously drastically
>> reduced.
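>> (To put a number on it: a single random 32-bit word keeps a 2000-byte
>> object alive with probability p = 2000/2^32, roughly 5*10^-7, while the
>> same object on a 64-bit machine gives p = 2000/2^64, roughly 10^-16.)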
>> 
>> If g is the amount (in bytes) of independent random data an application
>> needs, the risk of an object of size s not being collected is
>> 
>> P(g) = 1 - (1 - s/B) ^ (g/(b/8))
>> 
>> As one can see, the risk increases considerably as the objects get
>> bigger. Increasing the amount of random data the application contains
>> reduces the size of objects the GC will be able to handle satisfactorily.
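>> 
>> For anyone who wants to plug in their own numbers, a quick D snippet
>> like the following (untested; it just transcribes the formula above)
>> evaluates P(g) for a given object size, amount of random data and word
>> size. The parameters in main() are just the sample values used further
>> down in this post:
>> 
>> import std.math;    // pow
>> import std.stdio;   // writefln
>> 
>> // P(g) = 1 - (1 - s/B)^(g/(b/8)): risk that an object of size s
>> // survives collection because of g bytes of uniformly random data
>> real survivalRisk(real s, real g, int bits)
>> {
>>     real B     = pow(2.0L, cast(real) bits);  // size of the address space
>>     real words = g / (bits / 8);              // pointer-sized random words
>>     return 1.0L - pow(1.0L - s / B, words);
>> }
>> 
>> void main()
>> {
>>     writefln("32 bit: %s", survivalRisk(2000, 20.0L * 1024 * 1024, 32));
>>     writefln("64 bit: %s", survivalRisk(2000, 20.0L * 1024 * 1024, 64));
>> }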
>> 
>> My example was kind of nasty in that it not only accumulated additional
>> random garbage in each iteration, but also caused heap fragmentation. A
>> kinder example would always create objects of the same size. The simpler
>> example would be:
>> 
>> * start with a static set of g bytes of random data
>> * for each iteration create n objects of size s (for simplicity, let
>> each object contain s bytes of random data)
>> * after each iteration, none of the n objects are used anymore. This is
>> a good time to call gc.fullCollect();
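>> 
>> In D, such a loop could look roughly like this (only a sketch with
>> illustrative sizes, not the actual test code):
>> 
>> import std.gc;        // fullCollect
>> import std.random;    // rand
>> 
>> void main()
>> {
>>     const uint g = 20 * 1024 * 1024;   // bytes of static "random" data
>>     const uint n = 10;                 // objects per iteration
>>     const uint s = 2000;               // object size in bytes
>> 
>>     // the static pool of random data (spurious-pointer fodder)
>>     uint[] noise = new uint[g / uint.sizeof];
>>     foreach (inout uint x; noise)
>>         x = rand();
>> 
>>     for (int i = 0; i < 1000; i++)
>>     {
>>         ubyte[][] objs;
>>         objs.length = n;
>>         foreach (inout ubyte[] o; objs)
>>         {
>>             o = new ubyte[s];          // one s-byte object
>>             foreach (inout ubyte b; o)
>>                 b = cast(ubyte) rand();
>>         }
>>         objs = null;                   // last references dropped
>>         std.gc.fullCollect();          // ideally frees all n objects;
>>                                        // spurious pointers may pin some
>>     }
>> }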
>> 
>> After each such iteration, i, P_i*n_i objects will remain uncollected,
>> and will be added to the pool of random spurious pointers. I disregard
>> the small portion of objects that are uncollected because of spurious
>> pointers appearing in other objects left uncollected in the same
>> iteration. If this portion were significant, you'd really have a
>> problem anyway, and it would show up in later iterations.
>> 
>> g_{i+1} ~ g_i + P_i*n_i*s.
>> 
>> I generously assume that the safe memory areas of the collected objects
>> are reused by the allocator in the next iteration. A few of those will be
>> made unsafe by the recently added spurious pointers from uncollected
>> objects, but for the most part, those memory areas are now safe, and the
>> objects allocated/collected from them can be disregarded. The number of
>> objects that need to be allocated in unsafe memory areas for the next
>> iteration becomes:
>> 
>> n_{i+1} = P_i*n_i + P(P_i*n_i*s)*(n_0 - n_i)
>> 
>> This will of course not perfectly predict the real behavior, as it
>> relaxes the problem away from discrete units, but should still
>> adequately model the sample application. Using this, I tried to find the
>> breaking point, i.e. the size at which objects need to be explicitly
>> freed instead of being left to the GC.
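>> 
>> To play with the model, a small throwaway program along these lines
>> (untested; it just iterates the two recurrences above, no real
>> allocation is involved) can be used:
>> 
>> import std.math;    // pow
>> import std.stdio;   // writefln
>> 
>> void main()
>> {
>>     const int  bits = 32;
>>     const real s    = 2000;             // object size in bytes
>>     const real n0   = 10;               // objects allocated per iteration
>>     real B = pow(2.0L, cast(real) bits);
>>     real g = 20.0L * 1024 * 1024;       // static random data, g_0
>>     real n = n0;                        // objects in unsafe areas, n_0
>> 
>>     // P(g) = 1 - (1 - s/B)^(g/(b/8))
>>     real P(real garbage)
>>     {
>>         return 1.0L - pow(1.0L - s / B, garbage / (bits / 8));
>>     }
>> 
>>     for (int i = 0; i < 50; i++)
>>     {
>>         real leaked = P(g) * n;                      // P_i * n_i
>>         real nNext  = leaked + P(leaked * s) * (n0 - n);
>>         g += leaked * s;                             // g_{i+1}
>>         n  = nNext;                                  // n_{i+1}
>>         writefln("iteration %s: ~%s MB uncollected", i + 1,
>>                  (g - 20.0L * 1024 * 1024) / (1024 * 1024));
>>     }
>> }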
>> 
>> "Verification":
>> Rewriting my original example to do a std.gc.fullCollect() every 10th
>> iteration, with the following parameters:
>> g = 20 MB
>> n = 10
>> 
>> the model suggests allocating objects of size:
>> s = 2000 would result in almost no space overhead, stable at 20 MB
>> s = 3000 would result in a slight but stable space overhead
>> s = 4000 would cause runaway memory usage
>> 
>> running the program results in:
>> s = 2000: memory usage is stable at 20 MB
>> s = 3000: results in a small, apparently unbounded memory leak
>> s = 4000: results in an unbounded memory leak
>> 
>> The model appears to be at least in the correct ballpark for this sample.
>> 
>> Theoretical results:
>> The following tables show the maximum object size the GC will be able to
>> handle with different static amounts of "random" data. By "being able to
>> handle", I mean that the application doesn't use more than 2x the
>> required amount of memory. The exact breaking point is very unstable and
>> going above it rapidly results in uncontrolled memory consumption.
>> 
>> 32 bit arch, 100 objects
>> 
>> random data     max object size
>>    1 MB            8000 bytes
>>    5 MB            4500 bytes
>>   10 MB            3000 bytes
>>   20 MB            2000 bytes
>>  100 MB             700 bytes
>>  500 MB             200 bytes
>> 1000 MB             100 bytes
>> 
>> 32 bit arch, 1000 objects
>> 
>> random data     max object size
>>    1 MB            3400 bytes
>>    5 MB            2200 bytes
>>   10 MB            1700 bytes
>>   20 MB            1200 bytes
>>  100 MB             500 bytes
>>  500 MB             150 bytes
>> 1000 MB             100 bytes
>> 
>> 32 bit arch, 10000 objects
>> 
>> random data     max object size
>>    1 MB            1300 bytes
>>    5 MB            1000 bytes
>>   10 MB             800 bytes
>>   20 MB             600 bytes
>>  100 MB             300 bytes
>>  500 MB             100 bytes
>> 1000 MB              75 bytes
>> 
>> Those figures need to be taken with a couple of grains of salt, but
>> should give an indication of the object size at which one needs to
>> manually handle object lifetimes.
>> 
>> As a comparison -- the 64 bit haven:
>> 
>> 64 bit arch, 100 objects
>> 
>> random data     max object size
>>    2 GB            1.5 GB
>>  100 GB            600 MB
>>    1 TB            200 MB
>> 
>> 64 bit arch, 1000 objects
>> 
>> random data     max object size
>>    2 GB            350 MB
>>  100 GB            250 MB
>>    1 TB            150 MB
>> 
>> 64 bit arch, 10000 objects
>> 
>> random data     max object size
>>    2 GB            100 MB
>>  100 GB             75 MB
>>    1 TB             50 MB
>> </theoretical rambling>
>> 
>> Summary:
>> 
>> As far as I can see, what I have to do to avoid memory leaks with a
>> conservative GC is one of the following:
>> 
>> 1. move to a 64 bit architecture
>> 2. manually handle all objects larger than a few hundred bytes (see above)
>> 3. hide all non-pointer data from the GC (a sketch of this follows below)
>> 
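>> (For #3, the bluntest tool I know of is to keep the raw data off the
>> GC heap entirely, e.g. on the C heap, since the collector does not scan
>> memory it did not allocate itself unless it is registered with
>> std.gc.addRange. A sketch, with hypothetical helper names:)
>> 
>> import std.c.stdlib;   // malloc, free
>> 
>> // allocate a buffer the GC will never scan (and never free)
>> ubyte[] allocHidden(size_t size)
>> {
>>     ubyte* p = cast(ubyte*) std.c.stdlib.malloc(size);
>>     assert(p !is null);
>>     return p[0 .. size];           // slice over non-GC memory
>> }
>> 
>> void freeHidden(ubyte[] data)
>> {
>>     std.c.stdlib.free(data.ptr);
>> }
>> 
>> // caveat: such a buffer must not hold the only pointer to a
>> // GC-allocated object, or the GC may collect that object early
>> 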
>> It is a shame #2 is such a pain and that D doesn't offer any help such
>> as automatic ref-counting. Not having automatic ref-counting also
>> prevents neat solutions such as transparent CoW, and automatic handling
>> of scarce resources.
>> 
>> If I didn't have a strong belief that automatic ref-counting will be
>> addressed soon, I'd definitely consider going back to C++. Luckily,
>> before I give up waiting, 64-bit architectures will probably be in the
>> great majority. ;)
>> 
>> /Oskar
> 
> Don't hold your breath for a reference-counting implementation from DMD.
> Walter doesn't want to add any overhead to assignments.

Wouldn't it be possible to have classes declared "refcounted" in the same
way as scope? Then the extra cost would only apply to members of those
classes, possibly by disallowing upcasts to non-refcounted superclasses.

example:

refcounted class foo {
}

...

void bar() {
    foo f = new foo();  // refcount is one
    auto k = f;         // refcount is two
    k = null;           // back to one
    // when f goes out of scope here, the refcount reaches zero
    // and the object is deleted
}
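
Until something like that exists, the bookkeeping a "refcounted" class
would need looks roughly like this when written by hand (hypothetical
sketch; every assignment has to be paired with addRef/release manually,
which is exactly the per-assignment overhead mentioned above):

// hand-rolled reference counting: what a "refcounted" class would
// have to do automatically at every assignment
class RcFoo
{
    private int refs = 1;          // created with one reference

    void addRef() { ++refs; }

    void release()
    {
        if (--refs == 0)
            delete this;           // deterministic destruction,
                                   // no GC sweep needed
    }
}

void bar()
{
    RcFoo f = new RcFoo();         // refcount is one
    RcFoo k = f; k.addRef();       // refcount is two
    k.release(); k = null;         // back to one
    f.release();                   // reaches zero: object deleted here
}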



