Memory issues. GC not giving back memory to OS?

Steven Schveighoffer schveiguy at gmail.com
Tue Apr 21 20:29:37 UTC 2020


On 4/21/20 2:31 PM, Cristian Becerescu wrote:

> I then performed a simple test where I incrementally appended 2^30 
> integers (4GB) to a dynamic array (memory measurements are the same for 
> Appender).
> -> Memory used (peak; increasing towards the end of execution): ~7GB
> -> capacity == 1.107 * size (at the end of the program)
> 
> This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the peak 
> memory consumption was 7GB. Apparently, the GC can correctly collect the 
> memory when manually calling collect() at the end of appending, but that 
> memory (we are talking 7 - 4.4 = 2.6GB) is never given back to the 
> system. At least this is our intuition after making those observations.

The GC doesn't automatically give back memory to the OS, and in general it 
can't. There is a GC.minimize function, but it can only release the memory 
that is actually returnable, which depends heavily on the GC implementation 
and on the mechanism the OS provides for managing memory.

So for example, if all the "free" memory is in the middle of the 
OS-provided memory segment, then it can't give it back.
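As a sketch, the most you can ask for from the runtime is something like 
this (using core.memory.GC directly; whether anything is actually returned 
is up to the implementation):

```d
import core.memory : GC;

void main()
{
    auto arr = new int[](1 << 20);   // ~4 MB GC allocation
    arr = null;                      // drop the only reference
    GC.collect();                    // the block is now free inside the GC's pools
    GC.minimize();                   // ask the GC to return whole free pools
                                     // to the OS, if any qualify
}
```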

> 
> I have created a gist with the test code and results (thanks Edi for 
> augmenting the test code to profile the GC): 
> https://gist.github.com/cbecerescu/e6606a8530c56ae06c52e5b1cd32b31f
> 
> Just some notes:
> - if reserving 2^30 elements for the array (or Appender) beforehand, 
> memory peaks are at 4GB

Right: since it never reallocates, it just grows within the original memory 
block. This is what I'd recommend for a case like this.
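For example, reserving up front (for either a dynamic array or an Appender) 
keeps everything in one block; a sketch mirroring your test:

```d
import std.array : appender;

void main()
{
    int[] arr;
    arr.reserve(1 << 30);        // one large allocation up front;
    foreach (i; 0 .. 1 << 30)    // appends below grow in place,
        arr ~= i;                // so the peak stays at ~4 GB

    // Same idea with Appender:
    auto app = appender!(int[])();
    app.reserve(1 << 30);
}
```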

If you don't reserve, then as it grows, it needs a bigger and bigger 
segment.

And it's not always going to reuse the memory you already touched on the 
way up. Why? Because it can't find a contiguous free segment that fits the 
new requirement. It does try extending in place when it can, but once it 
can't, the freed memory is unusable because each freed segment is too small 
to fit your massive array.
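You can watch this happen through the built-in capacity property: it jumps 
whenever the runtime extends the block in place or moves the array to a 
larger one.

```d
import std.stdio : writefln;

void main()
{
    int[] arr;
    size_t lastCap = 0;
    foreach (i; 0 .. 10_000)
    {
        arr ~= i;
        if (arr.capacity != lastCap)
        {
            // Each jump is either an in-place extension or a move to a
            // new, larger segment; in the latter case the old segment
            // becomes garbage the GC has to deal with.
            writefln("length=%s capacity=%s", arr.length, arr.capacity);
            lastCap = arr.capacity;
        }
    }
}
```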

That said, the stats you are printing are a bit puzzling. Why does the 
collection suddenly succeed at the end when it didn't before? Your output 
doesn't seem to match your example code. But there are a number of reasons 
why the GC may not do what you expect, including possible bugs in the GC.

> - C++'s std::vector, without reservation, never gets beyond 4GB and has 
> size == capacity at the end

C++ frees the original memory immediately when growing, so it's going to be 
more memory efficient. With a GC you are never going to match the space 
efficiency of manual memory management.

-Steve

