Why does this simple test program leak 500MB of RAM?

Steven Schveighoffer schveiguy at gmail.com
Mon Jun 19 15:59:39 UTC 2023


On 6/19/23 6:15 AM, FeepingCreature wrote:
> Consider this program:
> 
> ```
> module test;
> 
> import core.memory : GC;
> import core.sync.semaphore;
> import core.thread;
> import std;
> 
> enum keys = [
>      "foo", "bar", "baz", "whee",
>      "foo1", "foo2", "foo3", "foo4", "foo5", "foo6", "foo7", "foo8",
>      "bar1", "bar2", "bar3", "bar4", "bar5", "bar6", "bar7", "bar8",
>      "baz1", "baz2", "baz3", "baz4", "baz5", "baz6", "baz7", "baz8"];
> 
> void spamGC(Semaphore sema) {
>      void recursionTest(int depth) {
>          if (depth < 1000) {
>              recursionTest(depth + 1);
>              if (depth % 300 == 0)
>                  recursionTest(depth + 1);
>          }
>          string[string] assocArray;
>          static foreach (a; keys)
>          {
>              assocArray[a] = a;
>          }
>          // ?????
>          assocArray.clear;
>          assocArray = null;
>      }
>      recursionTest(0);
>      sema.notify;
>      Thread.sleep(3600.seconds);
> }
> 
> void main() {
>      Thread[] threads;
>      auto sema = new Semaphore(0);
>      enum threadCount = 100;
>      threadCount.iota.each!((_) {
>          auto thread = new Thread({ spamGC(sema); });
>          thread.start;
>          threads ~= thread;
>      });
>      // this clears the leak up!
>      // threads.each!((thread) { thread.join; });
>      threadCount.iota.each!((i) {
>          sema.wait;
>      });
>      writefln!"Done.";
>      100.iota.each!((_) {
>          GC.collect;
>          GC.minimize;
>      });
>      writefln!"Collected.";
> 
>      // Now look at residential memory for the process.
>      Thread.sleep(3600.seconds);
> }
> ```
> 
> We've had problems with excessive memory leaks in JSON decoding in 
> production. So I've put together this test program.
> 
> The `spamGC` recursive call represents recursive JSON decoding.
> 
> The `Thread.sleep` call represents a thread idling in a threadpool.
> 
> After completion, the program sits at ~600MB residential. I believe, 
> from looking at it, it should be obvious that this memory is **entirely 
> dead**.

The `clear` should remove all references to the bucket elements (it 
basically nulls out the bucket entries without deallocating the 
elements or the bucket array). The `assocArray = null` could be 
considered a dead store by the optimizer, so it's possible that it 
isn't actually being performed.
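
To make the difference between the two statements concrete, here is a 
minimal sketch of their observable behavior. The function name 
`clearVersusNull` is made up for illustration, and this only shows the 
language-level semantics, not what the GC later does with the memory:

```d
// Hypothetical demo: `clear` empties the shared table in place,
// while assigning `null` only resets one reference to it.
bool clearVersusNull()
{
    string[string] aa = ["foo": "foo", "bar": "bar"];
    auto other = aa;           // second reference to the same table

    aa.clear;                  // empties the table both references see
    assert(aa.length == 0);
    assert(other.length == 0);

    aa["baz"] = "baz";         // the table itself is still alive and shared
    assert("baz" in other);

    aa = null;                 // drops only this reference
    assert(aa is null);
    return other.length == 1;  // the other reference still sees the entry
}

void main()
{
    assert(clearVersusNull());
}
```

So after `clear`, nothing in the table points at the old keys and 
values any more, whether or not the `= null` store survives 
optimization.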

> 
> If I alloca 10KB of stack at the beginning of the recursion, the 
> residential RAM drops to ~180MB. I don't think this completely solves 
> the issue, but it sure is interesting.
> 

That is interesting...

So I'm looking at the druntime code. If I understand this correctly, the 
GC doesn't try running a collection when allocating small blocks until 
2.5GB of memory is consumed? That can't be right...

https://github.com/dlang/dmd/blob/17c3f994e845ff0b63d7b5f6443fe5a7fdc08609/druntime/src/core/internal/gc/os.d#L215-L252

https://github.com/dlang/dmd/blob/17c3f994e845ff0b63d7b5f6443fe5a7fdc08609/druntime/src/core/internal/gc/impl/conservative/gc.d#L1949-L1967

But maybe that's only if some other condition is not met. Can you use 
the GC profile option (`--DRT-gcopt=profile:1`) to determine how many 
collections run during the affected time? I am curious whether the 
problem is the GC failing to collect things it should, or the GC not 
running collections at all.
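
A rough way to check this from inside the program is to read the 
collection counter around the suspicious section. This is only a 
sketch: `collectionsDuring` is a made-up helper, the allocation loop is 
a stand-in for the real workload, and I am assuming the counter in 
`GC.profileStats` is maintained even when the profile option is off:

```d
import core.memory : GC;
import std.stdio : writefln;

// Hypothetical helper: runs `work` and returns how many GC collections
// were performed while it ran.
size_t collectionsDuring(scope void delegate() work)
{
    immutable before = GC.profileStats().numCollections;
    work();
    return GC.profileStats().numCollections - before;
}

void main()
{
    immutable n = collectionsDuring({
        // Stand-in workload: many small, short-lived allocations.
        foreach (i; 0 .. 1_000_000)
        {
            auto buf = new ubyte[](64);
        }
    });

    // Heap usage as the GC itself sees it, for comparison with RSS.
    immutable s = GC.stats();
    writefln!"collections: %s, used: %s bytes, free: %s bytes"(
        n, s.usedSize, s.freeSize);
}
```

If the count stays at zero while resident memory grows, the GC is not 
collecting at all; if it is nonzero, the question becomes why those 
collections don't free the memory.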

This also seems to be a factor, but I don't understand it.

https://github.com/dlang/dmd/blob/17c3f994e845ff0b63d7b5f6443fe5a7fdc08609/druntime/src/core/internal/gc/impl/conservative/gc.d#L1871-L1892

-Steve