Problems with GC, trees and array concatenation

Oskar Linde oskar.lindeREM at OVEgmail.com
Mon Jun 4 07:48:35 PDT 2007


Frits van Bommel skrev:
> Babele Dunnit wrote:
>> Oskar Linde Wrote:
>>
>>> The GC is not guaranteed to free all blocks of memory, but the usage 
>>> patterns in most applications make the amount of noncollectable 
>>> memory bound by a constant factor.
>>
> [snip]
>>
>> So, I should say, there is a bug in ARRAYS CONCATENATION, not in 
>> GARBAGE COLLECTION... Array concatenation introduces something in 
>> memory layout so that, later, the garbage collection cannot reclaim 
>> that memory zone...
>>
>>> I'd be interested to hear on what platform the original sample gives 
>>> unbounded memory leakage.
>>
>> Windows XP, DMD 1.014 (point 5 of my original post)
>>
>> From another post:
>>
>>> So in conclusion, I am quite sure the following line:
>>>              indi[] = testPop1.individuals ~ testPop2.individuals;
>>> Is the cause of the memory leak, because it allocates a huge chunk of
>>> memory that is left to the GC to free. The chance of those huge chunks
>>> being hit by a spurious pointer seems quite high.
>>
>> again: if you DO NOT concatenate those two arrays, everything is 
>> properly garbage collected. So, why on earth the GC should be able to 
>> collect the "original" arrays and not the "copied" ones?? And it 
>> happens also with very small vectors (see above), and still I do not 
>> see from where those spurious pointer should come from (holy smoke, 
>> Walter was so kind to make D initialize everything...)
>>
>> Seems to me that DEFINITELY there is something in array concatenation 
>> code which fools the GC - the Windows version of it, I mean...
> 
> Looking at the GC code I can't seem to find any place where arr[length 
> .. _gc.cap(arr)] (the unused part of the array allocation) is 
> initialized. This could explain the issue if your arrays have different 
> lengths (since the data from a longer old array may be present after a 
> shorter new array, and is considered as "live" pointers by the GC 
> because it's within the same allocation block).
> 
> However, this seems to be the case for straight allocation as well, not 
> just concatenation.
> If this is the cause, you probably have the same issue if you replace
>     indi[] = testPop1.individuals ~ testPop2.individuals;
> by
>     auto tmp = new Individual[](testPop1.length + testPop2.length);
>     tmp[0 .. testPop1.length] = testPop1;
>     tmp[testPop1.length .. $] = testPop2;
>     indi[] = tmp;
> Is this the case?

I've come to the same conclusion. The following patch seems to correct 
this:

--- gcx.d       2007-06-04 16:47:02.354590379 +0200
+++ gcx.d.new   2007-06-04 16:46:53.331933006 +0200
@@ -297,7 +297,7 @@
                 gcx.bucket[bin] = (cast(List *)p).next;
                 //memset(p + size, 0, binsize[bin] - size);
                 // 'inline' memset - Dave Fladebo.
-               //foreach(inout byte b; cast(byte[])(p + size)[0..binsize[bin] - size]) { b = 0; }
+               foreach(inout byte b; cast(byte[])(p + size)[0..binsize[bin] - size]) { b = 0; }
                 //debug(PRINTF) printf("\tmalloc => %x\n", p);
                 debug (MEMSTOMP) memset(p, 0xF0, size);
             }


(The only change is removing the comment marker before the foreach line, 
so that the unused tail of each bin-sized block is zeroed again.)


The memory leak disappeared on Linux (DMD 1.014) with the above change, 
as far as I can tell.
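
To illustrate why the unused tail matters, here is a minimal sketch in D. 
It is not the gcx.d code: the names binWords, reuseBlock and staleWords 
are made up for this example, and the block is modelled as an array of 
words rather than raw bytes. It assumes a conservative scanner that 
treats every non-zero word in the whole block as a possible pointer.

```d
// Hypothetical bin size, in pointer-sized words (an assumption for this sketch).
enum size_t binWords = 8;

/// Hands a recycled bin-sized block to a caller that only needs `usedWords`.
/// With `clearSlack` set, the unused tail is zeroed, which is the effect of
/// the re-enabled foreach in the patch.
size_t[] reuseBlock(size_t[] block, size_t usedWords, bool clearSlack)
{
    assert(block.length == binWords && usedWords <= binWords);
    if (clearSlack)
        block[usedWords .. $] = 0;
    return block[0 .. usedWords];
}

/// Counts non-zero words in the unused tail. A conservative scan of the
/// whole block would treat each of them as a potential live pointer.
size_t staleWords(const size_t[] block, size_t usedWords)
{
    size_t n = 0;
    foreach (w; block[usedWords .. $])
        if (w != 0)
            ++n;
    return n;
}

unittest
{
    auto block = new size_t[binWords];
    block[] = 0xDEAD;                      // leftovers from a longer, older array

    auto a = reuseBlock(block, 3, false);  // tail left untouched
    a[] = 0;                               // the caller only initializes its own part
    assert(staleWords(block, 3) == 5);     // five stale words remain in the tail

    auto b = reuseBlock(block, 3, true);   // tail zeroed on reuse
    assert(b.length == 3);
    assert(staleWords(block, 3) == 0);
}
```

The unittest block can be run with "dmd -unittest -main" on a current 
compiler. In the first case the shorter array still carries five words 
of the old contents behind it, which is the situation Frits describes.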

/Oskar


