Large Arrays and GC

Sean Kelly sean at invisibleduck.org
Thu May 8 08:38:07 PDT 2008


Well, whatever's going on, it's not tracked by gc_stats().  I modified 
the previous functions slightly to ensure that both allocated exactly 
the same amount of memory, then verified this via a printf inside the GC 
code--the malloc calls were for 60_000_000 bytes each:

void test() {
     void* stuff = (new byte[15_000_000 * 4 - 1]).ptr;
}

void test() {
     void* stuff = GC.malloc(15_000_000 * 4, GC.BlkAttr.NO_SCAN);
}
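
For reference, a driver loop along these lines reproduces the numbers 
below (a minimal sketch only: printGCStats() is a hypothetical 
placeholder for whatever stats hook the runtime exposes, not an actual 
Tango API, and the loop count is arbitrary):

import tango.core.Memory;

// Hypothetical helper: query whatever stats hook the runtime exposes
// and print poolsize, usedsize, freeblocks, freelistsize, pageblocks.
void printGCStats() {
     // fill in with the runtime's actual stats call
}

void test() {
     // either variant from above goes here
     void* stuff = GC.malloc(15_000_000 * 4, GC.BlkAttr.NO_SCAN);
}

void main() {
     for (int i = 0; i < 50; ++i) {
          test();
          GC.collect();
          printGCStats();
     }
}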

Output from gc_stats() on each iteration after the collect was 
identical for both function calls once things had stabilized:

*** stats ***
poolsize: 360251392
usedsize: 640
freeblocks: 56
freelistsize: 7552
pageblocks: 6

What I don't understand is why Task Manager reports the GC.malloc 
version as using only 18 megs while the other uses 370, since everything 
I've looked at internally so far suggests they should behave the same. 
Perhaps gc_stats() isn't reporting something properly--I haven't 
given that routine a very close look yet.

Sean Kelly wrote:
> I'm not sure what to say.  This sample works fine with D 1.0 using 
> Tango, though the memory usage is strangely high at around 370 megs. 
> Here's the weird thing: I tried running these two versions of the test 
> function for comparison:
> 
> void test() {
>     void* stuff = (new byte[15_000_000 * 4]).ptr;
> }
> 
> void test() {
>     void* stuff=GC.malloc(15_000_000 * 4, GC.BlkAttr.NO_SCAN);
> }
> 
> The first one uses a stable 370 megs of memory and the second a stable 
> 18 megs.  Obviously there's something weird going on with how arrays are 
> handled.
> 
> dsimcha wrote:
>> After further testing, I've found an exact threshold for this bug.
>> When an array of uints reaches 48_693_248 bytes (12_173_312 elements),
>> this problem occurs after 26 iterations at the threshold, or fewer for
>> larger arrays.  Anything below that, even one element smaller, and
>> memory usage is stable over at least hundreds of iterations.  It
>> appears that the number of bytes is the key: a ulong[] will allow the
>> same number of bytes (half the elements) before causing problems, and
>> a ushort[] will allow twice as many elements (the same number of bytes)
>> without crashing.  Furthermore, using equivalently sized floating-point
>> types (float instead of uint, double instead of ulong) or using signed
>> ints has no effect.
>>
>> == Quote from dsimcha (dsimcha at yahoo.com)'s article
>>> Because of some difficulties encountered using D for large arrays
>>> (see previous posts about array capacity fields), I produced the
>>> following test case that seems to be a reproducible bug in D 2.0.13.
>>> The following program keeps allocating a huge array in a function and
>>> then letting all references to this array go out of scope.  This
>>> should result in the array being freed as soon as more memory is
>>> needed.
>>>
>>> import std.stdio, std.gc;
>>> void main(){
>>>     uint count=0;
>>>     while(true) {
>>>         test();
>>>         fullCollect();
>>>         writefln(++count);
>>>     }
>>> }
>>> void test() {
>>>     uint[] stuff=new uint[15_000_000];
>>> }
>>>
>>> This produced an out-of-memory error after 21 iterations on my
>>> machine with 2 GB of RAM.  With an array size of 10_000_000 instead
>>> of 15_000_000, memory usage stabilized at 350 megs, which seems
>>> rather large since a uint[10_000_000] should only use 40 megs, plus
>>> maybe another 40 for overhead; it also takes several iterations for
>>> the memory usage to reach that level.  Using a larger array size,
>>> such as 100_000_000, made the test run out of memory after even
>>> fewer iterations, and changing from uint[] to real[] with a size of
>>> 15_000_000 made it run out after 8 iterations instead of 21.
>>
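
As an aside, dsimcha's byte-threshold observation above should be 
straightforward to probe with a type-parameterized version of the test. 
The sketch below is only an assumption on my part: it reuses the Phobos 
modules from the quoted program, the 48_693_248-byte figure comes from 
the post above, and the helper names (test, run) are made up for 
illustration.

import std.stdio, std.gc;

// Allocate roughly `bytes` worth of T's and drop the reference,
// mirroring the original test() but holding the byte count fixed
// across element types.
void test(T)(size_t bytes) {
     T[] stuff = new T[bytes / T.sizeof];
}

void run(T)(size_t bytes) {
     for (uint i = 1; i <= 50; ++i) {
          test!(T)(bytes);
          fullCollect();
          writefln("%s: iteration %s", T.stringof, i);
     }
}

void main() {
     const size_t threshold = 48_693_248;  // figure quoted above
     run!(ushort)(threshold);  // twice the elements, same bytes
     run!(uint)(threshold);    // 12_173_312 elements
     run!(ulong)(threshold);   // half the elements, same bytes
}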


