Pop quiz [memory usage]

Sean Kelly sean at invisibleduck.org
Sat Jun 6 12:07:40 PDT 2009


Vladimir Panteleev wrote:
> On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean at invisibleduck.org>
> wrote:
> 
>> Jarrett Billingsley wrote:
>>> On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
>>> Panteleev<thecybershadow at gmail.com> wrote:
>>>> // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
>>>> version(Tango) import tango.io.Console;
>>>> else           import std.stdio;
>>>>
>>>> struct S
>>>> {
>>>>        ubyte[40_000] data;
>>>> }
>>>>
>>>> void main()
>>>> {
>>>>        S[] a;
>>>>        a ~= S();
>>>>
>>>>        // QUESTION: How much memory will this program consume
>>>>        // upon reaching this point?
>>>>        version(Tango) Cin.copyln();
>>>>        else           readln();
>>>> }
>>>>
>>>  There seems to be something wrong with the newCapacity function that
>>> _d_arrayappendcT calls.  From an element size of 20000 (I halved it
>>> just to make the allocation faster) and an array length of 1, it
>>> somehow calculates the new size to be 266686600.  Hm.  That seems a
>>> bit off.
>>>  It seems this line:
>>>  long mult = 100 + (1000L * size) / log2plus1(newcap);
>>>  is to blame.  I don't think large value types were taken into account
>>> here.  The resulting multiplier is 1,333,433, which is hilariously
>>> large.
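
For the curious, that arithmetic is easy to reproduce standalone.  The 
sketch below assumes log2plus1(c) returns the bit width of c, i.e. 
floor(log2(c)) + 1, which matches the figures quoted above:

import std.stdio;

// Assumed helper: counts the bits in c, so log2plus1(20_000) == 15.
size_t log2plus1(size_t c)
{
    size_t i;
    for (i = 0; c != 0; i++)
        c >>= 1;
    return i;
}

void main()
{
    size_t size = 20_000;              // element size in the test above
    size_t newlength = 1;              // array length after the append
    size_t newcap = newlength * size;  // requested capacity in bytes
    long mult = 100 + (1000L * size) / log2plus1(newcap);
    writefln("mult   = %s", mult);                   // 1333433
    writefln("newext = %s", (newcap * mult) / 100);  // 266686600
}
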
>>
>> I've been debating for a while whether newCapacity should exist at 
>> all.  The GC already tends to leave free space at the end of blocks, 
>> so why should the runtime second-guess it?
> 
> Do you mean at the end of pages, or pools?

Blocks.  Blocks smaller than 4k in the current allocator are sized in 
powers of two, so array appends already get the "double in size" 
behavior of Java even without newCapacity.  newCapacity just has the 
potential to push an array into the next larger block early, thus 
potentially wasting space if the array is never appended to again. 
I don't think newCapacity is supposed to reserve extra space for blocks 
larger than 4k, but it's been a while since I've looked at it.
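
Roughly, the small-block sizing works like this (a sketch from memory; 
the exact bin sizes and the page-rounding rule are assumptions):

// Small requests go to power-of-two bins; larger ones to whole pages.
size_t blockSize(size_t request)
{
    const size_t PAGE = 4096;
    if (request > PAGE)
        return (request + PAGE - 1) & ~(PAGE - 1); // round up to pages
    size_t bin = 16;                               // smallest bin
    while (bin < request)
        bin <<= 1;                                 // next power of two
    return bin;
}

So an array sitting in a 256-byte bin can grow to 256 bytes in place, 
and the first append past that lands it in the 512-byte bin -- doubling 
for free, with no help from newCapacity.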

> If you mean that DMD will allocate memory pools larger than the immediate
> memory requirement, that's true, however D's GC is "greedy" in that it
> always allocates memory at the lowest address possible. This means that
> when all previous pools fill up, the next object will "cap" the new array,
> and the array will no longer be able to extend.

This is a quirk of the current GC.  A new GC may not behave the same way 
(in fact I can guarantee that the one I'm working on is implemented 
differently).
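
For what it's worth, the capping effect is easy to see with the current 
GC.  Whether the second allocation really lands adjacent depends on 
heap state, so this is illustrative rather than deterministic:

import std.stdio;

void main()
{
    int[] a = new int[1024];       // roughly one 4k page of data
    auto before = a.ptr;
    auto blocker = new int[1024];  // may land directly after a's block
    a ~= 0;                        // now a needs more than one page
    writefln(before == a.ptr ? "extended in place" : "relocated");
}
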

> So, I don't think that your idea is feasible with the current GC
> implementation.

I'm not sure I understand.  The only "idea" I proposed was to get rid of 
newCapacity.

> I wonder how much would things get improved if objects
> would be divided between "growing" and "non-growing", with the GC
> prioritizing allocating new objects in free space not directly following
> "growing" objects.

The user already knows best which arrays are going to grow, though.  Is 
this really something the runtime should try to figure out?
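
If they do know, presizing is already simple enough -- e.g. (a trivial 
sketch):

void main()
{
    int[] a;
    a.length = 1_000;        // one allocation of the known final size
    foreach (i; 0 .. 1_000)
        a[i] = i;            // fill in place; no appends, no overshoot
}

That sidesteps newCapacity entirely.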


