Pop quiz [memory usage]

Fawzi Mohamed fmohamed at mac.com
Sat Jun 6 14:47:33 PDT 2009


On 2009-06-06 21:07:40 +0200, Sean Kelly <sean at invisibleduck.org> said:

> Vladimir Panteleev wrote:
>> On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean at invisibleduck.org>
>> wrote:
[...]
>> 
>>> I've been debating for a while whether newCapacity should exist at 
>>> all.   The GC already tends to leave free space at the end of blocks, 
>>> so why should the runtime second-guess it?
>> 
>> Do you mean at the end of pages, or pools?
> 
> Blocks.  Blocks less than 4k in the current allocator are sized in 
> powers of two, so array appends already get the "double in size" 
> behavior seen in Java even without newCapacity.  newCapacity just has the 
> potential to throw an array into the next larger block early, thus 
> potentially wasting more space if the array will never be appended to. 
> I think newCapacity isn't supposed to reserve extra space for blocks 
> larger than 4k, but it's been a while since I've looked at it.

At the moment the behavior is exactly the opposite (small arrays are
left to the GC handling you just sketched); only arrays larger than a
page get the special treatment. I think the idea is that appending to
large arrays can be very expensive, so those get the extra reserve.
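
Just to make the small-array case concrete, here is a toy sketch (my
own made-up bin sizes, not the actual allocator code) of how
power-of-two blocks below a page already give doubling-like growth
without any help from newCapacity:

    import std.stdio;

    // Toy model: round a request up to the next power-of-two bin below a
    // page; the slack at the end of the block is what absorbs appends.
    size_t binSize(size_t request)
    {
        size_t bin = 16;              // assumed smallest bin, made up
        while (bin < request && bin < 4096)
            bin *= 2;
        return bin;                   // past a page the GC extends by pages
    }

    void main()
    {
        foreach (n; [10, 33, 100, 500, 3000])
            writefln("request %4d bytes -> block of %4d bytes", n, binSize(n));
    }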
But as I said previously, I don't fully understand the rationale,
especially the idea of reserving more space when the elements are
larger. I could understand having the reserved space depend only on
the number of elements, like this:

            long mult = 100 + 200L / log2plus2(newlength)
instead of
            long mult = 100 + 1000L / log2plus2(newcap)

or something in between.

These give a factor of at most 1.5 or 1.71, so a waste of at most
100% or 71%, and the factor decreases as 1/log toward 1.02.
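
To see the numbers concretely, here is a small throwaway program (my
own sketch: I am guessing log2plus2 means floor(log2(x)) + 2, the
sizes are arbitrary, and the factor is clamped at 102 to match the
1.02 floor mentioned above) that prints the percentage factor each
formula would give:

    import std.stdio;

    // my guess at what log2plus2 stands for: floor(log2(c)) + 2
    int log2plus2(ulong c)
    {
        int i = 2;
        while (c >>= 1)
            ++i;
        return i;
    }

    void main()
    {
        // (element count, element size) pairs whose total size is above a
        // page, so the special treatment would apply; values picked freely
        foreach (pair; [[8UL, 1024UL], [4096UL, 16UL], [1UL << 20, 4UL]])
        {
            ulong newlength = pair[0], size = pair[1];
            ulong newcap = newlength * size;

            long proposed = 100 + 200L  / log2plus2(newlength); // per element
            long current  = 100 + 1000L / log2plus2(newcap);    // as quoted above

            // clamp at 102 (1.02), the floor mentioned above
            if (proposed < 102) proposed = 102;
            if (current  < 102) current  = 102;

            writefln("len=%8d size=%5d  proposed=%d%%  current=%d%%",
                     newlength, size, proposed, current);
        }
    }

With the newlength-based variant the reserve depends only on the
number of elements, so two arrays with the same element count get the
same relative slack whatever the element size.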

To really test this, one should benchmark some "typical" applications.

The current choice is neither of these; I don't know exactly why
large element sizes get such a high weight. I think it is an error,
but maybe there was an idea behind it (at least for smallish sizes)
that I don't see.

Fawzi

> 
>> If you mean that DMD will allocate memory pools larger than the immediate
>> memory requirement, that's true, however D's GC is "greedy" in that it
>> always allocates memory at the lowest address possible. This means that
>> when all previous pools fill up, the next object will "cap" the new array,
>> and the array will no longer be able to extend.
> 
> This is a quirk of the current GC.  A new GC may not behave the same 
> way (in fact I can guarantee that the one I'm working on is implemented 
> differently).
> 
>> So, I don't think that your idea is feasible with the current GC
>> implementation.
> 
> I'm not sure I understand.  The only "idea" I proposed was to get rid 
> of newCapacity.
> 
>> I wonder how much would things get improved if objects
>> would be divided between "growing" and "non-growing", with the GC
>> prioritizing allocating new objects in free space not directly following
>> "growing" objects.
> 
> The user already knows best which arrays are going to grow though.  Is 
> this really something the runtime should try to figure out?




