[D-runtime] Fw: [phobos] newCapacity function in array appending

Fawzi Mohamed fawzi at gmx.ch
Wed Jul 14 07:11:05 PDT 2010


On 14-lug-10, at 15:19, Steve Schveighoffer wrote:

> ----- Original Message ----
>> From: Fawzi Mohamed <fawzi at gmx.ch>
>>
>>
>>> ----- Forwarded Message ----
>>>> From: Steve Schveighoffer  <schveiguy at yahoo.com>
>> [...]
>> I am not sure I follow, you mean using the new function to   
>> determine how many
>> pages really commit?
>> well indeed you would spare some  cycles, but at the cost of using  
>> memory, not
>> sure it is really a win (the use of  pages already
>
> I assume the newCapacity function is used to avoid constantly  
> looking for new
> memory to allocate.  The theory being if you are appending, you  
> don't want to
> just allocate exactly what you need, you want to preallocate a  
> little more.
>
> At least when the current pool is exhausted, newCapacity is used to  
> allocate a
> fresh block, I just don't know why it's not used to extend the  
> current block
> (which is why I'm trying to add that feature).

ok I understand, it makes sense.

> One of the issues with the current GC is that getting a size of a  
> memory block
> that consists of hundreds or thousands of pages is that it is a  
> linear search
> for the end of the block.   This is mitigated by the LRU cache, but  
> once you
> exceed the number of cache, performance should probably fall off a  
> cliff
> (however, it doesn't for reasons I don't understand).  But when I  
> extend a
> block, it must do this same linear search to find the page size  
> before doing an
> extend.  I'm trying to do that linear search as little as possible.   
> All this
> could probably be mitigated if the allocated length was somehow  
> cached per
> allocated page.

I discussed about this with winterwar (wm4), I had proposed some  
enhancements, but it seems that most of these did not really improve  
things much.
(I was arguing for a global array with pool sizes, as I thought that  
it would be better from the caching prespective).
About the linear search a simple way would be to add some extra flags,  
like
PAGE_PLUS_0 PAGE_PLUS_1, PAGE_PLUS_2, PAGE_PLUS_3, PAGE_PLUS_4,...
and then use that to know that if you are at PAGE_PLUS_x you have to  
jump back at least 2^x pages.
with very few bits you can drastically reduce the jumps.
The idea is *not* o encode the exact jump, but just the position of  
the highest bit of the jump.
linear search is fast (and well cached), so you really win only if you  
encode large jumps, thus the exponential growth.
The other option is to encode the exact address (wm4 was rooting for  
that), but I feel that the extra memory used defeats the advantages.

> I didn't write newCapacity, so I'm not sure about the theory behind  
> it, or what
> desirable properties for appending are.  But it does make sense to  
> me that if
> you are continuously appending, you want to preallocate more than  
> "just enough
> to append the new data".  How much more, I have no clue :)  I  
> assumed the
> formula in newCapacity was written by someone who knows what they  
> are doing.
> Well, except for using the element size in the percentage formula.

well the initial formula in tango clearly had a flaw (exploding for  
some arguments), I have reduced it to and interpolation between using
	a/100 percentage more space (geometric growth)
and adding something that grows proportionally to
	x/log2(x)
which is the average storage used by each bit of x (log2(x) is the  
number of bits of x).
x/log2(x) goes to inifinity for x=0, and grows less than just x for  
large x.
In the end I choose a functional form with 2 parameters (a,b, well  
minBits if one really wants, but I think it is not so useful).

> [...]
> The other factors are seemingly arbitrary, so I'm not sure how to  
> tune those.

Indeed I did not know a meaningful way to tune those parameters, so I  
just choose values that looked good and reasonable to me, and called  
it a day :).

Fawzi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/d-runtime/attachments/20100714/487ec218/attachment-0001.html>


More information about the D-runtime mailing list