LRU cache for ~=

Tue Oct 20 06:58:01 PDT 2009

On Mon, 19 Oct 2009 14:51:32 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> I just wrote this to Sean and Walter and subsequently discussed it with  
> Walter. Walter thinks this should work. Does anyone have the time and  
> inclination to test this out? It would involve hacking into druntime's  
> implementation of ~= (I'm not sure what the function name is). I'd  
> really appreciate this; I'm overloaded as it is.
>
> ==================
>
> In wake of the recent demise of T[new], I was thinking of finding ways  
> of making ~= efficient for T[].
>
> Currently ~= is slow because accessing GC.sizeOf(void*) acquires a  
> global lock and generally must figure out a lot of things about the  
> pointer to make a decision.
>
> Also, ~= is dangerous because it allows slices to stomp over other  
> slices.
>
> I was thinking of solving these issues by keeping an LRU (Least Recently  
> Used) cache inside the implementation of ~=. The LRU would only have a  
> few entries (4-8) and would store the parameters of the last ~= calls,  
> and their cached capacities.
>
> So whenever code calls arr ~= b, the LRU is searched first. If the  
> system finds "arr" (both bounds) in the LRU, that means the cached  
> capacity is correct and can solve the matter without an actual trip to  
> the GC at all! Otherwise, do the deed and cache the new slice and the  
> new capacity.
>
> This also solves the lack of safety: if you request a growth on an array  
> you just grew, it's impossible  to have a valid slice beyond that array.
>
> This LRU would allow us to keep the slice API as it currently is, and  
> also at excellent efficiency.
>
> What do you think?
>

This is a very good idea.  Incidentally, you only need the upper bound  
location, the beginning location is irrelevant, since you don't grow  
down.  What do you do in the case where the memory was recycled?  Does a  
GC collection cycle clean out the cache as well?

This is better than my two previous ideas.  The only drawback I see is if  
you have many threads doing appending, or you are appending more than 8  
arrays at once in a round-robin fashion, you would lose all the benefit  
(although it shouldn't affect correctness).  At that point however, you'd  
have to ask yourself why you aren't using a specialized appender type or  
function.

In response to other's queries about how many LRUs to use, you'd probably  
want one per heap, and you'd want to lock/not lock based on whether the  
heap is thread local or not.

-Steve