LRU cache for ~=

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Mon Oct 19 12:22:52 PDT 2009


dsimcha wrote:
> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
>> I just wrote this to Sean and Walter and subsequently discussed it with
>> Walter. Walter thinks this should work. Does anyone have the time and
>> inclination to test this out? It would involve hacking into druntime's
>> implementation of ~= (I'm not sure what the function name is). I'd
>> really appreciate this; I'm overloaded as it is.
>> ==================
>> In wake of the recent demise of T[new], I was thinking of finding ways
>> of making ~= efficient for T[].
>> Currently ~= is slow because accessing GC.sizeOf(void*) acquires a
>> global lock and generally must figure out a lot of things about the
>> pointer to make a decision.
>> Also, ~= is dangerous because it allows slices to stomp over other slices.
>> I was thinking of solving these issues by keeping an LRU (Least Recently
>> Used) cache inside the implementation of ~=. The LRU would only have a
>> few entries (4-8) and would store the parameters of the last ~= calls,
>> and their cached capacities.
>> So whenever code calls arr ~= b, the LRU is searched first. If the
>> system finds "arr" (both bounds) in the LRU, that means the cached
>> capacity is correct and can solve the matter without an actual trip to
>> the GC at all! Otherwise, do the deed and cache the new slice and the
>> new capacity.
>> This also solves the lack of safety: if you request a growth on an array
>> you just grew, it's impossible  to have a valid slice beyond that array.
>> This LRU would allow us to keep the slice API as it currently is, and
>> also at excellent efficiency.
>> What do you think?
>> Andrei
> 
> 1.  I assume the LRU cache would be thread-local, so no lock would be necessary?

Yes.

> 2.  I don't understand how this solves the safety problem:

It's rather subtle. I'd figured it out in the morning and forgot it by 
the time I explained to Walter.

> // foo lives on the heap b/c we've idup'd it.
> string foo = "This is only a test.".idup;
> string bar = foo[0..4];
> bar ~= " is _not ";
> writeln(foo);  // prints "This is _not a test."

This is actually one of the less subtle cases. Upon the call to ~=, bar 
is not in the LRU cache so ~= conservatively reallocates it.

As a rule of thumb (which takes care of the subtler issues): if a 
growing slice is not found in the LRU cache, it will always be 
conservatively reallocated. This is exactly because you don't know 
whether that slice was reduced from a larger slice.

> Having access to the capacity in an LRU cache doesn't help if I understand it
> correctly.
> 
> 3.  I'm pretty familiar with these parts of druntime and would probably be willing
> to do the implementation after I understand a few details of it a little better.

That would be awesome!!!


Andrei



More information about the Digitalmars-d mailing list