LRUCache - a simple least recently used cache

Sun Nov 4 15:22:07 PST 2007

Charles D Hixson wrote:
> renoX wrote:
>> Charles D Hixson a écrit :
>>> renoX wrote:
>>>> Charles D Hixson a écrit :
>>>>> In the hopes that someone find this useful:
>>>>> http://www.prowiki.org/wiki4d/wiki.cgi?CharlesHixson
>>>>>
>>>>> This particular version presumes D2.x code, but I think the 
>>>>> adjustments are trivial for D1.x.
>>>>
>>>> I think that there is a flaw in your code: the access counter is a 
>>>> ushort (16bit value) and is incremented each time you access an 
>>>> element in the cache: it can overflow quickly and will wrap around..
>>>>
>>>> This means that recently used elements could have a _nodeId smaller 
>>>> than nodes used less recently, which means that the overflow 
>>>> handling code will drop the wrong elements..
>>>>
>>>> This may not be the case in your usage, but this is either a bug or 
>>>> at least a pretty big restriction in the usage which should be 
>>>> clearly indicated.
>>>>
>>>>
>>>> Regards,
>>>> renoX
>>>
>>> Well, two things:
>>> 1)  This code isn't really adapted for large sets of data.  If you 
>>> want that, you'd better hand optimize it for your intended application.
>>
>> I'm not talking about a large set of data, just about a program which 
>> runs for a long time and/or which make heavy usage of the data in the 
>> cache: each call to 'contains' or 'opIndex' increase maxNode so it 
>> could overflow quite quickly (an ushort isn't that big).
>>
>>> Any suggestions for a quick and simple way to do this?
>>
>> Just make the access counter 32bit, it should be enough for a 'normal' 
>> program (not an Apache style program though).
>>
>>> Would it be reasonable to arbitrarily renumber everything in the 
>>> cache serially (i.e., without considering the current nodeIds) 
>>> whenever the maximum nodeId started to approach ushort.max?  (That's 
>>> the direction that I'm leaning in...but I haven't decided whether or 
>>> not it's reasonable.)
>>
>> I'm not an expert in D's but I think that an array of
>> struct Node{ Key key; Body bdy; ushort _nodeId; } or an array of
>> struct Node{ Key key; Body bdy; uint   _nodeId; } have exactly the 
>> same size due to the padding..
>>
>> And maybe on x86-64, you could use ulong without increasing the size 
>> of the array..
>>
>> Could someone confirm/infirm?
>> For me, key and bdy are reference, so they have the same size as 
>> pointers, am I right?
>>
>> Regards,
>> renoX
>>
>>
> Well, they aren't necessarily pointers/references.  Look at the test 
> examples.  OTOH, they easily COULD be.
> 
> I've done a slight redesign (not yet posted) that should answer your 
> objections (I decided to renumber when necessary, but I made NodeId a 
> type, so it's easy for anyone to change if they want to).
> 
> OTOH, Body MIGHT be rather large, so I feel that I need to split things 
> up so that foreach won't necessarily copy the entire structure when only 
> an access to the nodeId, e.g., is needed.
OK.
Version 0.2 has now been posted.  I switched from using 
structures to using arrays.  It's now on a page linked off my 
main page, because I wasn't able to get changes to the main 
page to stick.
http://www.prowiki.org/wiki4d/wiki.cgi?LruCache

I still haven't tested the renumbering...that will have to 
wait for something that uses it for an extended period of 
time, but the code is in there, if you need it.  (It's also 
easy to change the NodeId type from ushort to ulong if that's 
your preference.)