LRU cache for ~=
Steven Schveighoffer
schveiguy at yahoo.com
Tue Oct 20 07:05:42 PDT 2009
On Mon, 19 Oct 2009 22:37:26 -0400, dsimcha <dsimcha at yahoo.com> wrote:
> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s
> article
>> dsimcha wrote:
>> > == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s
>> article
>> >> dsimcha wrote:
>> >>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s
>> article
>> >>>> dsimcha wrote:
>> >>>>> Started playing w/ the implementation a little and I see a
>> problem. What
> about
>> >>>>> the garbage collector? There are two possibilities:
>> >>>> [snip]
>> >>>>> The only possible solutions I see would be to have the GC know
>> everything
> about
>> >>>>> the LRU cache and evict stale entries (probably slows down GC a
>> lot, a
> huge PITA
>> >>>>> to implement, couples things that shouldn't be tightly coupled),
>> or clear the
>> >>>>> cache every time GC is run (probably would make appending so slow
>> as to
>> > defeat the
>> >>>>> purpose of having the cache).
>> >>>> I think GC.collect may simply evict the entire cache. The
>> collection
>> >>>> cycle costs so much, the marginal cost of losing cached
>> information is
>> >>>> lost in the noise.
>> >>>> Andrei
>> >>> But then you have to copy the whole array again, likely triggering
>> another GC if
>> >>> the array is large. Then things really get ugly as, for all
>> practical purposes,
>> >>> you've completely done away with the cache.
>> >> This happens whether or not a cache is in use.
>> >> Andrei
>> >
>> > But the array isn't guaranteed to get reallocated immediately after
>> *every* GC
>> > run. If you're appending to a huge array, the GC will likely run
>> several times
>> > while you're appending, leading to several unnecessary reallocations.
>> I don't think I understand this.
>> 1. Request for an append comes that runs out of memory
>> 2. GC runs and clears memory
>> 3. Array is reallocated and the capacity cached.
>> No?
>
> This is entirely correct.
>
>> > Each of
>> > those unnecessary reallocations will increase the memory footprint of
>> your
>> > program, possibly triggering another GC run and wiping out your cache
>> again in
>> > short order, until, for sufficiently large arrays,
>> >
>> > a ~= b;
>> >
>> > is almost equivalent to
>> >
>> > a = a ~ b;
>> I don't understand how the cache makes that all worse.
>> Andrei
>
> The cache doesn't make anything *worse* than with no cache. The only
> point I'm
> trying to make is that, for large arrays, if the GC clears the cache
> every time it
> runs, things would start to get *almost as bad as* having no cache
> because the
> copy operations become expensive and the GC may run frequently.
The cache can't be "cleared" every time, or else you might as well only
keep one LRU entry:
int[] twos, threes;
for(int i = 1; i < 10000; i++)
{
twos ~= i * 2;
threes ~= i * 3;
}
At some point, twos or threes needs an allocation triggering a collection,
and that clears the cache, making the other array need an allocation,
clearing the cache, etc.
I'd think you only want to clear the entries affected by the collection.
-Steve
More information about the Digitalmars-d
mailing list