Lazy caching map()?

H. S. Teoh hsteoh at quickfur.ath.cx
Tue Mar 13 17:24:35 UTC 2018


On Sat, Mar 10, 2018 at 08:54:16PM +0000, aliak via Digitalmars-d wrote:
> On Friday, 9 March 2018 at 19:41:46 UTC, H. S. Teoh wrote:
> > [...]  More precisely, given an array `T[] src` of source data and a
> > function func(T) that's pretty expensive to compute, return an
> > object `result` such that:
> > 
> > - result[i] == func(src[i]), for 0 ≤ i < src.length.
> > 
> > - If result[j] is never actually used, func(src[j]) is never invoked
> >   (lazy).
> > 
> > - If result[j] is referenced multiple times, a cached value is
> >   returned after the first time, i.e., the result of func(src[j]) is
> >   cached, and func(src[j]) is never invoked more than once.
[...]
> > Can this be done using current Phobos primitives?
[...]
> Unless I'm misunderstanding your requirements, are you looking for
> std.functionl.memoize?
> 
> auto fun(int a) {
>     writeln("here");
>     return a + 1;
> }
> 
> void main() {
>     auto a = [1, 2, 3];
>     auto b = a.map!(memoize!fun);
> 
>     writeln(b[1]);
>     writeln(b[1]);
>     writeln(b[1]);
> }
> 
> will print "here" just once.

Very nice. Using memoize did occur to me, but I needed an array
interface to it.  Didn't think of using memoize with map, for some
reason.  Thanks for the idea!

However, looking at the implementation of memoize, it seems that it's
caching the result globally, with very little control over the cache
except the size. I wonder if there's a way to control it better, i.e.,
free the cache once there are no more references to it, and have the
cache size depend on the data size.  Basically, in my use case, once I
map an array it's not predictable in advance how many elements will be
accessed (i.e., it's hard to decide on an optimal cache size for
memoize), but this access will happen pretty soon afterwards, and then
the array will be discarded (i.e., no point holding on old entries in
the cache).  I would prefer that the cache will be cleared once the
mapped array has been GC'd, but memoize() seems to hold on to the cached
results indefinitely.


T

-- 
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen


More information about the Digitalmars-d mailing list