[OT] Use case for a 4-D matrix
Tomek Sowiński
just at ask.me
Wed Sep 8 12:13:02 PDT 2010
On 08-09-2010 at 15:58:31, BCS <none at anon.com> wrote:
>> Can't you compute the Kronecker product lazily? E.g. a proxy object
>> that computes a value in an overloaded opIndex. Even if your
>> algorithms inspect (compute) the same value several times, you may
>> still win -- the bottleneck these days is memory access, not CPU
>> cycles.
>>
>
> If enough elements from the 4d matrix are accessed, in the wrong order,
> then the cache effects of doing it lazily might kill it. I'd guess that
> highly optimized code for doing the pre-compute version exists already.
Hm... I'm not sure what you mean by 'cache effects'. He was talking about
working with a matrix of 200^4 doubles, which is the result of a Kronecker
product of two 200x200 matrices. Now, if my maths are right, the lazy
version only has to store the two factors, so it needs
(2*200^2) * 8 = 640,000 bytes of memory. So the whole thing fits
comfortably into the on-die cache, and large chunks can be loaded into
the faster per-core caches.
I'd say if cache effects can kill anything, it'd be accessing elements
of the precomputed result, which is 200^4 * 8 = 12,800,000,000 bytes
(about 12.8 GB) big.
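For what it's worth, here is a minimal sketch of the lazy proxy in D. The
names LazyKronecker and factorBytes are made up for illustration, and
row-major storage of the two square factors is my assumption, not something
from the original problem:

```d
import std.stdio;

/// Hypothetical lazy view of the Kronecker product of two square matrices.
/// Only the two factors are stored; every element is computed on access.
struct LazyKronecker
{
    const(double)[] a; // n x n factor, row-major (assumed layout)
    const(double)[] b; // m x m factor, row-major (assumed layout)
    size_t n, m;

    /// 4-D view: element (i, j, k, l) is a[i, j] * b[k, l].
    double opIndex(size_t i, size_t j, size_t k, size_t l) const
    {
        return a[i * n + j] * b[k * m + l];
    }

    /// 2-D view: the usual (n*m) x (n*m) Kronecker matrix.
    double opIndex(size_t row, size_t col) const
    {
        return this[row / m, col / m, row % m, col % m];
    }

    /// Bytes occupied by the two stored factors.
    size_t factorBytes() const
    {
        return (a.length + b.length) * double.sizeof;
    }
}

void main()
{
    enum size_t n = 200;
    auto a = new double[n * n];
    auto b = new double[n * n];
    foreach (i, ref x; a) x = i;       // arbitrary test data
    foreach (i, ref x; b) x = 2.0 * i; // arbitrary test data

    auto k = LazyKronecker(a, b, n, n);

    // Memory for the lazy version vs. the precomputed result.
    ulong denseBytes = cast(ulong) n * n * n * n * double.sizeof;
    writeln("lazy:        ", k.factorBytes, " bytes");
    writeln("precomputed: ", denseBytes, " bytes");

    // Both views address the same element: a[1, 2] * b[3, 4].
    assert(k[1, 2, 3, 4] == a[1 * n + 2] * b[3 * n + 4]);
    assert(k[203, 404] == k[1, 2, 3, 4]);
}
```

Each access costs two loads and one multiply, and both loads hit arrays
that together take 640,000 bytes, so the access order of the 4-D indices
matters far less than it would for the precomputed array.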
Tomek