[OT] Use case for a 4-D matrix

Tomek Sowiński just at ask.me
Wed Sep 8 12:13:02 PDT 2010


On 08-09-2010 at 15:58:31, BCS <none at anon.com> wrote:

>> Can't you compute the Kronecker product lazily? E.g. a proxy object
>> that computes a value in an overloaded opIndex. Even if your
>> algorithms inspect (compute) the same value several times, you may
>> still win -- the bottleneck these days is memory access, not CPU
>> cycles.
>>
>
> If enough elements from the 4d matrix are accessed, in the wrong order,  
> then the cache effects of doing it lazily might kill it. I'd guess that  
> highly optimized code for doing the pre-compute version exists already.

Hm... I'm not sure what you mean by 'cache effects'. He was talking about
working with a matrix of 200^4 doubles, which is the Kronecker product of
two 200x200 matrices. Now, if my maths is right, the lazy version needs
(2*200^2) * 8 = 640,000 bytes of memory. So the whole thing fits
comfortably into the on-die cache, and large chunks can be loaded into
the faster per-core caches.
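
To make the lazy idea concrete, here is a minimal sketch, written in C++ rather than D, so operator() plays the role of opIndex. The Matrix and LazyKronecker types are hypothetical names invented for this illustration, not taken from any library or from the original poster's code. The proxy holds only references to the two operands and computes each element on demand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense row-major matrix; a hypothetical helper for this sketch.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Lazy view of the Kronecker product of a and b.
// It stores only two references, so the memory footprint is that of a and b.
struct LazyKronecker {
    const Matrix& a;
    const Matrix& b;

    std::size_t rows() const { return a.rows * b.rows; }
    std::size_t cols() const { return a.cols * b.cols; }

    // With b of size p x q, element (i, j) of the product is
    // a(i / p, j / q) * b(i % p, j % q), computed on every access.
    double operator()(std::size_t i, std::size_t j) const {
        assert(i < rows() && j < cols());
        return a(i / b.rows, j / b.cols) * b(i % b.rows, j % b.cols);
    }
};
```

With two 200x200 operands, LazyKronecker k{a, b} exposes a 40000x40000 view while touching only the 640,000 bytes of a and b. The price is one multiplication and a few integer divisions per access.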

I'd say that if cache effects can kill anything, it's accessing elements
of the precomputed result, which is 200^4 * 8 = 12,800,000,000 bytes big
(about 12.8 GB).
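
The two sizes can be checked at compile time with a short C++ sketch. The function names are made up for this illustration, and the 8 bytes per double is the same assumption used in the figures above. Note that the precomputed size does not fit in a 32-bit integer, hence the 64-bit type.

```cpp
#include <cstdint>

// Assumption: 8 bytes per double, as in the figures above.
constexpr std::uint64_t kBytesPerDouble = 8;

// Memory held by the lazy version: the two n x n operands.
constexpr std::uint64_t lazyBytes(std::uint64_t n) {
    return 2 * n * n * kBytesPerDouble;
}

// Memory held by the precomputed product: all n^4 elements.
constexpr std::uint64_t denseBytes(std::uint64_t n) {
    return n * n * n * n * kBytesPerDouble;
}

static_assert(lazyBytes(200) == 640000ULL, "lazy operands: 640,000 bytes");
static_assert(denseBytes(200) == 12800000000ULL,
              "precomputed product: 12,800,000,000 bytes");
```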


Tomek


More information about the Digitalmars-d mailing list