A few experiments with partial unrolling

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Sat Dec 25 13:38:35 PST 2010


On 12/25/10 3:07 PM, bearophile wrote:
> Andrei:
>
>> Third, it looks like larger unrolling limits are better - I only got a
>> plateau at 128!
>
> But this is true for a microbenchmark. In a real program the instruction half of the CPU's L1 cache is quite limited, so the more code you have to push through that little cache (code from different functions), the more cache misses you get, and this slows the code down. This is why too much unrolling or too much inlining is bad, and why I have unrolled my sum() only once.

Yah, that's why I think unroll should be a generic function, leaving it 
to the user to choose the parameters. Before that I'd like to generalize 
the function a bit more - right now it can only do reduce-type workloads 
on associative functions.
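
To make the idea concrete, here is a rough sketch of a reduce-type unroll whose factor is chosen by the caller. It is not the unroll implementation discussed in this thread: the name unrolledReduce, its parameter order, and the block-wise grouping are assumptions made for illustration only. Because each block is combined before it is folded into the accumulator, the operations are regrouped, which is only valid for an associative function.

```d
import std.functional : binaryFun;

// Hypothetical sketch; `unrolledReduce` and its signature are invented here.
// `fun` must be associative, because operations are regrouped per block.
T unrolledReduce(alias fun, size_t factor, T)(T seed, const(T)[] data)
    if (factor >= 1)
{
    alias op = binaryFun!fun;
    T acc = seed;
    size_t i = 0;

    // Main loop: consume `factor` consecutive elements per iteration.
    for (; i + factor <= data.length; i += factor)
    {
        T block = data[i];
        // Expanded at compile time, so no inner loop counter remains.
        static foreach (k; 1 .. factor)
            block = op(block, data[i + k]);
        acc = op(acc, block);
    }

    // Tail: the leftover elements, fewer than `factor` of them.
    for (; i < data.length; ++i)
        acc = op(acc, data[i]);

    return acc;
}

unittest
{
    int[] a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    // The unroll factor is a compile-time parameter picked by the user.
    assert(unrolledReduce!("a + b", 4)(0, a) == 55);
    assert(unrolledReduce!("a + b", 128)(0, a) == 55);
}
```

With the factor as a template parameter, a caller worried about instruction-cache pressure can pick a small value and a microbenchmark can pick a large one, without touching the function itself.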

Andrei


