Good dotProduct

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Jun 29 19:03:20 PDT 2010


bearophile wrote:
> Andrei Alexandrescu:
>> You already have a loop at the end that takes care of the stray 
>> elements. Why not move it to the beginning to take care of the stray 
>> elements _and_ unaligned elements in one shot?
> 
> Unfortunately things aren't that simple: you need both a pre-loop and a post-loop. That asm block can process only aligned values, in groups of 8 floats. So if you have an unaligned array of N items that starts before the first processable block and ends after the last one, you need to process both the head and the tail separately.
> 
> a floats:           ** **** **** **** **** ***
> Loop blocks: xxxxxxxxx xxxxxxxxx xxxxxxxxx xxxxxxxxx
> 16b aligned: ____ ____ ____ ____ ____ ____ ____ ____
> 
> If you still don't understand, I can create an example.
> 
> Bye,
> bearophile

Oh I see. Yah, it looks like both before and after loops are needed. 
Liquid fuel rocket, ion engine, and parachute.
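
For reference, here is a minimal sketch of that head/body/tail structure. It is plain C rather than the D asm block under discussion, and it is not the std.numeric implementation. The name dot_product, the 16-byte boundary test on the first array only, and the 8-element accumulator standing in for the asm block are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not the std.numeric code. */
float dot_product(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;

    /* Head (pre-loop): scalar steps until a + i is 16-byte aligned.
       Only a is checked here; real SIMD code must also handle b,
       for example by using unaligned loads for it. */
    while (i < n && ((uintptr_t)(a + i) % 16) != 0) {
        sum += a[i] * b[i];
        ++i;
    }

    /* Body: whole groups of 8 floats. This plain loop stands in
       for the asm block that needs aligned input. */
    float acc[8] = {0};
    for (; n - i >= 8; i += 8) {
        for (size_t k = 0; k < 8; ++k)
            acc[k] += a[i + k] * b[i + k];
    }
    for (size_t k = 0; k < 8; ++k)
        sum += acc[k];

    /* Tail (post-loop): fewer than 8 floats remain. */
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}
```

Note that the head loop also covers arrays too short to reach an aligned boundary, and the tail loop covers whatever the 8-float blocks leave behind, so neither loop can absorb the other's job.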

Andrei


More information about the Digitalmars-d mailing list