SIMD benchmark

Timon Gehr timon.gehr at gmx.ch
Mon Jan 16 09:01:36 PST 2012


On 01/16/2012 05:59 PM, Manu wrote:
> On 16 January 2012 18:48, Andrei Alexandrescu
> <SeeWebsiteForEmail at erdani.org <mailto:SeeWebsiteForEmail at erdani.org>>
> wrote:
>
>     On 1/16/12 10:46 AM, Manu wrote:
>
>         A function using float arrays and a function using hardware vectors
>         should certainly not be the same speed.
>
>
>     My point was that the version using float arrays should
>     opportunistically use hardware ops whenever possible.
>
>
> I think this is a mistake, because such a piece of code never exists
> outside of some context. If the context it exists within is all FPU code
> (and it is, it's a float array), then swapping between FPU and SIMD
> execution units will probably result in the function being slower than
> the original (also, the float array is unaligned). The SIMD version,
> however, must exist within a SIMD context; since the API can't implicitly
> interact with floats, this guarantees that the context of each function
> matches the one it lives within.
> This is fundamental to fast vector performance. Using SIMD is an all or
> nothing decision, you can't just mix it in here and there.
> You don't go casting back and forth between floats and ints on every
> other line... obviously it's imprecise, but it's also a major
> performance hazard. There is no difference here, except the performance
> hazard is much worse.

I think DMD now uses XMM registers for scalar floating-point arithmetic 
on x86_64.


More information about the Digitalmars-d mailing list