64-bit and SSE

Don nospam at nospam.com
Wed Mar 3 06:29:50 PST 2010


dsimcha wrote:
> == Quote from Don (nospam at nospam.com)'s article
>> Of course, in the occasions when SSE lets you do 4 operations at once,
>> you get nearly a 4X speedup...
> 
> Is SSE(2) inherently faster (at least in real-world implementations) than
> x87, even when you don't vectorize? 

No. (Except on Pentium 4, where SSE was basically the only part of the 
CPU that wasn't crippled).

> Would I be able to expect any speedup from
> going from x87 to SSE(2) for code that has a decent amount of implicit instruction
> level parallelism but wasn't explicitly vectorized either by me or the compiler?

I doubt it.  The only time that you get an easy benefit is when you have 
a mix of serial and parallel calculations.

float[4] x, y;

float z = some_calculation;
x[] += z*y[];

If you're using SSE for all your calculations, z will already be in an 
SSE register, so it makes setting up the parallel calculation a bit quicker.
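
To make that concrete, here is a minimal, self-contained sketch of the mixed serial/parallel pattern. The names someCalculation and scaleAdd are made up for illustration, and the arithmetic inside someCalculation is only a placeholder for whatever serial work produces z. Whether the array operation actually compiles down to packed SSE instructions depends on the compiler, target and flags, so treat this as a sketch of the shape of the code, not a statement about the generated instructions.

```d
import std.stdio;

// Hypothetical stand-in for the serial (scalar) part of the work.
float someCalculation(float a, float b)
{
    return a * b + 1.0f;
}

// The parallel part: one scalar z applied across four lanes.
// Static arrays are passed by value, so the caller's x is not modified.
float[4] scaleAdd(float[4] x, const float[4] y, float z)
{
    x[] += z * y[];   // D array operation: x[i] += z * y[i] for each i
    return x;
}

void main()
{
    float[4] x = [1, 2, 3, 4];
    float[4] y = [10, 20, 30, 40];

    float z = someCalculation(0.5f, 2.0f);  // serial step, z == 2
    x = scaleAdd(x, y, z);                  // x is now 21, 42, 63, 84

    writeln(x);
}
```

If z is computed on the x87 stack, it generally has to be moved over to an SSE register before it can be broadcast across the four lanes. If the scalar code is SSE as well, that transfer isn't needed.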

And the compiler might be better at scheduling SSE code than x87 code. But 
that's not really a processor thing.



