D vs VM-based platforms
Daniel Keep
daniel.keep.lists at gmail.com
Tue May 1 20:03:54 PDT 2007
Benji Smith wrote:
> Daniel Keep wrote:
>> Sorry; yes, you're right: it's for a single dot product.
>>
>> I'm surprised at this because of the sheer number of articles I ran
>> across touting "faster" dot product functions using SSE. I have a
>> feeling these people have never bothered to actually *benchmark* their
>> "faster" functions :P
>>
>> -- Daniel
>
>
> I'm also assuming that's for some low-dimensionality vector? I'd
> likewise guess that there's some sweet spot where dot product
> calculation is faster with SSE, even for a single pair of vectors, if
> the vectors are of sufficient dimensionality.
>
> --benji

3D single-precision. The problem seems to be a combination of unaligned
loads and the trickery you have to resort to in order to sum an XMM
register horizontally. There's a dot product instruction (DPPS) in SSE4.1,
but I don't have a CPU that supports it. :P

It also doesn't help that the compiler will inline the FPU functions
but won't inline the SSE ones.
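
To make the horizontal-sum trickery concrete, here is a minimal C sketch
using SSE1 intrinsics. It is not the code that was benchmarked; the names
dot3_scalar and dot3_sse are hypothetical, and the choice to build the
registers lane by lane (instead of a 16-byte unaligned load, which would
read past the end of a 3-float array) is an assumption made for safety.
On targets without SSE the second function just falls back to the scalar
one.

```c
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Plain scalar version: three multiplies, two adds. */
static inline float dot3_scalar(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

#ifdef HAVE_SSE
/* SSE1 version (hypothetical helper, for illustration only). */
static inline float dot3_sse(const float a[3], const float b[3])
{
    /* _mm_set_ps takes the highest lane first; the unused 4th lane is 0. */
    __m128 va = _mm_set_ps(0.0f, a[2], a[1], a[0]);
    __m128 vb = _mm_set_ps(0.0f, b[2], b[1], b[0]);
    __m128 p  = _mm_mul_ps(va, vb);              /* [x, y, z, 0]        */

    /* Horizontal sum: fold the upper half onto the lower half... */
    __m128 hi = _mm_movehl_ps(p, p);             /* [z, 0, z, 0]        */
    __m128 s  = _mm_add_ps(p, hi);               /* [x+z, y, 2z, 0]     */

    /* ...then add lane 1 into lane 0. */
    __m128 l1 = _mm_shuffle_ps(s, s, _MM_SHUFFLE(1, 1, 1, 1));
    s = _mm_add_ss(s, l1);                       /* lane 0 = x + z + y  */

    return _mm_cvtss_f32(s);
}
#else
/* No SSE available on this target: fall back to the scalar version. */
static inline float dot3_sse(const float a[3], const float b[3])
{
    return dot3_scalar(a, b);
}
#endif
```

The multiply is one instruction, but the sum needs a move, a shuffle and
two adds before the result can be pulled out of the register, which is
roughly as much work as the scalar version does in total. Note also that
the two functions add the terms in a different order, so results can
differ in the last bit for inputs that are not exactly representable.
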
-- Daniel
--
int getRandomNumber()
{
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}
http://xkcd.com/
v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/