Playing SIMD

Don via Digitalmars-d digitalmars-d at puremagic.com
Mon Oct 26 05:35:38 PDT 2015


On Sunday, 25 October 2015 at 19:37:32 UTC, Iakh wrote:
> Here is my implementation of SIMD find. The function returns the 
> index of a ubyte in a static 16-byte array with unique values.

[snip]

You need to be very careful when benchmarking tiny test cases; 
the results can be very misleading.
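
As a rough illustration of a less fragile measurement, here is a minimal 
sketch that times a search over a large set of pseudo-random needles and 
keeps the best of several passes. The snipped SIMD code is not reproduced; 
scalarFind, makeNeedles and timePass are hypothetical names, and the scalar 
loop is only a stand-in for whatever function is being measured. It assumes 
a compiler recent enough to ship std.datetime.stopwatch.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : Random, uniform;
import std.stdio : writefln;

// Hypothetical stand-in for the function under test: a plain scalar search.
// Returns the index of `needle` in `haystack`, or -1 if it is absent.
int scalarFind(ref const ubyte[16] haystack, ubyte needle)
{
    foreach (i, b; haystack)
        if (b == needle)
            return cast(int) i;
    return -1;
}

// Build `count` pseudo-random needles so the branch pattern is not constant.
ubyte[] makeNeedles(size_t count, uint seed)
{
    auto rng = Random(seed);
    auto needles = new ubyte[count];
    foreach (ref n; needles)
        n = cast(ubyte) uniform(0, 32, rng); // values 16 .. 31 are misses
    return needles;
}

// Time one pass over all needles. The checksum is accumulated so the
// calls are harder for the optimizer to discard.
Duration timePass(ref const ubyte[16] haystack, const ubyte[] needles,
                  ref long checksum)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (n; needles)
        checksum += scalarFind(haystack, n);
    sw.stop();
    return sw.peek();
}

void main()
{
    ubyte[16] haystack;
    foreach (i, ref b; haystack)
        b = cast(ubyte) i; // unique values 0 .. 15

    auto needles = makeNeedles(1_000_000, 42);
    long checksum = 0;
    Duration best = Duration.max;
    foreach (run; 0 .. 10)
    {
        auto t = timePass(haystack, needles, checksum);
        if (t < best)
            best = t;
    }
    writefln("best of 10 passes: %s (checksum %s)", best, checksum);
}
```

Taking the minimum over several passes reduces noise from the scheduler and 
cold caches, and the varied needles keep the branch predictor from learning 
a single fixed answer. The numbers still only describe the machine the 
benchmark ran on.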

Be aware that the speed of bsf() and bsr() is very strongly 
processor dependent. On some machines it is utterly pathetic. E.g. 
on AMD K7, BSR is 23 micro-operations; on the original Pentium it was 
up to 73 (!); even on AMD Bobcat it is 11 micro-ops; but on recent 
Intel it is one micro-op. A factor of up to 73 can totally screw up 
your performance comparisons.

Just because it is a single machine instruction does not mean it 
is fast.
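
For reference, here is a minimal sketch of what bsf() and bsr() from 
core.bitop compute, next to portable loop versions that can be timed 
against the intrinsics on a given machine. bsfLoop and bsrLoop are 
hypothetical helper names, not part of druntime, and this is not the 
snipped code from the original post.

```d
import core.bitop : bsf, bsr;

// Portable fallback: index of the lowest set bit, found by shifting right.
// As with bsf, a zero argument has no meaningful result, so it is rejected.
int bsfLoop(uint v)
{
    assert(v != 0, "bsfLoop needs a non-zero argument");
    int i = 0;
    while ((v & 1) == 0)
    {
        v >>= 1;
        ++i;
    }
    return i;
}

// Portable fallback: index of the highest set bit.
int bsrLoop(uint v)
{
    assert(v != 0, "bsrLoop needs a non-zero argument");
    int i = 0;
    while (v > 1)
    {
        v >>= 1;
        ++i;
    }
    return i;
}

void main()
{
    // 0x50 is binary 0101_0000: lowest set bit is 4, highest set bit is 6.
    assert(bsf(0x50u) == 4 && bsr(0x50u) == 6);

    // The fallbacks agree with the intrinsics for every single-bit value.
    foreach (shift; 0 .. 32)
    {
        uint v = 1u << shift;
        assert(bsf(v) == bsfLoop(v));
        assert(bsr(v) == bsrLoop(v));
    }
}
```

Which version wins depends on the CPU and on where the set bit usually 
falls, so it is worth measuring both on the hardware you actually care 
about.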


