Playing SIMD

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Mon Oct 26 09:09:53 PDT 2015


On 10/26/2015 08:35 AM, Don wrote:
> On Sunday, 25 October 2015 at 19:37:32 UTC, Iakh wrote:
>> Here is my implementation of SIMD find. The function returns the index of
>> a ubyte in a static 16-byte array with unique values.
>
> [snip]
>
> You need to be very careful with doing benchmarks on tiny test cases,
> they can be very misleading.
>
> Be aware that the speed of bsf() and bsr() is very strongly
> processor dependent. On some machines it is utterly pathetic: e.g. on AMD
> K7, BSR is 23 micro-operations, on the original Pentium it was up to 73 (!),
> even on AMD Bobcat it is 11 micro-ops, but on recent Intel it is one
> micro-op. A cost of up to 73 micro-ops can totally screw up your performance
> comparisons.
>
> Just because it is a single machine instruction does not mean it is fast.

One other note: don't compare with binary search, it's not an 
appropriate baseline. Use binary search as a baseline only if you have 
implemented a SIMD-based binary search.

Good baselines: std.algorithm.find, memchr, a naive version with pointers 
(no bounds checking).
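
For concreteness, two of these baselines might look roughly like the sketch below. It is written in C rather than D so that memchr can be called directly. The function names find_memchr and find_naive are made up for illustration and do not come from the code under discussion. Both return the index of the first match, or -1 if the byte is absent.

```c
#include <stddef.h>
#include <string.h>

/* Baseline 1 (hypothetical name): defer to the C library's memchr.
   Returns the index of the first occurrence of needle, or -1. */
ptrdiff_t find_memchr(const unsigned char *haystack, size_t len,
                      unsigned char needle)
{
    const unsigned char *hit = memchr(haystack, needle, len);
    return hit ? (ptrdiff_t)(hit - haystack) : -1;
}

/* Baseline 2 (hypothetical name): naive pointer walk. There is no
   per-element bounds check, only the single end-of-range comparison. */
ptrdiff_t find_naive(const unsigned char *haystack, size_t len,
                     unsigned char needle)
{
    const unsigned char *p = haystack;
    const unsigned char *end = haystack + len;

    for (; p != end; ++p) {
        if (*p == needle)
            return (ptrdiff_t)(p - haystack);
    }
    return -1;
}
```

When timing these against the SIMD version, it is probably worth running all candidates over the same 16-byte inputs and varying the position of the match, since a tiny fixed test case can flatter whichever variant exits earliest.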


Andrei



