SIMD ideas for Rust

Fri Jul 19 02:33:31 PDT 2013

Manu:

> Interesting. Almost all his points are what we do already in D.
> Always nice to see others come to the same conclusions :)

While trying to write a multiplication of two complex numbers 
using SSE3 with LDC2 I found seven or more bugs, which I will 
discuss elsewhere. Regarding the syntax, though: in nice code 
like this, D requires adding ".array" before every subscript 
(code adapted from Fog):

double2 complexMult(in double2 a, in double2 b) pure nothrow {
     double2 b_flip = [b.array[1], b.array[0]];
     double2 a_im = [a.array[1], a.array[1]];
     double2 a_re = [a.array[0], a.array[0]];
     double2 aib = a_im * b_flip;
     double2 arb = a_re * b;
     return [arb.array[0] - aib.array[0], arb.array[1] + aib.array[1]];
}
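
To make clear what each lane is supposed to hold, here is a minimal 
scalar sketch of the same computation using plain double[2] instead 
of double2. The name complexMultRef is made up for this example, and 
it assumes lane 0 is the real part and lane 1 the imaginary part, as 
the subscripts above suggest. It is only a reference to check the 
SIMD version against, not a replacement for it.

```d
// Scalar reference (hypothetical helper, not part of the original code).
// Assumed layout: index 0 = real part, index 1 = imaginary part.
double[2] complexMultRef(in double[2] a, in double[2] b) pure nothrow {
    // aib corresponds to a_im * b_flip = [a.im * b.im, a.im * b.re]
    immutable double[2] aib = [a[1] * b[1], a[1] * b[0]];
    // arb corresponds to a_re * b = [a.re * b.re, a.re * b.im]
    immutable double[2] arb = [a[0] * b[0], a[0] * b[1]];
    // Subtract in the low lane, add in the high lane.
    return [arb[0] - aib[0], arb[1] + aib[1]];
}

unittest {
    // (1 + 2i) * (3 + 4i) = (3 - 8) + (4 + 6)i = -5 + 10i
    immutable double[2] r = complexMultRef([1.0, 2.0], [3.0, 4.0]);
    assert(r[0] == -5.0 && r[1] == 10.0);
}
```

The final line is the subtract-low/add-high pattern that a single 
addsubpd performs on two packed doubles.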

A line like this:

double2 b_flip = [b.array[1], b.array[0]];

becomes something like:

pshufd   $238,  %xmm1, %xmm3
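
As a sanity check on such immediates, a pshufd control byte can be 
decoded by hand: each pair of bits, lowest first, selects which 
32-bit source element goes into the corresponding destination slot, 
and a double occupies two adjacent slots. The small helper below is 
only an illustration (pshufdSelectors is a made-up name):

```d
// Decode a pshufd immediate into its four 2-bit element selectors.
// sel[i] is the index of the 32-bit source element copied to slot i.
ubyte[4] pshufdSelectors(ubyte imm) pure nothrow {
    ubyte[4] sel;
    foreach (i; 0 .. 4)
        sel[i] = cast(ubyte)((imm >> (2 * i)) & 3);
    return sel;
}

unittest {
    // 238 = 0xEE: slots take elements 2,3,2,3 (the upper double, twice).
    immutable ubyte[4] hiHi = [2, 3, 2, 3];
    assert(pshufdSelectors(238) == hiHi);
    // 78 = 0x4E: slots take elements 2,3,0,1 (the two doubles swapped).
    immutable ubyte[4] swapped = [2, 3, 0, 1];
    assert(pshufdSelectors(78) == swapped);
}
```

Read this way, $238 duplicates the upper double, which is the shape 
of the a_im line, while a plain swap of the two doubles as in b_flip 
would be $78; so "something like" is the right wording here.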

Similarly, all the other lines become single instructions (except 
the last one, because LDC2 fails to use an addsubpd).

I vaguely remember you saying that slow SIMD operations shouldn't 
have too short a syntax, to avoid giving an illusion of 
efficiency. But given that the CPU "often" executes such array 
subscripting and shuffling efficiently, wouldn't it be nicer, and 
good enough, to support a simpler syntax like this in D?

double2 complexMult(in double2 a, in double2 b) pure nothrow {
     double2 b_flip = [b[1], b[0]];
     double2 a_im = [a[1], a[1]];
     double2 a_re = [a[0], a[0]];
     double2 aib = a_im * b_flip;
     double2 arb = a_re * b;
     return [arb[0] - aib[0], arb[1] + aib[1]];
}

Bye,
bearophile