SIMD ideas for Rust
bearophile
bearophileHUGS at lycos.com
Fri Jul 19 02:33:31 PDT 2013
Manu:
> Interesting. Almost all his points are what we do already in D.
> Always nice to see others come to the same conclusions :)
While trying to write a multiplication of two complex numbers
using SSE3 with LDC2, I have found seven or more bugs, which
I will discuss elsewhere. Regarding the syntax, though: in nice code
like this, D requires you to add ".array" before all those subscripts
(code adapted from Fog):
double2 complexMult(in double2 a, in double2 b) pure nothrow {
    double2 b_flip = [b.array[1], b.array[0]];
    double2 a_im = [a.array[1], a.array[1]];
    double2 a_re = [a.array[0], a.array[0]];
    double2 aib = a_im * b_flip;
    double2 arb = a_re * b;
    return [arb.array[0] - aib.array[0], arb.array[1] + aib.array[1]];
}
A line like this:
double2 b_flip = [b.array[1], b.array[0]];
becomes something like:
pshufd $238, %xmm1, %xmm3
Similarly, all the other lines become single instructions (except
the last one, because LDC2 fails to use addsubpd).
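For reference, the lane arithmetic in complexMult can be checked against the usual complex product (a0 + a1*i)(b0 + b1*i) = (a0*b0 - a1*b1) + (a0*b1 + a1*b0)*i. The following is only a scalar sketch of the same steps: it uses a plain double[2] in place of double2, and the names Pair and complexMultRef are hypothetical, chosen for this illustration.

```d
// Scalar stand-in for double2 (assumption for illustration):
// lane 0 holds the real part, lane 1 the imaginary part.
alias Pair = double[2];

// Mirrors complexMult step by step, with each vector multiply
// written out lane by lane.
Pair complexMultRef(in Pair a, in Pair b) pure nothrow {
    Pair bFlip = [b[1], b[0]];                          // swap lanes of b
    Pair aIm = [a[1], a[1]];                            // broadcast imaginary part of a
    Pair aRe = [a[0], a[0]];                            // broadcast real part of a
    Pair aib = [aIm[0] * bFlip[0], aIm[1] * bFlip[1]];  // a_im * b_flip
    Pair arb = [aRe[0] * b[0], aRe[1] * b[1]];          // a_re * b
    // Subtract in lane 0, add in lane 1 (the pattern addsubpd implements).
    return [arb[0] - aib[0], arb[1] + aib[1]];
}

unittest {
    // (1 + 2i) * (3 + 4i) = -5 + 10i
    Pair a = [1.0, 2.0];
    Pair b = [3.0, 4.0];
    Pair r = complexMultRef(a, b);
    assert(r[0] == -5.0);
    assert(r[1] == 10.0);
}
```

The unittest block runs when the file is compiled with -unittest (plus -main, since the sketch defines no main function).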
I vaguely remember you saying that slow SIMD operations shouldn't
have too short a syntax, to avoid giving an illusion of
efficiency. But given that the CPU "often" executes such array
subscripting and shuffling efficiently, wouldn't it be nicer, and
good enough, to support a simpler syntax like this in D?
double2 complexMult(in double2 a, in double2 b) pure nothrow {
    double2 b_flip = [b[1], b[0]];
    double2 a_im = [a[1], a[1]];
    double2 a_re = [a[0], a[0]];
    double2 aib = a_im * b_flip;
    double2 arb = a_re * b;
    return [arb[0] - aib[0], arb[1] + aib[1]];
}
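Until something like that is supported directly, a similar subscript syntax can probably be approximated in user code. The following is a minimal sketch, assuming a target where core.simd's double2 is available (for example x86-64); the wrapper name Vec2 is hypothetical and not part of core.simd or druntime.

```d
import core.simd;

// Hypothetical wrapper type, for illustration only.
struct Vec2 {
    double2 v;
    alias v this;   // lets a Vec2 be used where a double2 is expected

    // Forward v[i] to the lane array that ".array" exposes.
    double opIndex(size_t i) {
        return v.array[i];
    }
}

unittest {
    double2 raw = [1.0, 2.0];
    auto a = Vec2(raw);
    assert(a[0] == 1.0);
    assert(a[1] == 2.0);
}
```

This only hides the ".array" spelling behind opIndex. Whether a given subscript compiles to a single shuffle still depends on the compiler, so it does not answer the concern about an illusion of efficiency.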
Bye,
bearophile