SIMD ideas for Rust
bearophile
bearophileHUGS at lycos.com
Fri Jul 19 10:43:24 PDT 2013
Manu:
> What you're really doing is casting a bunch of vector components to floats,
> and then rebuilding a vector, and LLVM can helpfully deal with that.
>
> I would suggest calling a spade a spade and using a swizzle function to
> perform a swizzle, instead of code like what you wrote.
> Wouldn't this be better:
>
> double2 complexMult(in double2 a, in double2 b) pure nothrow {
>   double2 b_flip = b.yx; // or b.swizzle!"yx", if we don't want to
>                          // include an opDispatch in the basic type
>   double2 a_im = a.yy;
>   double2 a_re = a.xx;
>   double2 aib = a_im * b_flip;
>   double2 arb = a_re * b;
I see, and you are right.
(If I wrap the basic type in a struct that holds a double2 and
uses alias this to forward to it, the generated code becomes
awful.)
A YMM register already holds 8 floats, and SIMD registers will
probably keep growing, maybe up to 1024 bits. So swizzle item
names like x y z w will not suffice, and some more general
naming scheme is needed.
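One possible shape for such a general naming scheme is to address lanes by index instead of by letter. The following is only a sketch of the semantics: `swizzle` here is a hypothetical helper (not something core.simd provides), it works on plain static arrays rather than on real vector types, and it says nothing about the instructions a compiler would emit.

```d
// Hypothetical helper, not part of core.simd: swizzle by lane index.
// It works on plain static arrays, so it only pins down the semantics.
template swizzle(idx...)
{
    T[idx.length] swizzle(T, size_t N)(T[N] v)
    {
        T[idx.length] r;
        // foreach over a compile-time sequence is unrolled,
        // so i is a compile-time constant in each iteration.
        foreach (i, _; idx)
        {
            static assert(idx[i] < N, "lane index out of range");
            r[i] = v[idx[i]];
        }
        return r;
    }
}

void main()
{
    // Eight lanes, as in a YMM register holding floats.
    float[8] ymm = [0, 1, 2, 3, 4, 5, 6, 7];
    float[8] expected = [7, 6, 5, 4, 3, 2, 1, 0];
    // Reverse all lanes; x/y/z/w could not name these.
    float[8] rev = swizzle!(7, 6, 5, 4, 3, 2, 1, 0)(ymm);
    assert(rev == expected);

    // Same operation as b.yx in the quoted code.
    double[2] b = [3.0, 4.0];
    double[2] flipped = [4.0, 3.0];
    assert(swizzle!(1, 0)(b) == flipped);
}
```

Because the indices are template arguments, an out-of-range lane is rejected at compile time, and the result length follows the number of indices given.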
>   // return [arb[0] - aib[0], arb[1] + aib[1]]; // this final line is
>   // tricky... it's not very portable.
>
>   // Maybe:
>   return select([-1, 0], arb-aib, arb+aib);
>   // Hopefully the x86 optimiser will generate the proper opcode. Or a
>   // bunch of other options; a multi-vector shuffle, shift, swizzle,
>   // interleave.
> }
>
> I think that would be better. More portable, and it eliminates the code
> that implies a vector->float->vector cast sequence, which I maintain,
> should be syntactically discouraged at all costs.
> You don't want to be giving people bad ideas that it's reasonable code to
> write ;)
My experience in writing this kind of code is limited. I will try
your select to see what kind of code LDC2-LLVM generates.
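For reference, here is a scalar sketch of what the select-based version is meant to compute. It assumes that a mask lane of -1 picks the lane from the first operand and a mask lane of 0 picks it from the second. The `select` below is a hypothetical stand-in written over plain static arrays, not a real intrinsic, so it shows the intended result and not the generated code.

```d
// Hypothetical scalar stand-in for a lane-wise select: a nonzero mask
// lane (e.g. -1, all bits set) takes the lane from a, a zero lane from b.
double[2] select(long[2] mask, double[2] a, double[2] b)
{
    double[2] r;
    foreach (i; 0 .. 2)
        r[i] = mask[i] != 0 ? a[i] : b[i];
    return r;
}

// Complex multiply with lane 0 = real part, lane 1 = imaginary part,
// following the structure of the quoted code.
double[2] complexMult(double[2] a, double[2] b)
{
    double[2] bFlip = [b[1], b[0]]; // b.yx
    double[2] aIm = [a[1], a[1]];   // a.yy
    double[2] aRe = [a[0], a[0]];   // a.xx

    double[2] aib, arb, diff, sum;
    aib[] = aIm[] * bFlip[];        // [ai*bi, ai*br]
    arb[] = aRe[] * b[];            // [ar*br, ar*bi]
    diff[] = arb[] - aib[];
    sum[] = arb[] + aib[];

    // Lane 0 takes the difference, lane 1 takes the sum.
    long[2] mask = [-1, 0];
    return select(mask, diff, sum);
}

void main()
{
    double[2] a = [1.0, 2.0]; // 1 + 2i
    double[2] b = [3.0, 4.0]; // 3 + 4i
    double[2] expected = [-5.0, 10.0]; // (1 + 2i)(3 + 4i) = -5 + 10i
    assert(complexMult(a, b) == expected);
}
```

Both the sum and the difference are computed for every lane and one of them is discarded per lane, which is the trade-off of the select approach compared with a dedicated add/subtract instruction.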
Bye,
bearophile
More information about the Digitalmars-d mailing list