SIMD support...

Fri Jan 6 11:21:57 PST 2012

On 1/6/2012 5:44 AM, Manu wrote:
> The type safety you're imagining here might actually be annoying when working
> with the raw type and opcodes..
> Consider this common situation and the code that will be built around it:
> __v128 vec = { floatX, floatY, floatZ, unsigned int packedColour }; // pack some
> other useful data in W
> If vec were strongly typed, I would now need to start casting all over the place
> to use various float and uint opcodes on this value?
> I think it's correct when using SIMD at the raw level to express the type as it
> is, typeless... SIMD regs are in fact typeless regs, they only gain a concept of
> type the moment you perform an opcode on it, and only for the duration of that
> opcode.
>
> You will get your strong type safety when you make use of the float4 types which
> will be created in the libs.

Consider an analogy with the EAX register. It's untyped. But we don't find it
convenient to leave it untyped in a high-level language; we paint the fiction of
a type onto it, and that works very well.

To me, the advantages of making the SIMD types typed are:

1. The language does type checking. For example, trying to add a vector of 4
floats to a vector of 16 bytes would be (and should be) an error.

2. Some of the SIMD operations do map nicely onto the operators, so one could write:

    a = b + c + -d;

and the correct SIMD opcodes will be generated based on the types. I think that
would be one hell of a lot nicer than using function syntax. Of course, this
will only work for those SIMD ops that do map onto operators; for the rest
you're stuck with the functions.

3. A lot of the SIMD opcodes have 10 variants, one for each of the 10 types. The 
user would only need to remember the operation, not the variants, and let the 
usual overloading rules apply.
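
To make points 1 to 3 concrete, here is a rough sketch in portable C++. It is
only an illustration of the idea, not the proposed language feature: the names
Float4 and Byte16 are hypothetical, and the loops stand in for the single packed
opcode a compiler with built-in SIMD types would emit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical typed view of a 128-bit register: 4 floats.
struct Float4 {
    std::array<float, 4> v;
};

// Hypothetical typed view of a 128-bit register: 16 bytes.
struct Byte16 {
    std::array<std::uint8_t, 16> v;
};

// Element-wise float add (stand-in for one packed-float add opcode).
inline Float4 operator+(const Float4& a, const Float4& b) {
    Float4 r{};
    for (std::size_t i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

// Element-wise float negation.
inline Float4 operator-(const Float4& a) {
    Float4 r{};
    for (std::size_t i = 0; i < 4; ++i) r.v[i] = -a.v[i];
    return r;
}

// Element-wise byte add with wraparound (stand-in for a packed-byte add).
// Same operator, different variant: overloading picks it from the types.
inline Byte16 operator+(const Byte16& a, const Byte16& b) {
    Byte16 r{};
    for (std::size_t i = 0; i < 16; ++i)
        r.v[i] = static_cast<std::uint8_t>(a.v[i] + b.v[i]);
    return r;
}

// There is deliberately no operator+(Float4, Byte16), so mixing the two
// types in one addition is rejected at compile time.
```

With these definitions, a = b + c + -d; compiles for three Float4 operands and
selects the float variants, while adding a Float4 to a Byte16 does not compile.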

And, of course, casting would be allowed and would be zero cost.
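
As a sketch of what such a cast means, the C++ below reinterprets the same 128
bits under a different type without converting any values. Float4, UInt4,
as_uint4 and as_float4 are hypothetical names, and the code assumes 32-bit
IEEE 754 floats. It uses memcpy only because that is the portable way to spell
a bit reinterpretation in C++; for a built-in SIMD type no instructions would be
needed at all.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical typed views of one 128-bit register.
struct Float4 {
    std::array<float, 4> v;
};

struct UInt4 {
    std::array<std::uint32_t, 4> v;
};

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");
static_assert(sizeof(Float4) == sizeof(UInt4),
              "both views must cover the same 128 bits");

// Reinterpret the bits of a Float4 as four unsigned ints (no value conversion).
inline UInt4 as_uint4(const Float4& f) {
    UInt4 u{};
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Reinterpret the bits of a UInt4 as four floats (no value conversion).
inline Float4 as_float4(const UInt4& u) {
    Float4 f{};
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

This covers the case Manu describes: keep x, y, z as floats, cast to the
unsigned view to store a packed colour in W, then cast back for the float ops.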

I've been thinking about this a lot since last night. Since the back end already
supports XMM registers, most of the hard work is done, and I think doing it this
way would fit in well. (At least for 64-bit code, where the alignment issue is
solved, but that's an orthogonal issue.)