SIMD support...

Thu Jan 12 12:13:32 PST 2012

On 06.01.2012 02:42, Manu wrote:
> I like v128, or something like that. I'll use that for the sake of this
> document. I think it is preferable to float4 for a few reasons...

I do not agree at all. That way, the type loses all semantic
information. This not only breaks with the C/C++/D philosophy but
actually *hides* an essential hardware detail of Intel SSE:

An SSE register is 128 bits wide, but the processor actually cares about
the semantics of the content:

There are different instructions for loading two doubles, four singles or
integers into a register. They all load the same 128 bits from memory into
the same register. However, the specs warn about a performance penalty
when loading a register as one type and then using it as another. I do
not know the internals of the processor, but my understanding is that 
the CPU splits the floats into mantissa, exponent and sign already at 
the moment of loading and has to drop that information when you 
reinterpret the bit pattern stored in the register.

A type v128 would not provide the necessary information for the compiler
to pick the correct mov instruction (e.g. movaps, movapd or movdqa).
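
To make the point concrete, here is a minimal sketch in C using the SSE2
intrinsics from <emmintrin.h> (x86 only). The struct and function names
are made up for illustration. The instruction names in the comments are
what the intrinsics are documented to map to; a compiler is free to choose
differently.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; x86/x86-64 only */
#include <string.h>

/* Hypothetical helper type: one register view per element type. */
typedef struct {
    __m128  f4;  /* four singles */
    __m128d d2;  /* two doubles  */
    __m128i i4;  /* integers     */
} three_views;

/* Load the same 16 bytes through three differently typed intrinsics. */
static three_views load_three_ways(const void *p)
{
    three_views v;
    v.f4 = _mm_loadu_ps((const float *)p);       /* documented as movups */
    v.d2 = _mm_loadu_pd((const double *)p);      /* documented as movupd */
    v.i4 = _mm_loadu_si128((const __m128i *)p);  /* documented as movdqu */
    return v;
}

/* Returns 1 if all three registers hold bit-identical contents. */
static int same_bits(const three_views *v)
{
    return memcmp(&v->f4, &v->d2, 16) == 0 &&
           memcmp(&v->f4, &v->i4, 16) == 0;
}

/* Reinterpreting is an explicit, typed step; it moves no data. */
static __m128i as_integers(__m128 x)
{
    return _mm_castps_si128(x);
}
```

The bits are the same in all three cases; only the static type tells the
compiler which load to emit and makes a reinterpretation visible in the
source. An untyped v128 would carry none of that.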

There definitely must be a float4 and a double2 type to express these
semantics. For integers, I am not quite sure. I believe that integer SSE
instructions can be mixed more freely, so a single 128-bit type would be
sufficient.

Considering these hardware details of the SSE architecture alone, I fear 
that portable low-level support for SIMD is very hard to achieve. If you 
want to offer access to the raw power of each architecture, it might be 
simpler to have machine-specific language extensions for SIMD and leave
portability to a wrapper library with a common front-end and
various back-ends for the different architectures.
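
A rough sketch of what such a wrapper could look like, again in C. The
vec4f type and the vec4f_* functions are hypothetical names, not an
existing library. The front-end is the same everywhere; the back-end is
selected at compile time, with a plain scalar fallback where SSE is not
available.

```c
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE back-end */

typedef struct { __m128 v; } vec4f;

static vec4f vec4f_load(const float *p)
{
    vec4f r;
    r.v = _mm_loadu_ps(p);  /* unaligned load of four floats */
    return r;
}

static vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    r.v = _mm_add_ps(a.v, b.v);  /* four additions at once */
    return r;
}

static void vec4f_store(float *p, vec4f a)
{
    _mm_storeu_ps(p, a.v);
}

#else  /* scalar back-end for every other target */

typedef struct { float v[4]; } vec4f;

static vec4f vec4f_load(const float *p)
{
    vec4f r;
    for (int i = 0; i < 4; i++) r.v[i] = p[i];
    return r;
}

static vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    for (int i = 0; i < 4; i++) r.v[i] = a.v[i] + b.v[i];
    return r;
}

static void vec4f_store(float *p, vec4f a)
{
    for (int i = 0; i < 4; i++) p[i] = a.v[i];
}

#endif
```

User code only ever sees vec4f and the three functions, so further
back-ends (NEON, AltiVec, ...) could be added behind the same names
without touching callers.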