[dmd-internals] Shift support for vector types (or: is vector type a first class type?)

Mon Apr 1 23:57:34 PDT 2013

On 4/1/2013 10:13 PM, Kai Nacke wrote:
> On 01.04.2013 04:31, Walter Bright wrote:
>>
>> On 3/31/2013 6:56 PM, Kai Nacke wrote:
>>> Hi!
>>>
>>> I am trying to write a generic vectorized version of SHA1. While doing so, I 
>>> noticed that only some operations are allowed on vector types.
>>>
>>> For the SHA1 algorithm I need to implement a 'rotate left'. I would like to 
>>> write something like this:
>>>
>>>     uint4 w = ...;
>>>     uint4 v = (w << 1) | (w >> 31);
>>>
>>> which is not allowed by DMD.
>>>
>>> Is this by design or simply not implemented because the backend is not 
>>> capable of generating code for it? The TDPL says nothing about vector types. 
>>> My understanding of the language reference on the web 
>>> (http://dlang.org/simd.html) is that the supported operators are CPU 
>>> architecture dependent.
>>>
>>> I would really like to see more support for vector operations in the language, 
>>> e.g. for shifting. What is the view of the language designers? Is the vector 
>>> type a first class type or just an architecture (maybe vendor) dependent 
>>> type with limited usability?
>>>
>>> Because LLVM treats the vector type as a first-class type, supporting more 
>>> operators is easy with LDC. See my pull request for shift operators here: 
>>> https://github.com/ldc-developers/ldc/pull/321
>>
>> The idea is if a vector operation is not supported by the underlying 
>> hardware, then dmd won't allow it. It specifically does not generate 
>> "workaround" code like gcc does. The reason for this is the workaround code 
>> is terribly, terribly slow (because moving data between the ALU and the SIMD 
>> unit is awful), and having the compiler silently insert it leaves the 
>> programmer mystified why he is getting execrable performance.
>
> Shifting a vector left by a single scalar, e.g. v << 2, is then a missing 
> operation. It is supported by the PSLLW/D/Q instructions. Same for shifting 
> right. This is good news for my implementation. :-)

You can file a bugzilla for that one.
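
For illustration only (this is not from the original thread): a minimal sketch, in C rather than D, of the rotate-left from the quoted message, written with SSE2 intrinsics. It assumes an x86 target with SSE2; the function names rotl1_epi32 and rotl1_u32x4 are hypothetical. The shift-by-scalar intrinsics used here correspond to the PSLLD and PSRLD instructions mentioned above, so every step stays in the SIMD unit.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Rotate each of the four 32-bit lanes left by one bit:
   (w << 1) | (w >> 31), lane by lane. */
static __m128i rotl1_epi32(__m128i w)
{
    __m128i hi = _mm_slli_epi32(w, 1);   /* PSLLD: shift each lane left by 1 */
    __m128i lo = _mm_srli_epi32(w, 31);  /* PSRLD: logical shift right by 31 */
    return _mm_or_si128(hi, lo);         /* POR: combine the two halves */
}

/* Hypothetical convenience wrapper: rotate four uint32_t values in place. */
static void rotl1_u32x4(uint32_t v[4])
{
    assert(v != NULL);
    __m128i w = _mm_loadu_si128((const __m128i *)v);  /* unaligned load */
    _mm_storeu_si128((__m128i *)v, rotl1_epi32(w));   /* unaligned store */
}
```

Both shift counts are scalars applied to every lane, which is the v << n form that the hardware supports directly; a per-lane count (v << w) has no SSE2 equivalent.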

>
>> The vector design philosophy in D is if you write SIMD code, and it compiles, 
>> you can be confident it will execute in the SIMD unit of your particular 
>> target processor. You won't have to dump the assembler output to be sure.
>
> Would it be legal for a D compiler to generate "workaround" code?

No.

> Otherwise the language changes depending on the target.

That's correct.

> Consider again the left shift: on an Intel CPU only v << n (v: vector; n: 
> scalar) is valid. In contrast, Altivec allows v << w (v, w: vector). Then the 
> same source may or may not compile depending on the target (with an error 
> message saying 'incompatible types'). As a user of a cross compiler I would be 
> very surprised by this behavior.

The bigger surprise would be the silent and unpredictable execrably bad 
performance. The only reason to write SIMD code is for performance, and the 
compiler ought to give an error when it cannot deliver SIMD performance.

The workaround code can be 100x slower. This is a big deal.

> I really have Linux/PPC64 in mind but do most development on Windows...
> (It feels a bit like ++ is only supported if the underlying hardware has an 
> INC instruction...)

That's a different issue, since the workaround code is just as fast.