SIMD/intrinsics questions

Don nospam at nospam.com
Tue Nov 10 01:36:13 PST 2009


Bill Baxter wrote:
> On Mon, Nov 9, 2009 at 1:56 PM, Mike Farnsworth
> <mike.farnsworth at gmail.com> wrote:
>> Walter Bright Wrote:
>>
>>> Michael Farnsworth wrote:
>>>> The ldc guys tell me that they didn't
>>>> include the llvm vector intrinsics already because they were going to
>>>> need either a custom type in the frontend, or else the D2
>>>> fixed-size-arrays-as-value-types functionality.  I might take a stab at
>>>> some of that in ldc in the future to see if I can get it to work, but
>>>> I'm not an expert in compilers by any stretch of the imagination.
>>> I think there's a lot of potential in this. Most languages lack array
>>> operations, forcing the compiler into the bizarre task of trying to
>>> reconstruct high level operations from low level ones to then convert to
>>> array ops.
>> Can you elaborate a bit on what you mean?  If I understand what you're getting at, it's as simple as recognizing array-wise operations (the a[] = b[] * c expressions in D), and decomposing them into SIMD underneath where possible?  It would also be cool if the compiler could catch cases where a struct was essentially a wrapper around one of those arrays, and similarly turn the ops into SIMD ops (so as to allow some operator overloads and extra method wrapping additional intrinsics, for example).
>>
>> There are a lot of cases to recognize, but the compiler could start with the simple ones and then go from there with no need to change the language or declare custom types (minus some alignment to help it along, perhaps).  The nice thing about it is you automatically get a pretty big swath of auto-vectorization by the compiler in the most natural types and operations you'd expect it to show up.
>>
>> Of course, SOA-style SIMD takes more intervention by the programmer, but there is probably no easy way around that, since it's based on a data-layout technique.
> 
> I think what he's saying is use array expressions like a[] = b[] + c[]
> and let the compiler take care of it, instead of trying to write SSE
> yourself.
> 
> I haven't tried, but does this kind of thing turn into SSE and get inlined?
> 
> struct Vec3 {
>      float v[3];
>      void opAddAssign(ref Vec3 o) {
>          this.v[] += o.v[];
>      }
> }
> 
> If so then that's very slick.  Much nicer than having to delve into
> compiler intrinsics.
> 
> But at least on DMD I know it won't actually inline because it doesn't
> inline functions with ref arguments.
> (http://d.puremagic.com/issues/show_bug.cgi?id=2008)
> 
> --bb
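
For reference, here's roughly what that Vec3 looks like in current D2
syntax (just a sketch, untested; opOpAssign replaces the D1 opAddAssign,
and taking the argument by value sidesteps the ref-inlining bug mentioned
above):

struct Vec3
{
    float[3] v;     // D2-style static array declaration

    // D2 operator overloading; a by-value parameter avoids bug 2008
    // (DMD refuses to inline functions with ref parameters).
    void opOpAssign(string op)(Vec3 o) if (op == "+")
    {
        v[] += o.v[];   // element-wise array operation
    }
}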

The bad news: The DMD back-end is a state-of-the-art backend from the 
late 90's. Despite its age, its treatment of integer operations is 
still, in general, quite respectable, but it _never_ generates SSE 
instructions. Ever. Array operations _are_ detected, however, and they 
become calls to library functions which use SSE if available. That's 
not bad for moderately large arrays -- 200 elements or so -- but of 
course it's completely non-optimal for short arrays.
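
To make the lowering concrete: an expression like the one below is
rewritten by the front-end into a call to a library array-op routine,
which takes an SSE path at run time when the CPU supports it (only a
sketch of the mechanism; the helper names are internal and
version-dependent):

void saxpy(float[] dst, float[] x, float[] y, float a)
{
    // The whole right-hand side becomes one library call; the call
    // overhead is negligible for a few hundred elements but dominates
    // for very short arrays.
    dst[] = y[] + x[] * a;
}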

The good news: Now that static arrays are passed by value, introducing 
inline SSE support for short arrays suddenly makes a lot of sense -- 
there can be a big performance benefit for a small backend change; it 
could be done without introducing SSE anywhere else. Most importantly, 
it doesn't require any auto-vectorisation support.
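
As a hypothetical illustration of the kind of code that would benefit:
with static arrays passed by value, a function like this could compile
down to a couple of movups/addps instructions on one XMM register
instead of a call into the array-op runtime.

// A float[4] fits exactly in one XMM register, so inline SSE codegen
// could reduce this to a handful of instructions.
float[4] add4(float[4] a, float[4] b)
{
    float[4] r;
    r[] = a[] + b[];    // element-wise add on a short static array
    return r;
}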
