SIMD/intrinsics questions
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Fri Nov 6 13:10:42 PST 2009
Don wrote:
> Mike Farnsworth wrote:
>> Don Wrote:
>>
>>> Mike Farnsworth wrote:
>>
>>>> In dmd and ldc, is there any support for SSE or other SIMD
>>>> intrinsics? I realize that I could write some asm blocks, but that
>>>> means each operation (vector add, sub, mul, dot product, etc.) would
>>>> need to probably include a prelude and postlude with loads and
>>>> stores. I worry that this will not get optimized away (unless I
>>>> don't use 'naked'?).
>>>>
>>>> In the alternative, is it possible to support something along the
>>>> lines of gcc's vector extensions:
>>>>
>>>> typedef int v4si __attribute__ ((vector_size (16)));
>>>> typedef float v4sf __attribute__ ((vector_size (16)));
>>>>
>>>> where the compiler will automatically generate opAdd, etc. for those
>>>> types? I'm not suggesting using gcc's syntax, of course, but you
>>>> get the idea.. It would provide a very easy way for the compiler to
>>>> prefer to keep 4-float vectors in SSE registers, pass them in
>>>> registers where appropriate in function calls, nuke lots of loads
>>>> and stores when inlining, etc.
>>>>
>>>> Having good, native SIMD support in D seems like a natural fit
>>>> (heck, it's got complex numbers built-in).
>>>>
>>>> Of course, there are some operations that the available SSE
>>>> intrinsics cover that the compiler can't expose via the typical
>>>> operators, so those still need to be supported somehow. Does anyone
>>>> know if ldc or dmd has those, or if they'll optimize away SSE loads
>>>> and stores if I roll my own structs with asm blocks? I saw from the
>>>> ldc source it had the usual llvm intrinsics, but as far as
>>>> hardware-specific codegen intrinsics I couldn't spot any.
>>>>
>>>> Thanks,
>>>> Mike Farnsworth
>>> Hi Mike, Welcome to D!
>>> In the latest compiler release (ie, this morning!), fixed-length
>>> arrays have become value types. This is a big step: it means that
>>> (eg) float[4] can be returned from a function for the first time. On
>>> 32-bit, we're a bit limited in SSE support (eg, since *no* 32-bit AMD
>>> processors have SSE2) -- but this will mean that on 64 bit, we'll be
>>> able to define an ABI in which short static arrays are passed in SSE
>>> registers.
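For concreteness, a minimal sketch of what value-type static arrays allow (the function and names are made up for illustration):

float[4] scale(float[4] v, float s)
{
    float[4] r;
    r[] = v[] * s;  // element-wise multiply; a natural candidate for SSE
    return r;       // float[4] can now be returned by value
}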
>>>
>>> Also, D has array operations. If x, y, and z are int[4], then
>>> x[] = y[]*3 + z[];
>>> corresponds directly to SIMD operations. DMD doesn't do much with
>>> them yet (there have been so many language design issues that
>>> optimisation hasn't received much attention), but the language has
>>> definitely been planned with SIMD in mind.
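A minimal, self-contained sketch of that array-operation syntax (the values are made up for illustration):

import std.stdio;

void main()
{
    int[4] x;
    int[4] y = [1, 2, 3, 4];
    int[4] z = [10, 20, 30, 40];
    x[] = y[] * 3 + z[];   // element-wise: x becomes [13, 26, 39, 52]
    writeln(x);
}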
>>
>>
>> Awesome, does this also apply to dynamic arrays? And how far does
>> that go? E.g. if I were to do something odd like:
>>
>> x[] = ((y[] % 5) ^ 2) + z[];
>
> Yes, that works, and it applies to dynamic arrays too. A key idea behind
> this is that since modern machines support SIMD, it's quite ridiculous
> for a high-level language not to be able to express it.
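The same slice syntax carries over to dynamic arrays; a small sketch (sizes and values are made up, and note that ^ is bitwise XOR in D):

void main()
{
    auto x = new int[8];
    auto y = new int[8];
    auto z = new int[8];
    y[] = 6;
    z[] = 7;
    x[] = ((y[] % 5) ^ 2) + z[];   // element-wise; operand lengths must match
}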
Mike, for more info on the supported operations you may want to refer to
the Thermopylae excerpt:
http://erdani.com/d/thermopylae.pdf
Andrei