Support for gcc vector attributes, SIMD builtins

Iain Buclaw ibuclaw at ubuntu.com
Tue Feb 1 08:48:09 PST 2011


== Quote from Mike Farnsworth (mike.farnsworth at gmail.com)'s article
> I built gdc from tip on Fedora 13 (x86-64) and started playing around
> with creating a vector struct (x,y,z,w) to see what kind of optimization
> the code generator did with it.  It was able to partially drop into SSE
> registers and instructions, but not as well as I had hoped from writing
> "regular" D code.
> I poked through the builtins that get pulled into d-builtins.c /
> d-builtins2.cc but I don't see anything that might be pulling in
> definitions such as __builtin_ia32_* for SSE, for example.
> How hard would it be to get some sort of vector attribute attached to a
> type (or just plain introduce v4sf, __m128, or something like that) and
> get those SIMD builtins available?
> For the curious, here is how they are defined in, for example,
> xmmintrin.h for gcc:
> typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
> typedef float __v4sf __attribute__ ((__vector_size__ (16)));
> extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__,
> __artificial__))
> _mm_add_ps (__m128 __A, __m128 __B)
> {
>   return (__m128) __builtin_ia32_addps ((__v4sf)__A, (__v4sf)__B);
> }

Although GDC exposes GCC builtins and attributes, the support is still very
much incomplete. For example, a D version (for GDC) of the code above would
look something like:


import gcc.builtins;

pragma(set_attribute, __m128, vector_size(16), may_alias);
pragma(set_attribute, __v4sf, vector_size(16));
pragma(set_attribute, _mm_add_ps, always_inline, artificial);

typedef float __m128;
typedef float __v4sf;

__m128 _mm_add_ps (__m128 __A, __m128 __B)
{
    return cast(__m128) __builtin_ia32_addps (cast(__v4sf)__A, cast(__v4sf)__B);
}



However, this doesn't work, because:

1) There is no 128-bit float type in DMDFE (one could be added, even if it is
just for internal use).
2) Vectors are not representable in DMDFE.

So __builtin_ia32_addps (and many other ia32 builtins) cannot be exposed to the
D environment.


Interestingly enough, this particular example actually ICEs the compiler. It
appears that although the code casts *explicitly*, DMDFE actually *ignores*
the casts, which is terrible on DMD's part...

That said, a workaround is to use array types:
typedef float[4] __m128;
typedef float[4] __v4sf;


This just goes to show that pragma(set_attribute) is still too incomplete to be
of much use. Any ideas to improve it are welcome though. :)


