Support for gcc vector attributes, SIMD builtins
Mike Farnsworth
mike.farnsworth at gmail.com
Tue Feb 1 10:32:57 PST 2011
Iain Buclaw Wrote:
> == Quote from Mike Farnsworth (mike.farnsworth at gmail.com)'s article
> > I built gdc from tip on Fedora 13 (x86-64) and started playing around
> > with creating a vector struct (x,y,z,w) to see what kind of optimization
> > the code generator did with it. It was able to partially drop into SSE
> > registers and instructions, but not as well as I had hoped from writing
> > "regular" D code.
> > I poked through the builtins that get pulled into d-builtins.c /
> > d-builtins2.cc but I don't see anything that might be pulling in
> > definitions such as __builtin_ia32_* for SSE, for example.
> > How hard would it be to get some sort of vector attribute attached to a
> > type (or just plain introduce v4sf, __m128, or something like that) and
> > get those SIMD builtins available?
> > For the curious, here is how they are defined in, for example,
> > xmmintrin.h for gcc:
> > typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
> > typedef float __v4sf __attribute__ ((__vector_size__ (16)));
> > extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__,
> > __artificial__))
> > _mm_add_ps (__m128 __A, __m128 __B)
> > {
> > return (__m128) __builtin_ia32_addps ((__v4sf)__A, (__v4sf)__B);
> > }
>
> Although GDC hashes out GCC builtins and attributes, most of it is very much
> incomplete. For example, a D version (for GDC) of the code above would be
> something like:
>
>
> import gcc.builtins;
>
> pragma(set_attribute, __m128, vector_size(16), may_alias);
> pragma(set_attribute, __v4sf, vector_size(16));
> pragma(set_attribute, _mm_add_ps, always_inline, artificial);
>
> typedef float __m128;
> typedef float __v4sf;
>
> __m128 _mm_add_ps (__m128 __A, __m128 __B)
> {
> return cast(__m128) __builtin_ia32_addps (cast(__v4sf)__A, cast(__v4sf)__B);
> }
>
>
>
> However, this doesn't work because
>
> 1) There is no 128bit float type in DMDFE (can be put in though, even if it is
> just for internal use).
> 2) Vectors are not representable in DMDFE.
>
> So __builtin_ia32_addps (and many other ia32 builtins) cannot be emitted to the D
> environment.
I figured this would be the case; the "typedef float whatever __attribute__((vector_size(16)))" syntax is already weird in C, so I don't expect dmdfe to do the right thing with anything similar.
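For reference, here is a minimal C sketch of what that typedef does under gcc's vector extensions. It is only an illustration on the C side, not gdc code: the names v4sf_add and v4sf_lane are made up for this example, and it uses the generic `+` operator on the vector type rather than __builtin_ia32_addps, so it does not depend on any x86-specific builtin.

```c
#include <assert.h>
#include <stddef.h>

/* gcc vector extension: four packed floats, 16 bytes in total. */
typedef float v4sf __attribute__((vector_size(16)));

/* The attribute changes the size of the type, not just its alignment. */
_Static_assert(sizeof(v4sf) == 16, "v4sf should occupy 16 bytes");

/* Element-wise add (hypothetical helper). On x86-64 gcc will typically
 * lower this to a single SSE add, but that is up to the code generator. */
static v4sf v4sf_add(v4sf a, v4sf b)
{
    return a + b;
}

/* Read one lane (hypothetical helper). Subscripting a vector type is a
 * gcc/clang extension, not standard C. */
static float v4sf_lane(v4sf v, size_t i)
{
    assert(i < 4);
    return v[i];
}
```

A vector value can be initialized with a brace list, e.g. `v4sf a = {1.0f, 2.0f, 3.0f, 4.0f};`, and then passed to v4sf_add like any other value type.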
> Interestingly enough, this particular example actually ICEs the compiler. It
> appears that while *explicit* casting is done in the code, DMDFE actually
> *ignores* this, which is terrible on DMD's part...
Hah. It's obvious dmdfe doesn't understand the builtin's signature correctly, so I'll hold off on a bug report until I can figure out what kind of signature that builtin has registered with dmdfe.
> That said, a workaround is to use array types.
> typedef float[4] __m128;
> typedef float[4] __v4sf;
>
>
> All of which goes to show that pragma(attribute) is still very incomplete.
> Any ideas to improve it are welcome though. :)
In my (not very abundant) spare time, I'll poke around the attribute stuff to see if I can attach the vector_size(16) attribute to a float[4] array type. I know the __builtin_ia32_addps function, for example, takes a v4sf (__m128 is just Intel's version that can change personalities at will; I feel no inclination to keep it around, and instead go with more strictly defined types and cast intrinsics). If I can get that builtin to take a typedef'd float[4] without a cast, perhaps dmdfe will not drop any data and the codegen will happen properly.
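To make the float[4] / v4sf relationship concrete, here is a small C sketch (again on the gcc side, not gdc). It assumes only that a float[4] and a vector_size(16) float vector have the same size and lane order, which holds for gcc on x86-64. The helper name add4 is hypothetical, and memcpy stands in for the cast so that no aliasing or alignment assumptions are made about the plain arrays.

```c
#include <assert.h>
#include <string.h>

/* Same vector type gcc's xmmintrin.h calls __v4sf. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: add two plain float[4] arrays by routing them
 * through the vector type. memcpy moves the 16 bytes in and out, so the
 * arrays need no special alignment. */
static void add4(const float a[4], const float b[4], float out[4])
{
    v4sf va, vb, vr;

    assert(a && b && out);

    memcpy(&va, a, sizeof va);   /* float[4] -> v4sf */
    memcpy(&vb, b, sizeof vb);
    vr = va + vb;                /* element-wise add on the vector type */
    memcpy(out, &vr, sizeof vr); /* v4sf -> float[4] */
}
```

With optimization enabled gcc generally turns the memcpy calls into plain loads and stores, so the array form costs little here; whether gdc can reach the same code from a typedef'd float[4] is exactly the open question.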
Where do I look to see the attribute pragmas in gdc? Where do I look to potentially change the signature that dmdfe sees for the __builtin_ia32_* functions? If I can get a hand-coded signature to work, then we'll be in business.
-Mike