Support for gcc vector attributes, SIMD builtins
Iain Buclaw
ibuclaw at ubuntu.com
Tue Feb 1 10:38:30 PST 2011
== Quote from Jerry Quinn (jlquinn at optonline.net)'s article
> Iain Buclaw Wrote:
> > == Quote from Mike Farnsworth (mike.farnsworth at gmail.com)'s article
> > > I built gdc from tip on Fedora 13 (x86-64) and started playing around
> > > with creating a vector struct (x,y,z,w) to see what kind of optimization
> > > the code generator did with it. It was able to partially drop into SSE
> > > registers and instructions, but not as well as I had hoped from writing
> > > "regular" D code.
> > > I poked through the builtins that get pulled into d-builtins.c /
> > > d-builtins2.cc but I don't see anything that might be pulling in
> > > definitions such as __builtin_ia32_* for SSE, for example.
> > > How hard would it be to get some sort of vector attribute attached to a
> > > type (or just plain introduce v4sf, __m128, or something like that) and
> > > get those SIMD builtins available?
> >
> > Saying that, a workaround is to use array types.
> > typedef float[4] __m128;
> > typedef float[4] __v4sf;
> >
> > All the more reason to show you that pragma(attribute) is still very
> > incomplete. Any ideas to improve it are welcome though. :)
> The workaround actually looks like a cleaner way to define types for vector
> intrinsics. How hard would it be to export vector intrinsics so the API expects
> float[4], for example?
I haven't given much thought to what the internal representation could be, but
I'd lean towards using unions in D code for use in the language, as that's
probably the most portable option.
For example, one of the older 'hello vectors' I know of:
    import std.c.stdio;

    pragma(set_attribute, __v4sf, vector_size(16));
    typedef float __v4sf;

    union f4vector
    {
        __v4sf v;
        float[4] f;
    }

    int main()
    {
        f4vector a, b, c;
        a.f = [1, 2, 3, 4];
        b.f = [5, 6, 7, 8];
        c.v = a.v + b.v;
        printf("%f, %f, %f, %f\n", c.f[0], c.f[1], c.f[2], c.f[3]);
        return 0;
    }
Compile: gdc -c -g -msse hellovector.d
Dump Object: objdump -dS hellovector.o
And the output of the SIMD operation speaks for itself:
    c.v = a.v + b.v;
        xorps  %xmm1,%xmm1
        movlps %gs:0x0,%xmm1
        movhps %gs:0x8,%xmm1
        xorps  %xmm0,%xmm0
        movlps %gs:0x0,%xmm0
        movhps %gs:0x8,%xmm0
        addps  %xmm1,%xmm0
        movlps %xmm0,%gs:0x0
        movhps %xmm0,%gs:0x8
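
For comparison, here is a rough C sketch of the same union trick using GCC's
vector_size attribute directly, which is presumably what the pragma above ends
up applying. The names v4sf and add4 are made up for illustration and are not
taken from gdc or any GCC header, so treat it as a sketch under those
assumptions rather than a reference.

```c
/* GCC/Clang vector extension: four packed floats, 16 bytes in total.
   The type name v4sf is an assumption chosen for this sketch. */
typedef float v4sf __attribute__((vector_size(16)));

/* Same layout as the D union: one view as a vector, one as an array. */
union f4vector {
    v4sf  v;
    float f[4];
};

/* Hypothetical helper: element-wise add through the vector view.
   With SSE enabled the compiler is free to lower this to addps. */
union f4vector add4(union f4vector a, union f4vector b)
{
    union f4vector c;
    c.v = a.v + b.v;
    return c;
}
```

With a.f = {1, 2, 3, 4} and b.f = {5, 6, 7, 8}, add4 returns a union whose f
member holds {6, 8, 10, 12}, the same sums the D example computes.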
Regards.
Iain