SIMD support...

Manu turkeyman at gmail.com
Fri Jan 6 05:26:16 PST 2012


On 6 January 2012 12:16, a <a at a.com> wrote:

> Walter Bright Wrote:
>
> > which provides two functions:
> >
> >     __v128 simdop(operator, __v128 op1);
> >     __v128 simdop(operator, __v128 op1, __v128 op2);
>
> You would also need functions that take an immediate too to support
> instructions such as shufps.
>
> > One caveat is it is typeless; a __v128 could be used as 4 packed ints or 2
> > packed doubles. One problem with making it typed is it'll add 10 more types
> > to the base compiler, instead of one. Maybe we should just bite the bullet
> > and do the types:
> >
> >      __vdouble2
> >      __vfloat4
> >      __vlong2
> >      __vulong2
> >      __vint4
> >      __vuint4
> >      __vshort8
> >      __vushort8
> >      __vbyte16
> >      __vubyte16
>
> I don't see it being typeless as a problem. The purpose of this is to
> expose hardware capabilities to D code and the vector registers are
> typeless, so why shouldn't vector type be "typeless" too? Types such as
> vfloat4 can be implemented in a library (which could also be made portable
> and have a nice API).
>

Hooray! I think we're on exactly the same page. That's refreshing :)
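
As a rough sketch of what a library-level vfloat4 on top of a typeless vector
might look like: V128 and Float4 below are made-up names for illustration, and
the lanes are emulated with a plain array rather than a real vector register,
so this only shows the shape of the API, not the codegen.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the proposed typeless __v128: the same 16 bytes
// viewed either as 4 floats or as 4 ints.
union V128
{
    float[4] f;
    int[4] i;
}

// Library-defined typed wrapper; the compiler only needs to know about V128.
struct Float4
{
    V128 raw;

    this(float x, float y, float z, float w)
    {
        raw.f = [x, y, z, w];
    }

    // Lane-wise arithmetic. A real implementation would forward to a vector
    // intrinsic here instead of looping over the lanes.
    Float4 opBinary(string op)(Float4 rhs)
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        Float4 r;
        foreach (lane; 0 .. 4)
            r.raw.f[lane] = mixin("raw.f[lane] " ~ op ~ " rhs.raw.f[lane]");
        return r;
    }
}

void main()
{
    auto v = Float4(1, 2, 3, 4) * Float4(2, 2, 2, 2);
    writeln(v.raw.f);
}
```

The point is only that the typed API lives entirely in the library, so the
base compiler gains one type instead of ten.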

I think this __simdop( op, v1, v2, etc ) API is a bit of a bad idea:
there are too many permutations of arguments.
I know some PPC functions that take FIVE arguments (2-3 regs and 2-3
literals).
Why not just expose the opcodes directly as intrinsic functions, for
instance (maybe in std.simd.sse)?
__v128 __sse_mul_ss( __v128 v1, __v128 v2 );
__v128 __sse_mul_ps( __v128 v1, __v128 v2 );
__v128 __sse_madd_epi16( __v128 v1, __v128 v2, __v128 v3 ); // <- some have more args
__v128 __sse_shuffle_ps( __v128 v1, __v128 v2, immutable int i ); // <- some need literal ints
etc...
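
To make the idea concrete, here is a minimal sketch of two such per-opcode
functions written as ordinary D, with the lanes emulated in scalar code. The
names V128, sse_mul_ps and sse_shuffle_ps are hypothetical stand-ins, not real
compiler intrinsics. One assumption worth flagging: the shuffle selector is
passed as a template value parameter rather than as an immutable int argument,
since that is one way to guarantee it is a compile-time literal.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the proposed typeless __v128.
union V128
{
    float[4] f;
    int[4] i;
}

// Emulates MULPS: lane-wise multiply of four packed floats.
V128 sse_mul_ps(V128 a, V128 b)
{
    V128 r;
    foreach (lane; 0 .. 4)
        r.f[lane] = a.f[lane] * b.f[lane];
    return r;
}

// Emulates SHUFPS. The selector is a template value parameter, so it has to
// be known at compile time, as the instruction's immediate operand requires.
V128 sse_shuffle_ps(int imm)(V128 a, V128 b)
{
    static assert(imm >= 0 && imm <= 0xFF, "selector must fit in 8 bits");
    V128 r;
    r.f[0] = a.f[imm & 3];        // low two result lanes come from a
    r.f[1] = a.f[(imm >> 2) & 3];
    r.f[2] = b.f[(imm >> 4) & 3]; // high two result lanes come from b
    r.f[3] = b.f[(imm >> 6) & 3];
    return r;
}

void main()
{
    V128 a, b;
    a.f = [1.0f, 2.0f, 3.0f, 4.0f];
    b.f = [10.0f, 20.0f, 30.0f, 40.0f];

    auto p = sse_mul_ps(a, b);
    auto s = sse_shuffle_ps!0x1B(a, a); // 0x1B reverses the lane order
    writeln(p.f, " ", s.f);
}
```

Each opcode gets its own signature this way, so a two-register op, a
three-register op and a register-plus-literal op never have to share one
generic entry point.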

I think this works best for other architectures too. They expose their own
sets of intrinsics, and some have rather different parameter layouts.
VMX for instance (perhaps in std.simd.vmx?):
__v128 __vmx_vmsum4fp( __v128 v1, __v128 v2, __v128 v3 );
__v128 __vmx_vpermwi( __v128 v1, immutable int i ); // <-- needs a literal
__v128 __vmx_vrlimi( __v128 v1, __v128 v2, immutable int mask, immutable int rot );
// ^-- you really don't want to add your enum-style function for all of these
// prototypes, do you?
etc...

I have seen at least these argument lists:
( v1 )
( v1, v2 )
( v1, v2, v3 )
( v1, immutable int )
( v1, v2, immutable int )
( v1, v2, immutable int, immutable int )