SIMD support...

Fri Jan 6 04:53:53 PST 2012

On 6 January 2012 11:04, Andrew Wiley <wiley.andrew.j at gmail.com> wrote:

> On Fri, Jan 6, 2012 at 2:43 AM, Walter Bright
> <newshound2 at digitalmars.com> wrote:
> > On 1/5/2012 5:42 PM, Manu wrote:
> >>
> >> So I've been hassling about this for a while now, and Walter asked me
> >> to pitch an email detailing a minimal implementation with some initial
> >> thoughts.
> >
> >
> > Takeaways:
> >
> > 1. SIMD behavior is going to be very machine specific.
> >
> > 2. Even trying to do something with + is fraught with peril, as integer
> > adds with SIMD can be saturated or unsaturated.
> >
> > 3. Trying to build all the details about how each of the various adds and
> > other ops work into the compiler/optimizer is a large undertaking. D would
> > have to support internally maybe 100 or more new operators.
> >
> > So some simplification is in order, perhaps a low-level layer that is
> > fairly extensible for new instructions, over which a library can be
> > layered for a more presentable interface. A half-formed idea of mine is,
> > taking a cue from yours:
> >
> > Declare one new basic type:
> >
> >    __v128
> >
> > which represents the 16 byte aligned 128 bit vector type. The only
> > operations defined to work on it would be construction and assignment.
> > The __ prefix signals that it is non-portable.
> >
> > Then, have:
> >
> >   import core.simd;
> >
> > which provides two functions:
> >
> >   __v128 simdop(operator, __v128 op1);
> >   __v128 simdop(operator, __v128 op1, __v128 op2);
> >
> > This will be a function built in to the compiler, at least for the x86.
> > (Other architectures can provide an implementation of it that simulates
> > its operation, but I doubt that it would be worth anyone's while to use
> > that.)
> >
> > The operators would be an enum listing of the SIMD opcodes,
> >
> >    PFACC, PFADD, PFCMPEQ, etc.
> >
> > For:
> >
> >    z = simdop(PFADD, x, y);
> >
> > the compiler would generate:
> >
> >    MOV z,x
> >    PFADD z,y
> >
>
> Would this tie SIMD support directly to x86/x86_64, or would it be
> possible to also support NEON on ARM (also 128 bit SIMD, see
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0409g/index.html
> )?
> (Obviously not for DMD, but if the syntax wasn't directly tied to
> x86/64, GDC and LDC could support this)
> It seems like using a standard naming convention instead of directly
> referencing instructions could let the underlying SIMD instructions
> vary across platforms, but I don't know enough about the technologies
> to say whether NEON's capabilities match SSE closely enough that they
> could be handled the same way.
>

The underlying architectures are too different to try to map opcodes
across them.
__v128 should map to each architecture's native SIMD type, allowing the
compiler to express the hardware, but the opcodes would come from the
architecture-specific set available in each compiler.

As I keep suggesting, LIBRARIES would be created to supply types like
float4, int4, etc., which may use version() liberally behind the scenes
to support each architecture, giving a common and efficient API across
all architectures at that level.
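
A minimal sketch of that library layer, written in C++ with compiler
intrinsics as a stand-in for the architecture-specific opcode layer. The
#if checks play the role that version() would play in D. Float4, load,
store and the F4_SSE / F4_NEON macros are hypothetical names chosen for
illustration; they are not part of anything proposed in this thread.

```cpp
#include <cstddef>

// Pick a backend at compile time (the analogue of version() blocks).
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
  #include <xmmintrin.h>
  #define F4_SSE 1
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
  #include <arm_neon.h>
  #define F4_NEON 1
#endif

// Hypothetical portable type: wraps whatever the native 128-bit type is.
struct Float4 {
#if defined(F4_SSE)
    __m128 v;        // SSE register type
#elif defined(F4_NEON)
    float32x4_t v;   // NEON register type
#else
    float v[4];      // scalar fallback, no SIMD assumed
#endif
};

// Load four floats from memory (no alignment requirement).
inline Float4 load(const float* p) {
    Float4 r;
#if defined(F4_SSE)
    r.v = _mm_loadu_ps(p);
#elif defined(F4_NEON)
    r.v = vld1q_f32(p);
#else
    for (std::size_t i = 0; i < 4; ++i) r.v[i] = p[i];
#endif
    return r;
}

// Store four floats back to memory.
inline void store(float* p, Float4 a) {
#if defined(F4_SSE)
    _mm_storeu_ps(p, a.v);
#elif defined(F4_NEON)
    vst1q_f32(p, a.v);
#else
    for (std::size_t i = 0; i < 4; ++i) p[i] = a.v[i];
#endif
}

// The common API: one operator, a different opcode per architecture.
inline Float4 operator+(Float4 a, Float4 b) {
    Float4 r;
#if defined(F4_SSE)
    r.v = _mm_add_ps(a.v, b.v);
#elif defined(F4_NEON)
    r.v = vaddq_f32(a.v, b.v);
#else
    for (std::size_t i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
#endif
    return r;
}
```

Calling code would touch only load, store and +, so the same source
becomes an SSE add, a NEON add, or a plain loop depending on the target,
and no attempt is made to map one architecture's opcodes onto another's.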