__restrict, architecture intrinsics vs asm, consoles, and other

Marco Leise Marco.Leise at gmx.de
Thu Sep 22 08:28:31 PDT 2011


On 22.09.2011, 08:39, Don <nospam at nospam.com> wrote:

> On 22.09.2011 05:24, a wrote:
>> How would one do something like this without intrinsics (the code is
>> C++ using gcc vector extensions):
>
> [snip]
> At present, you can't do it without ultimately resorting to inline asm.  
> But, what we've done is to move SIMD into the machine model: the D  
> machine model assumes that float[4] + float[4] is a more efficient  
> operation than a loop.
> Currently, only arithmetic operations are implemented, and on DMD at  
> least, they're still not proper intrinsics. So in the long term it'll be  
> possible to do it directly, but not yet.
>
> At various times, several of us have implemented 'swizzle' using CTFE,  
> giving you a syntax like:
>
> float[4] x, y;
> x[] = y[].swizzle!"cdcd"();
> // x[0]=y[2], x[1]=y[3], x[2]=y[2], x[3]=y[3]
>
> which compiles to a single shufps instruction.
>
> That "cdcd" string is really a tiny DSL: the language consists of four  
> characters, each of which is a, b, c, or d.
>
> A couple of years ago I made a DSL compiler for BLAS1 operations. It was  
> capable of doing some pretty wild stuff, even then. (The DSL looked like  
> normal D code).
> But the compiler has improved enormously since that time. It's now  
> perfectly feasible to make a DSL for the SIMD operations you need.
>
> The really nice thing about this, compared to normal asm, is that you  
> have access to the compiler's symbol table. This lets you add  
> compile-time error messages, for example.
>
> A funny thing about this, which I found after working on the DMD  
> back-end, is that it is MUCH easier to write an optimizer/code generator  
> in a DSL in D than in a compiler back-end.

That's a nice, fresh approach to intrinsics. I bet that if other languages  
had D's CTFE capabilities, they would do the same.
Sure, it would be ideal if the compiler worked its magic here, but  
implementing the right code generation in the compiler takes longer than  
writing an isolated piece of library code. Library extensions can also be  
added by anyone, especially since there will already be some examples to  
look at. Thumbs up!
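
For anyone who wants something to look at, here is a minimal sketch of how such a swizzle could be written as plain library code. It is not Don's implementation: the name `swizzle` and its signature are my own assumptions, it takes a fixed-size `float[4]` rather than a slice, it relies on `static foreach` from newer D compilers, and nothing in it guarantees that the back-end emits a single shufps. What it does show is the pattern string being checked at compile time, with readable error messages for a bad pattern.

```d
import std.stdio : writeln;

// Hypothetical helper, not part of Phobos: each character of `pattern`
// selects a source lane, with 'a'..'d' standing for index 0..3.
float[4] swizzle(string pattern)(const float[4] v)
{
    static assert(pattern.length == 4,
        "swizzle pattern must have exactly 4 characters: \"" ~ pattern ~ "\"");

    float[4] r;
    // Unrolled at compile time; `i` and `c` are compile-time constants.
    static foreach (i, c; pattern)
    {
        static assert(c >= 'a' && c <= 'd',
            "swizzle pattern may only contain a, b, c or d: \"" ~ pattern ~ "\"");
        r[i] = v[c - 'a'];
    }
    return r;
}

void main()
{
    float[4] y = [1, 2, 3, 4];
    float[4] x = y.swizzle!"cdcd"();

    // x[0]=y[2], x[1]=y[3], x[2]=y[2], x[3]=y[3]
    assert(x == [3f, 4f, 3f, 4f]);
    writeln(x);

    // y.swizzle!"abxd"() does not compile: the static assert rejects 'x'.
}
```

A real version would replace the element-wise copy with inline asm or a vector intrinsic selected from the same pattern string; the compile-time checks would stay as they are.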

