Any usable SIMD implementation?

Walter Bright via Digitalmars-d digitalmars-d at puremagic.com
Tue Apr 12 23:22:37 PDT 2016


On 4/12/2016 4:29 PM, Marco Leise wrote:
> On Tue, 12 Apr 2016 13:22:12 -0700,
> Walter Bright <newshound2 at digitalmars.com> wrote:
>
>> On 4/12/2016 9:53 AM, Marco Leise wrote:
>>> LDC implements InlineAsm_X86_Any (DMD style asm), so
>>> core.cpuid works. GDC is the only compiler that does not
>>> implement it. We agree that core.cpuid should provide this
>>> information, but what we have now - core.cpuid in a mix with
>>> GDC's lack of DMD style asm - does not work in practice for
>>> the years to come.
>>
>> Years? Anyone who needs core.cpuid could translate it to GDC's inline asm style
>> in an hour or so. It could even be simply written separately in GAS and linked
>> in. Since this has not been done, I can only conclude that core.cpuid has not
>> been an actual blocker.
>
> You mean it is OK if I duplicate most of the asm in there
> and create a pull request?

It's Boost licensed, and as far as I know Boost-licensed code can be shipped
with GPL'd code.


>            "mulq %[y]"
>            : "=a" tmp.lo, "=d" tmp.hi : "a" x, [y] "rm" y;

I don't see anything elegant about those lines, starting with the fact that
"mulq" is not in any of the AMD or Intel CPU manuals. The assembler should
notice that 'y' is a ulong and select the 64-bit version of the MUL opcode
automatically.

I can see nothing to recommend the:

     "=a" tmp.lo

syntax. How about something comprehensible like "tmp.lo = RAX"? I bet people
could even figure that out without consulting Stack Overflow! :-)

I have no idea what:

    "a" x

and:

     [y] "rm" y

mean, nor why the ":" appears sometimes and the "," other times. It does look 
like it was designed by the same guy who invented TECO macros:

https://www.reddit.com/r/programming/comments/4e07lo/last_night_in_a_fit_of_boredom_far_away_from_my/d1xlbh7

but that's not much of a compliment.
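
For reference, the quoted GDC statement maps onto GCC's extended asm roughly as
in the C sketch below. This is not code from this thread: `big_mul`,
`big_mul_check` and the `DblWord` layout are assumptions made for illustration.
Colons separate the instruction template, the output operands, the input
operands and the clobber list; commas separate operands within one of those
sections. "=a" and "=d" are outputs taken from the A and D registers (RAX and
RDX at this operand size), "a" is an input placed in RAX, and [y] "rm" is an
input named y that the compiler may keep in a register or in memory. The
non-x86 branch is only a portable fallback that relies on the GCC/Clang
`unsigned __int128` extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the DblWord type in the quoted code. */
typedef struct { uint64_t lo, hi; } DblWord;

/* 64 x 64 -> 128 bit unsigned multiply. */
static DblWord big_mul(uint64_t x, uint64_t y)
{
    DblWord r;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ ("mulq %[y]"        /* RDX:RAX = RAX * y                  */
             : "=a" (r.lo),     /* output: low half, read from RAX    */
               "=d" (r.hi)      /* output: high half, read from RDX   */
             : "a" (x),         /* input: x is loaded into RAX        */
               [y] "rm" (y)     /* input named y: register or memory  */
             : "cc");           /* MUL modifies the flags             */
#else
    /* Fallback for other targets: let the compiler do the wide multiply. */
    unsigned __int128 p = (unsigned __int128)x * y;
    r.lo = (uint64_t)p;
    r.hi = (uint64_t)(p >> 64);
#endif
    return r;
}

/* Small self-check: (2^64 - 1) * 2 = 2^65 - 2, i.e. hi = 1, lo = 2^64 - 2. */
static void big_mul_check(void)
{
    DblWord r = big_mul(UINT64_MAX, 2);
    assert(r.hi == 1 && r.lo == UINT64_MAX - 1);
}
```

The q suffix in "mulq" is AT&T-syntax notation for a 64-bit operand. It matters
here because "rm" lets the compiler pick a memory operand, and a memory operand
by itself does not tell the assembler the operand size.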


> In practice GDC will just replace the invocation with a single
> 'mul' instruction while DMD will emit a call to this
> 18-instruction-long function. Now you keep telling me extended
> assembly is a step backwards. :)

DMD version:

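   // Assumes the System V x86-64 ABI (arguments in RDI/RSI, result in RDX:RAX)
   DblWord bigMul(ulong x, ulong y) {
     asm {
        naked;          // no prolog/epilog: the asm body is the whole function
        mov RAX,RDI;    // one factor into RAX
        mul RSI;        // RDX:RAX = RAX * RSI
        ret;
      }
   }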


>> GCC's inline assembler apparently has no knowledge of what
>> the opcodes actually do.
> Agreed.

This is the basis of my assertion that it is a step backwards. Granted, it has
some nice capabilities, as you've demonstrated. But it sure makes you suffer to
get them.


