Any usable SIMD implementation?
Walter Bright via Digitalmars-d
digitalmars-d at puremagic.com
Tue Apr 12 23:22:37 PDT 2016
On 4/12/2016 4:29 PM, Marco Leise wrote:
> Am Tue, 12 Apr 2016 13:22:12 -0700
> schrieb Walter Bright <newshound2 at digitalmars.com>:
>
>> On 4/12/2016 9:53 AM, Marco Leise wrote:
>>> LDC implements InlineAsm_X86_Any (DMD style asm), so
>>> core.cpuid works. GDC is the only compiler that does not
>>> implement it. We agree that core.cpuid should provide this
>>> information, but what we have now - core.cpuid in a mix with
>>> GDC's lack of DMD style asm - does not work in practice for
>>> the years to come.
>>
>> Years? Anyone who needs core.cpuid could translate it to GDC's inline asm style
>> in an hour or so. It could even be simply written separately in GAS and linked
>> in. Since this has not been done, I can only conclude that core.cpuid has not
>> been an actual blocker.
>
> You mean it is OK if I duplicate most of the asm in there
> and create a pull request?
It's Boost licensed, and Boost licensed code can be shipped with GPL'd code as
far as I know.
> "mulq %[y]"
> : "=a" tmp.lo, "=d" tmp.hi : "a" x, [y] "rm" y;
I don't see anything elegant about those lines, starting with the fact that
"mulq" is not in any of the AMD or Intel CPU manuals. The assembler should
notice that 'y' is a ulong and select the 64 bit version of the MUL opcode
automatically.
I can see nothing to recommend the:
"=a" tmp.lo
syntax. How about something comprehensible like "tmp.lo = EAX"? I bet people
could even figure that out without consulting stackoverflow! :-)
I have no idea what:
"a" x
and:
[y] "rm" y
mean, nor why the ":" appears sometimes and the "," other times. It does look
like it was designed by the same guy who invented TECO macros:
https://www.reddit.com/r/programming/comments/4e07lo/last_night_in_a_fit_of_boredom_far_away_from_my/d1xlbh7
but that's not much of a compliment.
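
For readers who share the confusion, here is a rough annotated sketch of the same statement in C, where the GCC extended asm syntax originates. It is only an illustration under some assumptions: the C form wraps each operand in parentheses (the quoted GDC form does not), the DblWord struct and the big_mul name are hypothetical stand-ins, and the asm path assumes GCC or Clang targeting x86-64, falling back to unsigned __int128 elsewhere. Colons separate the sections (template : outputs : inputs : clobbers) and commas separate the operands inside one section.

```c
#include <stdint.h>

/* Hypothetical result type; the quoted snippet only shows tmp.lo and tmp.hi. */
typedef struct { uint64_t lo, hi; } DblWord;

/* Unsigned 64x64 -> 128 bit multiply. */
DblWord big_mul(uint64_t x, uint64_t y) {
    DblWord r;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__("mulq %[y]"      /* template: MUL with AT&T 'q' (64-bit) size suffix */
            : "=a"(r.lo),    /* output: '=' write-only, 'a' means the A register (RAX) */
              "=d"(r.hi)     /* output: 'd' means the D register (RDX) */
            : "a"(x),        /* input: load x into RAX before the instruction */
              [y] "rm"(y)    /* input named y: may sit in a register or in memory */
            : "cc");         /* clobber: the flags are modified */
#else
    /* Fallback for other 64-bit GCC/Clang targets: let the compiler do it. */
    unsigned __int128 p = (unsigned __int128)x * y;
    r.lo = (uint64_t)p;
    r.hi = (uint64_t)(p >> 64);
#endif
    return r;
}
```

The size suffix is there because the "rm" constraint allows a memory operand, and a bare memory operand does not tell the assembler how wide it is. Calling big_mul(UINT64_MAX, UINT64_MAX) should give lo == 1 and hi == 0xFFFFFFFFFFFFFFFE on either path.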
> In practice GDC will just replace the invocation with a single
> 'mul' instruction while DMD will emit a call to this 18
> instructions long function. Now you keep telling me extended
> assembly is a step backwards. :)
DMD version:
    DblWord bigMul(ulong x, ulong y) {
        asm {
            naked;          // no prologue/epilogue is generated
            mov RAX,RDI;    // one operand into RAX
            mul RSI;        // RDX:RAX = RAX * RSI (unsigned)
            ret;
        }
    }
>> GCC's inline assembler apparently has no knowledge of what
>> the opcodes actually do.
> Agreed.
This is the basis of my assertion that it is a step backwards. Granted, it has some
nice capability as you've demonstrated. But it sure makes you suffer to get it.