Any usable SIMD implementation?

Walter Bright via Digitalmars-d digitalmars-d at puremagic.com
Tue Apr 12 13:22:12 PDT 2016


On 4/12/2016 9:53 AM, Marco Leise wrote:
> LDC implements InlineAsm_X86_Any (DMD style asm), so
> core.cpuid works. GDC is the only compiler that does not
> implement it. We agree that core.cpuid should provide this
> information, but what we have now - core.cpuid in a mix with
> GDC's lack of DMD style asm - does not work in practice for
> the years to come.

Years? Anyone who needs core.cpuid could translate it to GDC's inline asm style 
in an hour or so. It could even be simply written separately in GAS and linked 
in. Since this has not been done, I can only conclude that core.cpuid has not 
been an actual blocker.
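
For a sense of how small such a translation is, here is a minimal sketch of a single CPUID query in GCC's extended asm. It is written in C rather than D, on the assumption that GDC's extended asm follows the same template/constraint layout as GCC's. The helper name cpu_vendor is hypothetical and is not part of core.cpuid. On non-x86 targets the sketch simply reports that no information is available.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from core.cpuid): writes the 12-character CPU
   vendor string plus a terminating NUL into out. Returns 1 on x86 targets,
   0 elsewhere (out is then the empty string). */
static int cpu_vendor(char out[13])
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t eax = 0;   /* leaf 0: highest basic leaf and vendor string */
    uint32_t ebx, edx;
    uint32_t ecx = 0;   /* sub-leaf, ignored by leaf 0 but set for safety */

    /* CPUID reads EAX/ECX and writes EAX, EBX, ECX and EDX, so all four
       registers are declared as operands. */
    __asm__ volatile ("cpuid"
                      : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx));

    /* The vendor string is stored in EBX, EDX, ECX order. */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
#else
    out[0] = '\0';
    return 0;
#endif
}
```

Calling cpu_vendor(buf) with a 13-byte buffer typically yields a string such as "GenuineIntel" or "AuthenticAMD" on x86 hardware. Very old GCC releases reserved EBX in 32-bit PIC code and would need a workaround for the "=b" operand.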


>> BTW, dmd's inline assembler does know about which instructions read/write which
>> registers, and makes use of that when inserting the code so it will work with
>> the rest of the code generator's register usage tracking.
>
> That is a pleasant surprise. :)

https://github.com/D-Programming-Language/dmd/blob/master/src/iasm.c#L1255


> Still, DMD does not inline asm and always adds a function
> prolog and epilog around asm blocks in an otherwise
> empty function (correct me if I'm wrong).

Not if you use "naked".

> "naked" means you
> have to duplicate code for the different calling conventions,
> in particular Win32.

Why complain about it adding a prolog/epilog, and then complain about it not adding one?


> Your view of GCC (and LLVM) may be a bit biased. First of all
> you don't need to tell it exactly which registers to use. A
> rough classification is enough and gives the compiler a good
> idea of where calculations should be stored upon arrival at
> the asm statement. You can be specific down to the register
> name or let the backend choose freely with "rm" (= any register
> or memory).
> An example: We have a variable x that is computed inside a
> function followed by an asm block that multiplies it with
> something else. Typically you would "MOV EAX, [x]" to load x
> into the register that the MUL instruction expects. With
> extended assemblers you can be declarative about that and just
> state that x is needed in EAX as an input. You drop the MOV
> from the asm block and let the compiler figure out in its
> codegen, how x will end up in EAX. That's a step FORWARD in
> usability.

It's a step backwards because I can't just say "MUL EAX". I have to tell GCC 
what register the result gets put in. This is, to my mind, ridiculous. GCC's 
inline assembler apparently has no knowledge of what the opcodes actually do.
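
Both sides of this exchange can be seen in one small sketch, written here in GNU C under the assumption that the extended asm syntax under discussion matches GCC's. The "a" input constraint is the declarative part Marco describes: no MOV appears in the template. The output and clobber lists are the part Walter objects to: the programmer has to state that MUL writes EDX:EAX and the flags. The function name mul_u32 is hypothetical, and non-x86 targets fall back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: full 64-bit product of two 32-bit values. */
static uint64_t mul_u32(uint32_t x, uint32_t y)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t lo, hi;
    __asm__ ("mull %3"              /* EDX:EAX = EAX * operand 3 */
             : "=a"(lo), "=d"(hi)   /* outputs: low half in EAX, high in EDX */
             : "a"(x),              /* input: x must arrive in EAX */
               "rm"(y)              /* input: y in any register or memory */
             : "cc");               /* MUL modifies the flags */
    return ((uint64_t)hi << 32) | lo;
#else
    /* Portable fallback so the sketch builds on other targets. */
    return (uint64_t)x * y;
#endif
}
```

For example, mul_u32(0xFFFFFFFF, 2) evaluates to 0x1FFFFFFFE. Omitting the "=d" output would let the compiler assume EDX survives the asm statement, which is the kind of bookkeeping DMD's inline assembler does on its own from its knowledge of the opcodes.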


