core.simd woes

jerro a at a.com
Tue Oct 9 10:46:39 PDT 2012


On Tuesday, 9 October 2012 at 16:59:58 UTC, Jacob Carlborg wrote:
> On 2012-10-09 16:52, Simen Kjaeraas wrote:
>
>> Nope, like:
>>
>> module std.simd;
>>
>> version(Linux64) {
>>     public import std.internal.simd_linux64;
>> }
>>
>>
>> Then all std.internal.simd_* modules have the same public 
>> interface, and
>> only the version that fits /your/ platform will be included.
>
> Exactly, what he said.

I'm guessing the platform in this case would be the CPU 
architecture, since that determines what SIMD instructions are 
available, not the OS. But anyway, this does not address the 
problem Manu was talking about. The problem is that the API for 
the intrinsics for the same architecture is not consistent across 
compilers. So for example, if you wanted to generate the 
instruction "shufps XMM1, XMM2, 0x88" (this extracts all even 
elements from two vectors), you would need to write:

version(GNU)
{
     return __builtin_ia32_shufps(a, b, 0x88);
}
else version(LDC)
{
     return shufflevector(a, b, 0, 2, 4, 6);
}
else version(DMD)
{
     // can't do that in DMD yet, but the way to do it will
     // probably be different from the way it is done in LDC and GDC
}

What Manu meant by having std.simd.sse and std.simd.neon was to 
have modules that would provide access to the platform-dependent 
instructions in a way that is portable across compilers. So for the 
shufps instruction above you would have something like this in 
std.simd.sse:

float4 shufps(int i0, int i1, int i2, int i3)(float4 a, float4 b){ ... }
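
To make that a bit more concrete, here is a rough sketch of how such a wrapper might be written. It is only an illustration under some assumptions: that GDC exposes the builtin through gcc.builtins, that LDC provides a shufflevector template in ldc.simd taking the vector type and the indices as template arguments, and that i0 and i1 index into a while i2 and i3 index into b (each in the range 0 to 3). The scalar fallback for other compilers is a placeholder, not the way DMD would actually implement it.

```d
import core.simd;

// Assumed module names; they may differ between compiler versions.
version (GNU) import gcc.builtins;
else version (LDC) import ldc.simd;

// Hypothetical wrapper: i0 and i1 select elements of a,
// i2 and i3 select elements of b.
float4 shufps(int i0, int i1, int i2, int i3)(float4 a, float4 b)
{
    static assert(i0 >= 0 && i0 < 4 && i1 >= 0 && i1 < 4 &&
                  i2 >= 0 && i2 < 4 && i3 >= 0 && i3 < 4,
                  "shuffle indices must be in the range 0 to 3");

    version (GNU)
    {
        // Pack the four 2-bit indices into the shufps immediate byte.
        enum mask = i0 | (i1 << 2) | (i2 << 4) | (i3 << 6);
        return __builtin_ia32_shufps(a, b, mask);
    }
    else version (LDC)
    {
        // Indices 4 to 7 refer to the elements of the second vector.
        return shufflevector!(float4, i0, i1, i2 + 4, i3 + 4)(a, b);
    }
    else
    {
        // Placeholder scalar fallback for compilers without an intrinsic.
        float4 r;
        r.array[0] = a.array[i0];
        r.array[1] = a.array[i1];
        r.array[2] = b.array[i2];
        r.array[3] = b.array[i3];
        return r;
    }
}
```

With that convention, shufps!(0, 2, 0, 2)(a, b) corresponds to the immediate 0x88 from the example above, and user code stays the same regardless of the compiler.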

std.simd currently takes care of cases when the code can be 
written in a cross-platform way. But when you need to use 
platform-specific instructions directly, std.simd doesn't 
currently help you, while std.simd.sse, std.simd.neon and others 
would. What Manu is worried about is that having instructions 
wrapped in another level of functions would hurt performance. It 
certainly would slow things down in debug builds (and IIRC he has 
written in his previous posts that he does care about that). I 
don't think it would make much of a difference when compiled with 
optimizations turned on, at least not with LDC and GDC.

