Any usable SIMD implementation?

9il via Digitalmars-d digitalmars-d at puremagic.com
Mon Apr 4 14:05:44 PDT 2016


On Monday, 4 April 2016 at 20:29:11 UTC, Walter Bright wrote:
> On 4/4/2016 7:02 AM, 9il wrote:
>>> What kind of information?
>>
>> Target cpu configuration:
>> - CPU architecture (done)
>
> Done.
>
>> - Count of FP/Integer registers
>
> ??

How many general-purpose registers, SIMD floating-point 
registers, and SIMD integer registers does the CPU have?

>
>> - Allowed sets of instructions: for example, AVX2, FMA4
>
> Done. D_SIMD

This is not enough. We need to know at compile time whether the 
target has AVX or AVX2 (the source code may be completely 
different for these two cases).
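
To illustrate the kind of compile-time selection meant here, below is 
a minimal sketch. It assumes a compiler that predefines the version 
identifiers D_AVX and D_AVX2 in addition to D_SIMD (newer D compilers 
document these; they were not available when this thread was written). 
The names kernelName and dot are hypothetical and used only for 
illustration.

```d
// Pick a kernel at compile time from the target's instruction set.
// D_AVX / D_AVX2 are assumed to be predefined by the compiler when
// the corresponding target CPU is selected.
version (D_AVX2)
    enum kernelName = "avx2";
else version (D_AVX)
    enum kernelName = "avx";
else version (D_SIMD)
    enum kernelName = "sse2";
else
    enum kernelName = "scalar";

// Print the choice while compiling.
pragma(msg, "selected kernel: ", kernelName);

/// Dot product. There is a single generic body here; a real library
/// would provide a separate implementation per kernelName.
double dot(const(double)[] a, const(double)[] b)
{
    assert(a.length == b.length);
    double sum = 0;
    foreach (i; 0 .. a.length)
        sum += a[i] * b[i];
    return sum;
}

unittest
{
    assert(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]) == 32.0);
}
```

With only D_SIMD available, the first two branches cannot be told 
apart, which is the limitation described above.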

>
>> - Compiler optimization options (for math)
>
> Moot. DMD does not have compiler switches to set FP code 
> generation. (This is deliberate.)

We have LDC and GDC. It looks like a little standardization based 
on DMD would be good, even if it would be useless for DMD itself.

With compile-time information about the CPU, it is possible to 
always have a fast generic BLAS for any target as soon as LLVM is 
released for that target.

D+LLVM = fast generic BLAS. For DMD and GDC there would be 
target-specific BLAS optimizations.

The OpenBLAS kernels are 30 MB of assembler code! So we would be 
able to replace them once and for a very long time with Phobos.

Best regards,
Ilya