Any usable SIMD implementation?
9il via Digitalmars-d
digitalmars-d at puremagic.com
Mon Apr 4 14:05:44 PDT 2016
On Monday, 4 April 2016 at 20:29:11 UTC, Walter Bright wrote:
> On 4/4/2016 7:02 AM, 9il wrote:
>>> What kind of information?
>>
>> Target cpu configuration:
>> - CPU architecture (done)
>
> Done.
>
>> - Count of FP/Integer registers
>
> ??
How many general-purpose registers, SIMD floating-point
registers, and SIMD integer registers does the CPU have?
>
>> - Allowed sets of instructions: for example, AVX2, FMA4
>
> Done. D_SIMD
This is not enough. We need to know at compile time whether the
target is AVX or AVX2 (the source code may be completely
different for these cases).
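To illustrate the kind of compile-time selection meant here, a minimal sketch follows. It assumes the compiler predefines version identifiers named D_AVX and D_AVX2 next to D_SIMD; only D_SIMD is confirmed in this thread. The names kernelName and dot are hypothetical and exist only for this example.

```d
import std.stdio : writeln;

// Assumption: D_AVX2 and D_AVX are predefined by the compiler for the
// selected target. Only D_SIMD is confirmed in this thread.
version (D_AVX2)
    enum kernelName = "avx2";
else version (D_AVX)
    enum kernelName = "avx";
else version (D_SIMD)
    enum kernelName = "sse2";
else
    enum kernelName = "scalar";

/// Hypothetical kernel. A real BLAS would pick a different body per
/// value of kernelName (for example with `static if`); here every
/// target shares the same scalar loop.
double dot(const(double)[] a, const(double)[] b)
{
    assert(a.length == b.length, "vectors must have equal length");
    double sum = 0;
    foreach (i; 0 .. a.length)
        sum += a[i] * b[i];
    return sum;
}

void main()
{
    // Known at compile time, so unused kernels are never compiled in.
    writeln("selected kernel: ", kernelName);
    writeln(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]));
}
```

Because kernelName is an enum, the choice costs nothing at run time, and the AVX and AVX2 code paths can be entirely separate source.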
>
>> - Compiler optimization options (for math)
>
> Moot. DMD does not have compiler switches to set FP code
> generation. (This is deliberate.)
We have LDC and GDC. And it looks like a little standardization
based on DMD would be good, even if it would be useless for DMD itself.
With compile-time information about the CPU, it is possible to always
have a fast generic BLAS for any target as soon as LLVM is released
for that target.
D+LLVM = fast generic BLAS. For DMD and GDC there would be
target-specific BLAS optimizations.
The OpenBLAS kernels are 30 MB of assembler code! So we would be able
to replace them once and for a very long time with Phobos.
Best regards,
Ilya