Any usable SIMD implementation?
Marco Leise via Digitalmars-d
digitalmars-d at puremagic.com
Mon Apr 4 09:21:15 PDT 2016
On Mon, 04 Apr 2016 14:02:03 +0000,
9il <ilyayaroshenko at gmail.com> wrote:
> Target cpu configuration:
> - CPU architecture (done)
> - Count of FP/Integer registers
> - Allowed sets of instructions: for example, AVX2, FMA4
> - Compiler optimization options (for math)
>
> Ilya
- On amd64, whether floating-point math is handled by the x87
FPU or by SSE. When emulating floating-point, e.g. in
float-to-string and string-to-float code, it is useful to
know where to read the active rounding mode from: the x87
control word and the SSE MXCSR register each hold their own
rounding mode and may differ, and at least GCC has a switch
(-mfpmath) to choose between the two units.
- For compile-time enabling of SSE4 code, a version define is
sufficient. Sometimes, though, we want to select a code path
at runtime. For this to work, GDC and LDC compile with a
conservative feature set (e.g. amd64 with SSE2) and each
SSE4 function is tagged with an attribute that temporarily
elevates the instruction set for that function only
(e.g. @attribute("target", "+sse4")). If you did not tag
the function like that, the compiler would error out,
because SSE4 instructions are not supported by a minimal
amd64 CPU.
To put this to good use, we need a reliable way - basically
a global variable - to check for SSE4 (or POPCNT, etc.). What
we have now does not work across all compilers.
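
To make the runtime selection concrete, here is a rough sketch of
the pattern: a global that is set once at startup and then used to
dispatch. It assumes druntime's core.cpuid reports the feature
correctly on the compiler in use, which is exactly the part that
cannot be relied on everywhere. The names popcountGeneric,
popcountFast, PopcountFn and popcountImpl are made up for
illustration, and popcountFast is only a stand-in: in real code it
would carry the compiler-specific target attribute described above.

```d
// Sketch only. All function and variable names here are hypothetical.
import core.cpuid : hasPopcnt;

alias PopcountFn = uint function(ulong) pure nothrow @nogc @safe;

// Baseline path: plain integer code, valid on any amd64 CPU.
uint popcountGeneric(ulong x) pure nothrow @nogc @safe
{
    uint n = 0;
    while (x != 0)
    {
        x &= x - 1; // clear the lowest set bit
        ++n;
    }
    return n;
}

// Stand-in for the accelerated path. A real version would be tagged
// with the GDC/LDC target attribute so the compiler may emit POPCNT.
uint popcountFast(ulong x) pure nothrow @nogc @safe
{
    import core.bitop : popcnt;
    return cast(uint) popcnt(x);
}

// The "global variable": chosen once, before main() runs.
immutable PopcountFn popcountImpl;

shared static this()
{
    // Assumption: hasPopcnt is trustworthy on this compiler/runtime.
    popcountImpl = hasPopcnt ? &popcountFast : &popcountGeneric;
}

void main()
{
    import std.stdio : writeln;

    writeln("POPCNT reported by core.cpuid: ", hasPopcnt);
    // Both paths must agree, whichever one was selected.
    assert(popcountImpl(0xF0F0) == popcountGeneric(0xF0F0));
    writeln("bits set in 0xF0F0: ", popcountImpl(0xF0F0));
}
```

The function pointer costs an indirect call, so this only pays off
when the selected routine does a meaningful amount of work per call.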
--
Marco