Any usable SIMD implementation?
Walter Bright via Digitalmars-d
digitalmars-d at puremagic.com
Tue Apr 5 01:34:32 PDT 2016
On 4/4/2016 11:10 PM, 9il wrote:
> It is impossible to deduce from that combination that Xeon Phi has 32 FP registers.
Since dmd doesn't generate specific code for a Xeon Phi, having a compile time
switch for it is meaningless.
> "Since the compiler never generates AVX or AVX2" - this is definitely not true,
> see, for example, LLVM vectorization and SLP vectorization.
dmd is not LLVM.
>> It's entirely practical to compile code with different source code, link them
>> *both* into the executable, and switch between them based on runtime detection
>> of the CPU.
> This approach is complex,
Not at all. Used to do it all the time in the DOS world (FPU vs emulation).
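
To illustrate the runtime-switching approach, here is a minimal sketch in D. It assumes druntime's core.cpuid module for feature detection; the kernel names (dotScalar, dotAvx2, selectDot) are hypothetical, and the "AVX2" kernel is only a placeholder that forwards to the scalar loop. In a real build the two kernels would likely live in separate source files compiled separately and linked into the same executable.

```d
import core.cpuid : avx2;      // druntime's runtime CPU feature query
import std.stdio : writeln;

// Baseline kernel: a plain scalar loop that runs on any CPU.
float dotScalar(const(float)[] a, const(float)[] b)
{
    float sum = 0;
    foreach (i; 0 .. a.length)
        sum += a[i] * b[i];
    return sum;
}

// Hypothetical stand-in for a hand-tuned AVX2 kernel.
// It only forwards to the scalar version so the sketch stays runnable.
float dotAvx2(const(float)[] a, const(float)[] b)
{
    return dotScalar(a, b);
}

alias DotFn = float function(const(float)[], const(float)[]);

// Choose an implementation once, based on what the running CPU reports.
DotFn selectDot()
{
    return avx2 ? &dotAvx2 : &dotScalar;
}

void main()
{
    auto dot = selectDot();
    float[] a = [1, 2, 3, 4];
    float[] b = [5, 6, 7, 8];
    writeln(dot(a, b));
}
```

Selecting the function pointer once at startup keeps the detection cost out of the hot loop; callers only ever see `dot`.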
> I just want a unified instrument to receive CT information about target and
> optimization switches. It is OK if this information would have different
> switches on different compilers.
Optimizations simply do not transfer from one compiler to another, whether the
switch is the same or not. They are highly implementation dependent.
> Auto vectorization is only an example (maybe bad). I would use SIMD vectors, but I
> need CT information about target CPU, because it is impossible to build optimal
> BLAS kernels without it!
I still don't understand why you cannot just set '-version=xxx' on the command
line and then branch on that version in your custom code.
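
As a rough sketch of what that could look like: the version identifier TargetXeonPhi and the names simdRegisters and makeAccumulators below are made up for illustration, and the fallback of 16 registers assumes an ordinary x86-64 target. The identifier would be set on the command line with something like `dmd -version=TargetXeonPhi app.d`.

```d
import std.stdio : writeln;

// TargetXeonPhi is a hypothetical, user-chosen version identifier.
// It is defined only when the build passes -version=TargetXeonPhi.
version (TargetXeonPhi)
{
    enum size_t simdRegisters = 32;   // register count cited in this thread
}
else
{
    enum size_t simdRegisters = 16;   // assumption: generic x86-64 target
}

// The enum is a compile-time constant, so it can size static arrays,
// drive `static if`, or be passed as a template argument.
float[simdRegisters] makeAccumulators()
{
    float[simdRegisters] acc = 0;     // every element initialised to 0
    return acc;
}

void main()
{
    writeln("accumulators: ", makeAccumulators().length);
}
```

Because the value is fixed at compile time, a kernel can unroll or block its loops around it without any runtime check.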