intel-intrinsics v1.0.0
Guillaume Piolat
first.last at gmail.com
Wed Feb 6 01:05:29 UTC 2019
"intel-intrinsics" is a DUB package for people who care about x86
performance, who want to write neither assembly nor LDC-specific
snippets, and who still want the fastest possible code.
Available through DUB:
http://code.dlang.org/packages/intel-intrinsics
*** Features of v1.0.0:
- All intrinsics in this list:
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=MMX,SSE,SSE2
Use existing Intel documentation and syntax.
- Write the same code for both DMD and LDC, in the last 6
versions of each. (Note that debug performance might suffer a
lot when no inlining is activated.)
- Use operators on SIMD vectors as if core.simd were implemented
on DMD 32-bit
- Introduces int2 and float2 because short SIMD vectors are useful
- about 6000 LOC (for now! more to come)
- Bonus: approximate pow/exp/log, computing 4 approximate results
at once.
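To give an idea of what the features above look like in practice, here is a minimal sketch written as a DUB single-file package. The file name `simd_demo.d`, the helper `addArrays` and the version constraint are assumptions made for illustration; the intrinsics themselves (`_mm_loadu_ps`, `_mm_add_ps`, `_mm_storeu_ps`, `_mm_set1_ps`) are the SSE names from the Intel guide linked above.

```d
/+ dub.sdl:
name "simd_demo"
dependency "intel-intrinsics" version="~>1.0"
+/
// Assumed invocation: dub run --single simd_demo.d
import inteli.xmmintrin; // SSE intrinsics, named as in the Intel guide

/// Hypothetical helper: adds two float slices, 4 lanes at a time.
void addArrays(const(float)[] a, const(float)[] b, float[] result)
{
    assert(a.length == b.length && a.length == result.length);
    size_t i = 0;
    // Main loop: unaligned load, add, unaligned store.
    for (; i + 4 <= a.length; i += 4)
    {
        __m128 va = _mm_loadu_ps(&a[i]);
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&result[i], _mm_add_ps(va, vb));
    }
    // Scalar tail for the remaining 0 to 3 elements.
    for (; i < a.length; ++i)
        result[i] = a[i] + b[i];
}

void main()
{
    float[6] a = [1, 2, 3, 4, 5, 6];
    float[6] b = [10, 20, 30, 40, 50, 60];
    float[6] r;
    addArrays(a[], b[], r[]);
    assert(r[] == [11.0f, 22.0f, 33.0f, 44.0f, 55.0f, 66.0f]);

    // Operators on SIMD vectors, the same source for DMD and LDC.
    __m128 x = _mm_set1_ps(1.5f);
    __m128 y = x + x;
    assert(y.array[0] == 3.0f);
}
```

The same file should build unchanged with either compiler; only the generated code differs.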
<future>
The long-term goal for this library is to be _only about
semantics_, and not particularly codegen(!). This is because LLVM
IR is portable, so forcing a particular instruction is undoing
this portability work. **This can seem odd** for an "intrinsics"
library but this way exact codegen options can be chosen by the
library user, and most intrinsics can gracefully degrade to
portable IR in theory.
In the future, "magic" LLVM intrinsics will only be used when
built for x86, but I think all of it can become portable and not
x86-specific. Besides, there is a trend in LLVM to remove magic
intrinsics once they are doable with IR only.
</future>
tl;dr you can use "intel-intrinsics" today, and get quite-optimal
code with LDC, without duplication. You may come across early
bugs too.
http://code.dlang.org/packages/intel-intrinsics
(Note: it's important to benchmark against vanilla D code or array
ops too; in some cases the vanilla code wins.)
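A rough sketch of such a baseline comparison, using only Phobos: a plain loop against a D array operation. The function names, the buffer size `N` and the iteration count are arbitrary choices for illustration; an intrinsics-based version would be added as a third function passed to `benchmark`. Timings vary with compiler and flags, so none are shown here.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

enum N = 4096; // arbitrary buffer size for the sketch
float[N] a, b, viaLoop, viaArrayOp;

/// Baseline 1: plain element-by-element loop.
void plainLoop()
{
    foreach (i; 0 .. N)
        viaLoop[i] = a[i] * b[i];
}

/// Baseline 2: built-in D array operation.
void arrayOp()
{
    viaArrayOp[] = a[] * b[];
}

void main()
{
    a[] = 1.5f;
    b[] = 2.0f;

    // Run each candidate 10_000 times and report the total durations.
    auto times = benchmark!(plainLoop, arrayOp)(10_000);
    writefln("plain loop: %s", times[0]);
    writefln("array op:   %s", times[1]);

    // Both versions must agree before the timings mean anything.
    assert(viaLoop[] == viaArrayOp[]);
}
```

Build with optimizations and inlining enabled (a release build) when comparing; debug timings are not representative.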