M1 10x faster than Intel at integral division, throughput one 64-bit divide in two cycles
Ola Fosheim Grøstad
ola.fosheim.grostad at gmail.com
Thu May 13 22:40:06 UTC 2021
On Thursday, 13 May 2021 at 12:06:01 UTC, Witold Baryluk wrote:
> Next time, exercise more critical thinking when reading
> "benchmark" claims.
Indeed, proper benchmarks use application suites, not shoehorned
synthetic garbage... Besides, most performance-sensitive code does
not use division much if the programmers know what they are
doing. And in this "benchmark" the division could've been hoisted
out of the inner loop by a less-than-braindead compiler.
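
To illustrate what moving a division out of an inner loop means, here is a minimal sketch in C. It is a hypothetical loop, not the benchmark's actual code; the names `sum_naive`, `sum_hoisted`, `xs`, `n` and `d` are made up for the example. It assumes neither operand of the division changes inside the loop, which is the case where a compiler (or a programmer) can compute the quotient once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inner loop: n / d is written inside the loop body,
   although neither operand changes between iterations. */
uint64_t sum_naive(const uint64_t *xs, size_t len, uint64_t n, uint64_t d)
{
    assert(d != 0); /* division by zero is undefined */
    uint64_t sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum += xs[i] + n / d;
    return sum;
}

/* Same result, with the loop-invariant quotient computed once
   before the loop, so the loop body contains no division. */
uint64_t sum_hoisted(const uint64_t *xs, size_t len, uint64_t n, uint64_t d)
{
    assert(d != 0); /* division by zero is undefined */
    const uint64_t q = n / d;
    uint64_t sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum += xs[i] + q;
    return sum;
}
```

Both functions return the same value for any input with a nonzero `d`. A loop that reduces to the second form does not measure divider throughput at all, which is why a synthetic loop like this says little about the hardware.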
Looks like Intel is releasing a Clang-based C++ compiler with
OpenMP offload to Intel GPUs... Wonder if anyone knows anything
about it?