M1 10x faster than Intel at integral division, throughput one 64-bit divide in two cycles

claptrap clap at trap.com
Thu May 13 23:50:10 UTC 2021


On Thursday, 13 May 2021 at 01:59:15 UTC, Andrei Alexandrescu 
wrote:
> https://www.reddit.com/r/programming/comments/nawerv/benchmarking_division_and_libdivide_on_apple_m1/
>
> Integral division is the strongest arithmetic operation.
>
> I have a friend who knows some M1 internals. He said it's 
> really Star Trek stuff.
>
> This will seriously challenge other CPU producers.

Integer division on Intel has always been excruciatingly slow. A 
64-bit idiv can take up to 100 cycles in some cases, but DIVSD is 
around 20. It's much faster to convert to double, do the division, 
and convert back (if you are OK with slightly lower precision: a 
double has only a 53-bit mantissa).
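
A minimal sketch of the convert-divide-convert trick, in C. The helper name div_via_double is made up for illustration, and the sketch assumes the divisor is nonzero and the quotient fits in an int64_t. Whether it actually beats idiv depends on the CPU and compiler, so measure before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from any library): divide two signed 64-bit
 * integers by going through double, so the CPU uses its floating-point
 * divider (DIVSD on x86-64) instead of the 64-bit integer divider.
 *
 * Assumptions: b != 0 and the quotient fits in int64_t.
 * A double has a 53-bit mantissa, so operands with magnitude above 2^53
 * are rounded during conversion and the result can differ from a / b. */
static int64_t div_via_double(int64_t a, int64_t b)
{
    assert(b != 0);
    double q = (double)a / (double)b;
    /* Converting back truncates toward zero, like C integer division. */
    return (int64_t)q;
}
```

For small operands this matches plain integer division, e.g. div_via_double(-7, 2) gives -3. For something like div_via_double(INT64_MAX, 3) the low bits are lost in the conversion and the result no longer equals INT64_MAX / 3, which is the "slightly lower precision" part.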

Just for reference, I looked up timings for Zen3, and 64-bit idiv 
is:

9-17 cycles latency, 7-12 cycles throughput

For Skylake, which is what it looks like the Xeon 8275CL is based 
on, it's:

42-95 cycles latency, 24-90 cycles throughput

So on paper a Zen3 is maybe 5 to 8 times faster at idiv than the 
Xeon he's using.
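
To make the on-paper comparison explicit, here is a small C sketch that turns the cycle ranges quoted above into ratios. The struct and function names are made up for illustration; the numbers are only the ones from this post, not new measurements. Pairing best case with best case and worst with worst gives roughly 4.7x to 5.6x on latency and roughly 3.4x to 7.5x on throughput.

```c
#include <assert.h>

/* Cycle range for one instruction: lo = best case, hi = worst case. */
struct cycles { double lo, hi; };

/* Figures as quoted in the post for 64-bit idiv. */
static const struct cycles zen3_latency    = {  9.0, 17.0 };
static const struct cycles zen3_tput       = {  7.0, 12.0 };
static const struct cycles skylake_latency = { 42.0, 95.0 };
static const struct cycles skylake_tput    = { 24.0, 90.0 };

/* Hypothetical helper: how many times more cycles `slow` takes than `fast`. */
static double speedup(double slow, double fast)
{
    assert(fast > 0.0);
    return slow / fast;
}
```

For example, speedup(skylake_tput.hi, zen3_tput.hi) is 90 / 12 = 7.5, which is where the upper end of the estimate comes from.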




