Performance test of short-circuiting AliasSeq

Stefan Koch uplink.coder at googlemail.com
Mon Jun 1 20:16:55 UTC 2020


Hi,

So I asked myself whether the PR
https://github.com/dlang/dmd/pull/11057
which complicated the compiler internals and got pulled on the 
basis that it would increase performance, actually did increase 
performance.
So I ran it on a staticMap benchmark, which it should speed up.

I am going to use the same benchmark as I used for
https://forum.dlang.org/post/kktulflpozdrsxeinfbg@forum.dlang.org

First, let me post the results of the benchmark without Walter's 
patch applied.

---
Benchmark #1: ./dmd.sh sm.d
   Time (mean ± σ):     436.0 ms ±  14.6 ms    [User: 384.0 ms, System: 51.8 ms]
   Range (min … max):   413.5 ms … 476.7 ms    40 runs

Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
   Time (mean ± σ):     219.9 ms ±   5.6 ms    [User: 210.0 ms, System: 10.0 ms]
   Range (min … max):   208.6 ms … 235.9 ms    40 runs

Benchmark #3: ./dmd.sh sm.d -version=Walter
   Time (mean ± σ):     330.4 ms ±   7.1 ms    [User: 290.8 ms, System: 39.5 ms]
   Range (min … max):   316.6 ms … 345.7 ms    40 runs

Summary
   './dmd.sh sm.d -version=DotDotDot' ran
     1.50 ± 0.05 times faster than './dmd.sh sm.d -version=Walter'
     1.98 ± 0.08 times faster than './dmd.sh sm.d'

---

If you care to compare the timings at the bottom, they pretty much 
match the results I measured previously and posted in the thread 
above.
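
The ratios in the Summary block can be re-derived from the means and standard deviations in the table. The sketch below assumes the ± on each ratio comes from first-order error propagation for a quotient; that is an assumption about how the tool arrives at it, not something stated in the output. The names `runs` and `speed_ratio` are made up for this illustration.

```python
import math

# Means and standard deviations (ms) copied from the unpatched 40-run output above.
runs = {
    "default":   (436.0, 14.6),
    "DotDotDot": (219.9, 5.6),
    "Walter":    (330.4, 7.1),
}

def speed_ratio(slow, fast):
    """Return (ratio, uncertainty) of slow/fast.

    Assumes first-order error propagation for a quotient:
    relative errors of numerator and denominator add in quadrature.
    """
    (mean_slow, sd_slow), (mean_fast, sd_fast) = slow, fast
    ratio = mean_slow / mean_fast
    err = ratio * math.sqrt((sd_slow / mean_slow) ** 2 + (sd_fast / mean_fast) ** 2)
    return ratio, err

for name in ("Walter", "default"):
    ratio, err = speed_ratio(runs[name], runs["DotDotDot"])
    print(f"DotDotDot ran {ratio:.2f} ± {err:.2f} times faster than {name}")
```

Under that assumption the two printed lines reproduce the 1.50 ± 0.05 and 1.98 ± 0.08 figures from the Summary.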

Now let's see how the test performs with the patches applied.

Benchmark #1: ./dmd.sh sm.d
   Time (mean ± σ):     423.5 ms ±   8.9 ms    [User: 377.6 ms, System: 45.7 ms]
   Range (min … max):   411.0 ms … 444.0 ms    40 runs

Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
   Time (mean ± σ):     231.0 ms ±   4.3 ms    [User: 220.3 ms, System: 10.9 ms]
   Range (min … max):   223.3 ms … 243.9 ms    40 runs

Benchmark #3: ./dmd.sh sm.d -version=Walter
   Time (mean ± σ):     342.4 ms ±   8.1 ms    [User: 306.4 ms, System: 36.0 ms]
   Range (min … max):   331.0 ms … 375.1 ms    40 runs

Summary
   './dmd.sh sm.d -version=DotDotDot' ran
     1.48 ± 0.04 times faster than './dmd.sh sm.d -version=Walter'
     1.83 ± 0.05 times faster than './dmd.sh sm.d'

We see the difference between `...` and Walter's unrolled 
staticMap shrink.
And we do see a decrease in the time for the divide and conquer 
version of staticMap.

However, we also see the mean times of `...` and Walter's 
unrolled staticMap actually increase.

That made me curious, so I repeated the measurements with a 
higher repetition count, just to make sure that this was not a 
fluke.
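
A quick way to judge whether the 40-run differences are larger than run-to-run noise is to divide each difference of means by its standard error. The sketch below does that with the numbers from the two 40-run tables above. It assumes the runs are independent samples, which back-to-back compiler invocations on one machine only approximate, and the names `samples` and `welch_t` are made up for this illustration.

```python
import math

RUNS = 40  # run count reported in both 40-run tables above

# (mean ms, stddev ms) without the patch, then with the patch.
samples = {
    "default":   ((436.0, 14.6), (423.5, 8.9)),
    "DotDotDot": ((219.9, 5.6), (231.0, 4.3)),
    "Walter":    ((330.4, 7.1), (342.4, 8.1)),
}

def welch_t(before, after, n=RUNS):
    """Difference of means divided by its standard error (Welch's t statistic)."""
    (mean_before, sd_before), (mean_after, sd_after) = before, after
    std_err = math.sqrt(sd_before ** 2 / n + sd_after ** 2 / n)
    return (mean_after - mean_before) / std_err

for name, (before, after) in samples.items():
    delta = after[0] - before[0]
    print(f"{name:10s} delta = {delta:+6.1f} ms, t = {welch_t(before, after):+.1f}")
```

A positive t means the patched compiler was slower. Values several standard errors away from zero suggest more than noise, but a rerun with more repetitions is still the more convincing check.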

---- "short-circuit" patch applied.
uplink@uplink-black:~/d/dmd-master/dmd(manudotdotdot)$ hyperfine "./dmd.sh sm.d" "./dmd.sh sm.d -version=DotDotDot" "./dmd.sh sm.d -version=Walter" -r 90
Benchmark #1: ./dmd.sh sm.d
   Time (mean ± σ):     425.9 ms ±  13.4 ms    [User: 373.7 ms, System: 51.9 ms]
   Range (min … max):   409.6 ms … 468.8 ms    90 runs

Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
   Time (mean ± σ):     234.3 ms ±   9.5 ms    [User: 224.1 ms, System: 10.2 ms]
   Range (min … max):   220.0 ms … 272.1 ms    90 runs

Benchmark #3: ./dmd.sh sm.d -version=Walter
   Time (mean ± σ):     340.6 ms ±   7.1 ms    [User: 299.7 ms, System: 40.9 ms]
   Range (min … max):   328.9 ms … 359.3 ms    90 runs

Summary
   './dmd.sh sm.d -version=DotDotDot' ran
     1.45 ± 0.07 times faster than './dmd.sh sm.d -version=Walter'
     1.82 ± 0.09 times faster than './dmd.sh sm.d'

This is consistent with what we got before.
For good measure (pun intended), I tested the DMD version without 
the patch with an increased repetition count as well.

---- without "short-circuit" patch:
uplink@uplink-black:~/d/dmd-master/dmd(manudotdotdot)$ hyperfine "./dmd.sh sm.d" "./dmd.sh sm.d -version=DotDotDot" "./dmd.sh sm.d -version=Walter" -r 90
Benchmark #1: ./dmd.sh sm.d
   Time (mean ± σ):     428.9 ms ±  11.3 ms    [User: 376.2 ms, System: 52.3 ms]
   Range (min … max):   412.8 ms … 464.5 ms    90 runs

Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
   Time (mean ± σ):     217.8 ms ±   5.2 ms    [User: 208.9 ms, System: 9.0 ms]
   Range (min … max):   209.0 ms … 241.6 ms    90 runs

Benchmark #3: ./dmd.sh sm.d -version=Walter
   Time (mean ± σ):     329.9 ms ±   9.4 ms    [User: 287.8 ms, System: 41.9 ms]
   Range (min … max):   318.7 ms … 364.6 ms    90 runs

Summary
   './dmd.sh sm.d -version=DotDotDot' ran
     1.51 ± 0.06 times faster than './dmd.sh sm.d -version=Walter'
     1.97 ± 0.07 times faster than './dmd.sh sm.d'

The results seem quite solid.
At least on the benchmark I have used, "short-circuiting" AliasSeq 
leads to a slowdown of roughly 3 to 4% for Walter's unrolled 
staticMap and roughly 5 to 8% for `...`, across both sets of runs.
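
The slowdown percentages can be recomputed from the mean times in the two 90-run tables above. This is only a sketch of the arithmetic; the dictionary and function names are made up for this illustration.

```python
# Mean wall-clock times (ms) copied from the two 90-run tables above.
without_patch = {"default": 428.9, "DotDotDot": 217.8, "Walter": 329.9}
with_patch = {"default": 425.9, "DotDotDot": 234.3, "Walter": 340.6}

def percent_change(before, after):
    """Relative change in percent; positive means slower with the patch."""
    return (after - before) / before * 100.0

for name in without_patch:
    change = percent_change(without_patch[name], with_patch[name])
    print(f"{name:10s} {change:+.1f}%")
```

On the 90-run means this gives about +3.2% for the Walter version, about +7.6% for `...`, and about -0.7% for the default version.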

I think we should not assume performance improvements before 
measuring.



