Performance test of short-circuiting AliasSeq

Stefan Koch uplink.coder at googlemail.com
Wed Jun 3 14:52:09 UTC 2020


On Monday, 1 June 2020 at 20:16:55 UTC, Stefan Koch wrote:
> Hi,
>
> So I've asked myself whether the PR 
> https://github.com/dlang/dmd/pull/11057
>
> which complicated the compiler internals and got pulled on the 
> basis that it would increase performance, actually did increase 
> performance.
> So I ran it on a staticMap benchmark which it should speed up.
>
> I am going to use the same benchmark as I used for
> https://forum.dlang.org/post/kktulflpozdrsxeinfbg@forum.dlang.org
>
> First, let me post the results of the benchmark without 
> Walter's patch applied.
>
> ---
> Benchmark #1: ./dmd.sh sm.d
>   Time (mean ± σ):     436.0 ms ±  14.6 ms    [User: 384.0 ms, 
> System: 51.8 ms]
>   Range (min … max):   413.5 ms … 476.7 ms    40 runs
>
> Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
>   Time (mean ± σ):     219.9 ms ±   5.6 ms    [User: 210.0 ms, 
> System: 10.0 ms]
>   Range (min … max):   208.6 ms … 235.9 ms    40 runs
>
> Benchmark #3: ./dmd.sh sm.d -version=Walter
>   Time (mean ± σ):     330.4 ms ±   7.1 ms    [User: 290.8 ms, 
> System: 39.5 ms]
>   Range (min … max):   316.6 ms … 345.7 ms    40 runs
>
> Summary
>   './dmd.sh sm.d -version=DotDotDot' ran
>     1.50 ± 0.05 times faster than './dmd.sh sm.d 
> -version=Walter'
>     1.98 ± 0.08 times faster than './dmd.sh sm.d'
>
> ---
>
> If you care to compare the timings at the bottom they pretty 
> much match the results I've measured in the benchmark 
> previously and posted in the thread above.
>
> Now let's see how the test performs with the patches applied.
>
> Benchmark #1: ./dmd.sh sm.d
>   Time (mean ± σ):     423.5 ms ±   8.9 ms    [User: 377.6 ms, 
> System: 45.7 ms]
>   Range (min … max):   411.0 ms … 444.0 ms    40 runs
>
> Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
>   Time (mean ± σ):     231.0 ms ±   4.3 ms    [User: 220.3 ms, 
> System: 10.9 ms]
>   Range (min … max):   223.3 ms … 243.9 ms    40 runs
>
> Benchmark #3: ./dmd.sh sm.d -version=Walter
>   Time (mean ± σ):     342.4 ms ±   8.1 ms    [User: 306.4 ms, 
> System: 36.0 ms]
>   Range (min … max):   331.0 ms … 375.1 ms    40 runs
>
> Summary
>   './dmd.sh sm.d -version=DotDotDot' ran
>     1.48 ± 0.04 times faster than './dmd.sh sm.d 
> -version=Walter'
>     1.83 ± 0.05 times faster than './dmd.sh sm.d'
>
> We see the difference between `...` and Walter's unrolled 
> staticMap shrink.
> And we do see a decrease in the time of the divide-and-conquer 
> version of staticMap.
>
> However, we also see the mean times of the `...` and Walter's 
> unrolled staticMap actually increase.
>
> That made me curious and I repeated the measurements with a 
> higher repetition count. Just to make sure that this is not a 
> spur of the moment thing.
>
> ---- "short-circuit" patch applied.
> uplink at uplink-black:~/d/dmd-master/dmd(manudotdotdot)$ 
> hyperfine "./dmd.sh sm.d" "./dmd.sh sm.d -version=DotDotDot" 
> "./dmd.sh sm.d -version=Walter" -r 90
> Benchmark #1: ./dmd.sh sm.d
>   Time (mean ± σ):     425.9 ms ±  13.4 ms    [User: 373.7 ms, 
> System: 51.9 ms]
>   Range (min … max):   409.6 ms … 468.8 ms    90 runs
>
> Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
>   Time (mean ± σ):     234.3 ms ±   9.5 ms    [User: 224.1 ms, 
> System: 10.2 ms]
>   Range (min … max):   220.0 ms … 272.1 ms    90 runs
>
> Benchmark #3: ./dmd.sh sm.d -version=Walter
>   Time (mean ± σ):     340.6 ms ±   7.1 ms    [User: 299.7 ms, 
> System: 40.9 ms]
>   Range (min … max):   328.9 ms … 359.3 ms    90 runs
>
> Summary
>   './dmd.sh sm.d -version=DotDotDot' ran
>     1.45 ± 0.07 times faster than './dmd.sh sm.d 
> -version=Walter'
>     1.82 ± 0.09 times faster than './dmd.sh sm.d'
>
> This is consistent with what we got before.
> For good measure (pun intended), I tested the DMD version 
> without the patch with an increased repetition count as well.
>
> ---- without "short-circuit" patch:
> uplink at uplink-black:~/d/dmd-master/dmd(manudotdotdot)$ 
> hyperfine "./dmd.sh sm.d" "./dmd.sh sm.d -version=DotDotDot" 
> "./dmd.sh sm.d -version=Walter" -r 90
> Benchmark #1: ./dmd.sh sm.d
>   Time (mean ± σ):     428.9 ms ±  11.3 ms    [User: 376.2 ms, 
> System: 52.3 ms]
>   Range (min … max):   412.8 ms … 464.5 ms    90 runs
>
> Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
>   Time (mean ± σ):     217.8 ms ±   5.2 ms    [User: 208.9 ms, 
> System: 9.0 ms]
>   Range (min … max):   209.0 ms … 241.6 ms    90 runs
>
> Benchmark #3: ./dmd.sh sm.d -version=Walter
>   Time (mean ± σ):     329.9 ms ±   9.4 ms    [User: 287.8 ms, 
> System: 41.9 ms]
>   Range (min … max):   318.7 ms … 364.6 ms    90 runs
>
> Summary
>   './dmd.sh sm.d -version=DotDotDot' ran
>     1.51 ± 0.06 times faster than './dmd.sh sm.d 
> -version=Walter'
>     1.97 ± 0.07 times faster than './dmd.sh sm.d'
>
> The results seem quite solid.
> At least on the benchmark I have used, "short-circuiting" 
> AliasSeq leads to a 4% slowdown for Walter's unrolled staticMap 
> and a 7% slowdown for `...`.
>
> I think we should not assume performance improvements 
> before measuring.
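
As a rough check on the 4% and 7% figures quoted above, the relative 
slowdown can be recomputed from the mean times of the two 90-run 
measurements. This is only a sketch: the helper name 
`slowdown_percent` is made up for illustration, and it compares the 
printed means only, ignoring the reported standard deviations.

```python
# Mean wall-clock times in ms, copied from the two 90-run hyperfine
# outputs quoted above ("default" is the plain `./dmd.sh sm.d` run).
without_patch = {"DotDotDot": 217.8, "Walter": 329.9, "default": 428.9}
with_patch = {"DotDotDot": 234.3, "Walter": 340.6, "default": 425.9}


def slowdown_percent(before_ms, after_ms):
    """Relative change of the mean in percent; positive means slower."""
    return (after_ms / before_ms - 1.0) * 100.0


slowdowns = {
    name: slowdown_percent(without_patch[name], with_patch[name])
    for name in without_patch
}

for name, pct in slowdowns.items():
    print(f"{name:10s} {pct:+.1f}%")
```

On these means the Walter variant comes out a bit over 3% slower and 
the `...` variant between 7% and 8% slower, while the default 
divide-and-conquer version is marginally faster.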


The reason the old version of staticMap did not show the slowdown 
is that I didn't disable codegen.

Codegen inefficiencies that occur when emitting an unreasonable 
number of symbols tend to hide other problems.

Here is a benchmark which does not rely on our branch 
but uses the released dmd 2.092.0.

Enjoy!

uplink at uplink-black:~/d/dmd-master/dmd(stable)$ hyperfine 
"./dmd_without_patch sm.d -c -o- -version=Walter" 
"./dmd_with_patch sm.d -c -o- -version=Walter" -m 90
Benchmark #1: ./dmd_without_patch sm.d -c -o- -version=Walter
   Time (mean ± σ):     452.8 ms ±   7.5 ms    [User: 415.5 ms, 
System: 37.4 ms]
   Range (min … max):   442.2 ms … 483.9 ms    90 runs

Benchmark #2: ./dmd_with_patch sm.d -c -o- -version=Walter
   Time (mean ± σ):     455.1 ms ±  10.4 ms    [User: 417.3 ms, 
System: 37.7 ms]
   Range (min … max):   441.5 ms … 489.2 ms    90 runs

Summary
   './dmd_without_patch sm.d -c -o- -version=Walter' ran
     1.00 ± 0.03 times faster than './dmd_with_patch sm.d -c -o- 
-version=Walter'
uplink at uplink-black:~/d/dmd-master/dmd(stable)$ hyperfine 
"./dmd_without_patch sm.d -c -o-" "./dmd_with_patch sm.d -c -o-" 
-m 90
Benchmark #1: ./dmd_without_patch sm.d -c -o-
   Time (mean ± σ):     583.2 ms ±  11.0 ms    [User: 529.9 ms, 
System: 53.1 ms]
   Range (min … max):   570.0 ms … 631.0 ms    90 runs

Benchmark #2: ./dmd_with_patch sm.d -c -o-
   Time (mean ± σ):     584.3 ms ±  14.3 ms    [User: 533.1 ms, 
System: 51.0 ms]
   Range (min … max):   566.5 ms … 657.9 ms    90 runs

Summary
   './dmd_without_patch sm.d -c -o-' ran
     1.00 ± 0.03 times faster than './dmd_with_patch sm.d -c -o-'
uplink at uplink-black:~/d/dmd-master/dmd(stable)$ hyperfine 
"./dmd_without_patch sm.d -c -o-" "./dmd_with_patch sm.d -c -o-" 
-m 90
Benchmark #1: ./dmd_without_patch sm.d -c -o-
   Time (mean ± σ):     583.4 ms ±  10.5 ms    [User: 529.2 ms, 
System: 54.0 ms]
   Range (min … max):   566.9 ms … 624.0 ms    90 runs

Benchmark #2: ./dmd_with_patch sm.d -c -o-
   Time (mean ± σ):     585.9 ms ±  13.9 ms    [User: 530.5 ms, 
System: 55.2 ms]
   Range (min … max):   565.0 ms … 631.7 ms    90 runs

Summary
   './dmd_without_patch sm.d -c -o-' ran
     1.00 ± 0.03 times faster than './dmd_with_patch sm.d -c -o-'
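
The `1.00 ± 0.03` summary lines can be approximately reproduced from 
the means and standard deviations printed above. The sketch below 
assumes the summary is the ratio of the two means with first-order 
error propagation of the standard deviations; that assumption is 
consistent with the numbers shown here, but it is not taken from the 
hyperfine documentation. The function name `relative_speed` is 
hypothetical.

```python
import math


def relative_speed(mean_slow, sd_slow, mean_fast, sd_fast):
    """Ratio of two mean times and its propagated uncertainty."""
    ratio = mean_slow / mean_fast
    # First-order propagation: relative errors add in quadrature.
    rel_err = math.sqrt((sd_slow / mean_slow) ** 2 + (sd_fast / mean_fast) ** 2)
    return ratio, ratio * rel_err


# (with_patch mean, sd, without_patch mean, sd) in ms, from the output above.
runs = {
    "-version=Walter": (455.1, 10.4, 452.8, 7.5),
    "default, first run": (584.3, 14.3, 583.2, 11.0),
    "default, second run": (585.9, 13.9, 583.4, 10.5),
}

results = {label: relative_speed(*args) for label, args in runs.items()}

for label, (ratio, err) in results.items():
    print(f"{label}: {ratio:.3f} ± {err:.3f}")
```

Because the inputs are the rounded values from the printout, the last 
digit may differ from what hyperfine reports. In all three runs the 
ratio is within about half a percent of 1 with an uncertainty of 
roughly 0.03, so with codegen disabled the difference between the two 
binaries is well inside the measurement noise.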



