Performance test of short-circuiting AliasSeq
Stefan Koch
uplink.coder at googlemail.com
Wed Jun 3 14:52:09 UTC 2020
On Monday, 1 June 2020 at 20:16:55 UTC, Stefan Koch wrote:
> Hi,
>
> So I've asked myself whether the PR
> https://github.com/dlang/dmd/pull/11057
>
> which complicated the compiler internals and got pulled on the
> basis that it would increase performance, actually did increase
> performance.
> So I ran it on a staticMap benchmark which it should speed up.
>
> I am going to use the same benchmark as I used for
> https://forum.dlang.org/post/kktulflpozdrsxeinfbg@forum.dlang.org
>
> First, let me post the results of the benchmark without
> Walter's patch applied.
>
> ---
> Benchmark #1: ./dmd.sh sm.d
> Time (mean ± σ): 436.0 ms ± 14.6 ms [User: 384.0 ms,
> System: 51.8 ms]
> Range (min … max): 413.5 ms … 476.7 ms 40 runs
>
> Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
> Time (mean ± σ): 219.9 ms ± 5.6 ms [User: 210.0 ms,
> System: 10.0 ms]
> Range (min … max): 208.6 ms … 235.9 ms 40 runs
>
> Benchmark #3: ./dmd.sh sm.d -version=Walter
> Time (mean ± σ): 330.4 ms ± 7.1 ms [User: 290.8 ms,
> System: 39.5 ms]
> Range (min … max): 316.6 ms … 345.7 ms 40 runs
>
> Summary
> './dmd.sh sm.d -version=DotDotDot' ran
> 1.50 ± 0.05 times faster than './dmd.sh sm.d
> -version=Walter'
> 1.98 ± 0.08 times faster than './dmd.sh sm.d'
>
> ---
>
> If you care to compare the timings at the bottom they pretty
> much match the results I've measured in the benchmark
> previously and posted in the thread above.
>
> Now let's see how the test performs with the patches applied.
>
> Benchmark #1: ./dmd.sh sm.d
> Time (mean ± σ): 423.5 ms ± 8.9 ms [User: 377.6 ms,
> System: 45.7 ms]
> Range (min … max): 411.0 ms … 444.0 ms 40 runs
>
> Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
> Time (mean ± σ): 231.0 ms ± 4.3 ms [User: 220.3 ms,
> System: 10.9 ms]
> Range (min … max): 223.3 ms … 243.9 ms 40 runs
>
> Benchmark #3: ./dmd.sh sm.d -version=Walter
> Time (mean ± σ): 342.4 ms ± 8.1 ms [User: 306.4 ms,
> System: 36.0 ms]
> Range (min … max): 331.0 ms … 375.1 ms 40 runs
>
> Summary
> './dmd.sh sm.d -version=DotDotDot' ran
> 1.48 ± 0.04 times faster than './dmd.sh sm.d
> -version=Walter'
> 1.83 ± 0.05 times faster than './dmd.sh sm.d'
>
> We see the difference between `...` and Walter's unrolled
> staticMap shrink.
> And we do see a decrease in the time of the divide-and-conquer
> version of staticMap.
>
> However, we also see the mean times of `...` and Walter's
> unrolled staticMap actually increase.
>
> That made me curious and I repeated the measurements with a
> higher repetition count. Just to make sure that this is not a
> spur of the moment thing.
>
> ---- "short-circuit" patch applied.
> uplink at uplink-black:~/d/dmd-master/dmd(manudotdotdot)$
> hyperfine "./dmd.sh sm.d" "./dmd.sh sm.d -version=DotDotDot"
> "./dmd.sh sm.d -version=Walter" -r 90
> Benchmark #1: ./dmd.sh sm.d
> Time (mean ± σ): 425.9 ms ± 13.4 ms [User: 373.7 ms,
> System: 51.9 ms]
> Range (min … max): 409.6 ms … 468.8 ms 90 runs
>
> Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
> Time (mean ± σ): 234.3 ms ± 9.5 ms [User: 224.1 ms,
> System: 10.2 ms]
> Range (min … max): 220.0 ms … 272.1 ms 90 runs
>
> Benchmark #3: ./dmd.sh sm.d -version=Walter
> Time (mean ± σ): 340.6 ms ± 7.1 ms [User: 299.7 ms,
> System: 40.9 ms]
> Range (min … max): 328.9 ms … 359.3 ms 90 runs
>
> Summary
> './dmd.sh sm.d -version=DotDotDot' ran
> 1.45 ± 0.07 times faster than './dmd.sh sm.d
> -version=Walter'
> 1.82 ± 0.09 times faster than './dmd.sh sm.d'
>
> This is consistent with what we got before.
> For good measure (pun intended), I tested the DMD version
> without the patch with an increased repetition count as well.
>
> ---- without "short-circuit" patch:
> uplink at uplink-black:~/d/dmd-master/dmd(manudotdotdot)$
> hyperfine "./dmd.sh sm.d" "./dmd.sh sm.d -version=DotDotDot"
> "./dmd.sh sm.d -version=Walter" -r 90
> Benchmark #1: ./dmd.sh sm.d
> Time (mean ± σ): 428.9 ms ± 11.3 ms [User: 376.2 ms,
> System: 52.3 ms]
> Range (min … max): 412.8 ms … 464.5 ms 90 runs
>
> Benchmark #2: ./dmd.sh sm.d -version=DotDotDot
> Time (mean ± σ): 217.8 ms ± 5.2 ms [User: 208.9 ms,
> System: 9.0 ms]
> Range (min … max): 209.0 ms … 241.6 ms 90 runs
>
> Benchmark #3: ./dmd.sh sm.d -version=Walter
> Time (mean ± σ): 329.9 ms ± 9.4 ms [User: 287.8 ms,
> System: 41.9 ms]
> Range (min … max): 318.7 ms … 364.6 ms 90 runs
>
> Summary
> './dmd.sh sm.d -version=DotDotDot' ran
> 1.51 ± 0.06 times faster than './dmd.sh sm.d
> -version=Walter'
> 1.97 ± 0.07 times faster than './dmd.sh sm.d'
>
> The results seem quite solid.
> At least on the benchmark I have used, "short-circuiting"
> AliasSeq leads to a 4% slowdown for Walter's unrolled staticMap
> and a 7% slowdown for `...`.
>
> I think we should not assume performance improvements
> before measuring.
The old version of staticMap did not show the slowdown
because I had not disabled codegen.
Codegen inefficiencies that occur when emitting an unreasonable
number of symbols tend to hide other problems.
Here is a benchmark which does not rely on our branch
but uses the released dmd 2.092.0.
Enjoy!
uplink at uplink-black:~/d/dmd-master/dmd(stable)$ hyperfine
"./dmd_without_patch sm.d -c -o- -version=Walter"
"./dmd_with_patch sm.d -c -o- -version=Walter" -m 90
Benchmark #1: ./dmd_without_patch sm.d -c -o- -version=Walter
Time (mean ± σ): 452.8 ms ± 7.5 ms [User: 415.5 ms,
System: 37.4 ms]
Range (min … max): 442.2 ms … 483.9 ms 90 runs
Benchmark #2: ./dmd_with_patch sm.d -c -o- -version=Walter
Time (mean ± σ): 455.1 ms ± 10.4 ms [User: 417.3 ms,
System: 37.7 ms]
Range (min … max): 441.5 ms … 489.2 ms 90 runs
Summary
'./dmd_without_patch sm.d -c -o- -version=Walter' ran
1.00 ± 0.03 times faster than './dmd_with_patch sm.d -c -o-
-version=Walter'
uplink at uplink-black:~/d/dmd-master/dmd(stable)$ hyperfine
"./dmd_without_patch sm.d -c -o-" "./dmd_with_patch sm.d -c -o-"
-m 90
Benchmark #1: ./dmd_without_patch sm.d -c -o-
Time (mean ± σ): 583.2 ms ± 11.0 ms [User: 529.9 ms,
System: 53.1 ms]
Range (min … max): 570.0 ms … 631.0 ms 90 runs
Benchmark #2: ./dmd_with_patch sm.d -c -o-
Time (mean ± σ): 584.3 ms ± 14.3 ms [User: 533.1 ms,
System: 51.0 ms]
Range (min … max): 566.5 ms … 657.9 ms 90 runs
Summary
'./dmd_without_patch sm.d -c -o-' ran
1.00 ± 0.03 times faster than './dmd_with_patch sm.d -c -o-'
uplink at uplink-black:~/d/dmd-master/dmd(stable)$ hyperfine
"./dmd_without_patch sm.d -c -o-" "./dmd_with_patch sm.d -c -o-"
-m 90
Benchmark #1: ./dmd_without_patch sm.d -c -o-
Time (mean ± σ): 583.4 ms ± 10.5 ms [User: 529.2 ms,
System: 54.0 ms]
Range (min … max): 566.9 ms … 624.0 ms 90 runs
Benchmark #2: ./dmd_with_patch sm.d -c -o-
Time (mean ± σ): 585.9 ms ± 13.9 ms [User: 530.5 ms,
System: 55.2 ms]
Range (min … max): 565.0 ms … 631.7 ms 90 runs
Summary
'./dmd_without_patch sm.d -c -o-' ran
1.00 ± 0.03 times faster than './dmd_with_patch sm.d -c -o-'
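
For anyone who wants to check the "1.00 ± 0.03" summary lines by hand, here is a minimal sketch in Python. It assumes the uncertainty on the ratio comes from ordinary first-order error propagation of the two standard deviations; that assumption reproduces the figures printed above, but it is not taken from the hyperfine source. The function names are made up for illustration, and the inputs are the rounded means from the first `-c -o-` run of the default staticMap.

```python
import math

def relative_speed(mean_a, sd_a, mean_b, sd_b):
    """How many times faster A ran than B, with a propagated uncertainty.

    Assumes independent measurements and first-order error propagation
    for a quotient: rel_err^2 = (sd_a/mean_a)^2 + (sd_b/mean_b)^2.
    """
    ratio = mean_b / mean_a
    rel_err = math.sqrt((sd_a / mean_a) ** 2 + (sd_b / mean_b) ** 2)
    return ratio, ratio * rel_err

def percent_change(before, after):
    """Change of the mean time in percent; positive means slower."""
    return (after - before) / before * 100.0

# Default staticMap, codegen disabled (-c -o-), 90 runs each.
without_patch = (583.2, 11.0)  # mean ms, sigma ms
with_patch = (584.3, 14.3)

ratio, err = relative_speed(*without_patch, *with_patch)
change = percent_change(without_patch[0], with_patch[0])

print(f"{ratio:.2f} ± {err:.2f} times faster without the patch")
print(f"mean changed by {change:+.2f}%")
```

The difference between the two means is about 1 ms, while either standard deviation is above 10 ms, so with codegen disabled this benchmark cannot tell the two compilers apart.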