[core.reflect] showcase fqn

Thu Oct 7 13:16:34 UTC 2021

On Thursday, 7 October 2021 at 10:36:42 UTC, Stefan Koch wrote:
> TLDR; a non-optimized fqn using core.reflect is roughly 4 times 
> faster than the phobos version.
>

I went ahead and did a test on a somewhat bigger (auto-generated) 
testcase.

TLDR: On bigger testcases, where the constant overhead is less of 
a factor, `core.reflect` is roughly `11.5 times` faster.

The goal was to see whether the initial overhead of the phobos 
version pays off once it is run multiple times.

Here are the results:

```
uplink at uplink-black:~/d/dmd(core_reflect)$ hyperfine "generated/linux/release/64/dmd nested_structs.di -o- -version=phobos_fqn" "generated/linux/release/64/dmd nested_structs.di -o- -version=core_reflect" "generated/linux/release/64/dmd nested_structs.di -o- -version=no_fqn"
Benchmark #1: generated/linux/release/64/dmd nested_structs.di -o- -version=phobos_fqn
   Time (mean ± σ):      6.307 s ±  0.155 s    [User: 5.596 s, System: 0.706 s]
   Range (min … max):    6.138 s …  6.637 s    10 runs

Benchmark #2: generated/linux/release/64/dmd nested_structs.di -o- -version=core_reflect
   Time (mean ± σ):     792.2 ms ±  14.7 ms    [User: 716.4 ms, System: 75.3 ms]
   Range (min … max):   777.0 ms … 827.3 ms    10 runs

   Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark #3: generated/linux/release/64/dmd nested_structs.di -o- -version=no_fqn
   Time (mean ± σ):     316.2 ms ±   5.7 ms    [User: 257.9 ms, System: 58.2 ms]
   Range (min … max):   311.5 ms … 329.0 ms    10 runs

Summary
   'generated/linux/release/64/dmd nested_structs.di -o- -version=no_fqn' ran
     2.51 ± 0.07 times faster than 'generated/linux/release/64/dmd nested_structs.di -o- -version=core_reflect'
    19.95 ± 0.61 times faster than 'generated/linux/release/64/dmd nested_structs.di -o- -version=phobos_fqn'
```

Note that the `-version=no_fqn` run doesn't do any fqn computation 
and merely parses the nested structs.
It is there to give you an idea of the constant overhead, which 
does not go away.
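
To put a number on how much that constant overhead weighs, here 
is a small Python sketch. It is not part of the original 
measurements; it only reuses the mean timings from the hyperfine 
output above, and the names in it are illustrative.

```python
# Mean wall-clock times in milliseconds, copied from the hyperfine output.
MEAN_MS = {
    "phobos_fqn": 6307.0,
    "core_reflect": 792.2,
    "no_fqn": 316.2,
}


def overhead_share(variant: str) -> float:
    """Fraction of a variant's mean run time that the no_fqn baseline accounts for."""
    return MEAN_MS["no_fqn"] / MEAN_MS[variant]


# Uncorrected ratio between the two fqn implementations (baseline still included).
raw_ratio = MEAN_MS["phobos_fqn"] / MEAN_MS["core_reflect"]

print(f"raw phobos_fqn / core_reflect ratio: {raw_ratio:.2f}")
for variant in ("core_reflect", "phobos_fqn"):
    print(f"{variant}: {overhead_share(variant):.1%} of the run is baseline")
```

With these inputs the baseline accounts for roughly 40% of the 
core_reflect run but only about 5% of the phobos_fqn run, which is 
why the uncorrected ratio of about 8 understates the difference.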

If we factor that in, we end up with the following numbers.
`baseline: no_fqn_min = 311 ms`, which is essentially the time to 
parse and semantically analyze the file.
```
core_reflect_max = 827.3 ms
phobos_fqn_mean = 6307 ms
```
To adjust for the overhead, we now subtract 311 from both values 
and get:
```
core_reflect_self = 516.3
phobos_fqn_self = 5996
real_speedup = 5996 / 516.3 = ~11.6
```

Which shows that, once the constant overhead is subtracted, 
`core.reflect` is roughly 12 times faster.
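
The same subtraction can be replayed in a few lines of Python. 
This is only a sketch built on the timings from the hyperfine 
output above; the function and constant names are made up for 
illustration. It computes the speedup once with the pairing that 
is pessimistic for `core.reflect` (its slowest run against the 
fastest baseline run) and once with mean against mean.

```python
def corrected_speedup(slow_ms: float, fast_ms: float, baseline_ms: float) -> float:
    """Speedup of the fast variant after subtracting the shared baseline from both."""
    slow_self = slow_ms - baseline_ms
    fast_self = fast_ms - baseline_ms
    if fast_self <= 0:
        raise ValueError("baseline must be smaller than the fast timing")
    return slow_self / fast_self


# Timings in milliseconds, taken from the hyperfine output.
NO_FQN_MIN = 311.0         # fastest baseline run, rounded down from 311.5
NO_FQN_MEAN = 316.2
CORE_REFLECT_MAX = 827.3   # slowest core_reflect run
CORE_REFLECT_MEAN = 792.2
PHOBOS_FQN_MEAN = 6307.0

# Slowest core_reflect run against the phobos_fqn mean.
pessimistic = corrected_speedup(PHOBOS_FQN_MEAN, CORE_REFLECT_MAX, NO_FQN_MIN)
# Mean against mean, with the mean baseline.
mean_based = corrected_speedup(PHOBOS_FQN_MEAN, CORE_REFLECT_MEAN, NO_FQN_MEAN)

print(f"pessimistic for core.reflect: {pessimistic:.1f}x")  # 11.6x
print(f"mean vs. mean:                {mean_based:.1f}x")   # 12.6x
```

Both pairings land in the same ballpark of roughly 12, so the 
conclusion does not hinge on which runs are compared.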

Cheers,
Stefan

P.S. Tests on even larger testcases suggest that the real 
speedup drops down to `11.5`.

In order for you to be able to verify at least the phobos_fqn and 
the no_fqn timings, I have published my testcase in the following 
gist:
https://gist.github.com/UplinkCoder/5acd25168238cac179a5c4ffdf945187

Memory use is `10 times` lower in the absolute measurement, 
and `30 times` lower when corrected with the no_fqn version as 
the baseline.