D Mir: standard deviation speed

tastyminerals tastyminerals at gmail.com
Wed Jul 15 05:57:56 UTC 2020


On Tuesday, 14 July 2020 at 19:36:21 UTC, jmh530 wrote:
> On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:
>>   [...]
>
> It would be helpful to provide a link.
>
> You should only need one accumulator for the mean and the 
> centered sum of squares. See the Python example under the 
> Welford section here:
> https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
> The extra accumulator may have broken the optimization somehow.
>
> variance and standardDeviation were recently added to 
> mir.math.stat. They have the option to switch between Welford's 
> algorithm and the others. What you call the naive algorithm is 
> VarianceAlgo.twoPass, and the Welford algorithm can be toggled 
> with VarianceAlgo.online, which is the default option. It would 
> also be interesting if you redid the analysis with the built-in 
> mir functions.
>
> There are some other small differences between your 
> implementation and the one in mir, beyond the issue discussed 
> above. You take the absolute value before the square root and 
> force the use of sum!"fast". Another difference is that 
> VarianceAlgo.online in mir uses a precise calculation of the 
> mean rather than the fast update that Welford uses. This may 
> have a modest impact on performance, but should provide more 
> accurate results.

OK, the wiki page looks more informative; I shall take another 
look at my Welford implementation.
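
For reference, a single-accumulator version along the lines of 
the wiki pseudocode would look roughly like this in plain D (an 
untested sketch computing the sample standard deviation, not the 
exact code from my repo):

import std.math : sqrt;

// Single-pass Welford: one running mean plus one centered sum of
// squares (m2), instead of a separate accumulator per statistic.
double welfordStd(const double[] xs)
{
    double mean = 0.0;
    double m2 = 0.0; // sum of squared deviations from the current mean
    size_t n = 0;
    foreach (x; xs)
    {
        ++n;
        immutable delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean); // note: uses the updated mean
    }
    return n > 1 ? sqrt(m2 / (n - 1)) : 0.0;
}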

I've just tried standardDeviation from Mir, and it performed even 
worse than both of the examples above.
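
For the record, the call looks something like this (a simplified 
sketch, not the exact benchmark code; the module layout and the 
VarianceAlgo overload are as I understand them from the mir docs, 
so treat those details as assumptions):

import mir.ndslice.slice : sliced;
import mir.ndslice.topology : flattened;
import mir.math.stat : standardDeviation, VarianceAlgo;

void main()
{
    // toy 2x3 matrix; the benchmark just uses bigger random ones
    auto m = [1.0, 2, 3, 4, 5, 6].sliced(2, 3);

    // default algorithm (VarianceAlgo.online, the Welford-style one)
    auto sdOnline = m.flattened.standardDeviation;

    // explicitly request the two-pass algorithm for comparison
    auto sdTwoPass = m.flattened.standardDeviation!(VarianceAlgo.twoPass);
}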

Here is the (WIP) project as it stands now; see line 160 in 
https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d

std of [60, 60] matrix 0.0389492 (> 0.001727)
std of [300, 300] matrix 1.03592 (> 0.043452)
std of [600, 600] matrix 4.2875 (> 0.182177)
std of [800, 800] matrix 7.9415 (> 0.345367)


