Collect Statistics efficiently and easily

Brett Brett at gmail.com
Thu Sep 19 04:23:48 UTC 2019


On Tuesday, 17 September 2019 at 14:06:41 UTC, Paul Backus wrote:
> On Tuesday, 17 September 2019 at 01:53:39 UTC, Brett wrote:
>> Many times I have to get statistical info, which simply means 
>> computing statistics on a data set that may still be generating 
>> or is already generated.
>>
>> The code usually is
>>
>> M = max(M, v);
>> m = min(m, v);
>>
>> but other things like standard deviation, mean, etc might need 
>> to be computed.
>>
>> This may need to be done on several data sets simultaneously.
>>
>> is there any way that one could just compute them in one line 
>> that is efficient, probably using ranges? I'd like to avoid 
>> having to loop through a data set multiple times as it would 
>> be quite inefficient.
>
> You can use `std.algorithm.fold` to compute multiple results in 
> a single pass:
>
> auto stats = v.fold!(max, min);
> M = stats[0];
> m = stats[1];

That may work, but I'm already iterating and doing the computation 
inside a loop.

What I'm specifically talking about is abstracting the 
computation of each kind of statistic.
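
As a rough illustration of that kind of abstraction, here is a minimal sketch of an accumulator that can be fed from inside an existing loop. `RunningStats` and its member names are hypothetical (nothing like it is assumed to exist in Phobos), and the mean and variance use Welford's online update, which is one common choice rather than the only one.

```d
import std.algorithm.comparison : max, min;
import std.math : sqrt;
import std.stdio : writeln;

/// Hypothetical accumulator, not part of Phobos.
struct RunningStats
{
    size_t count;
    double minimum = double.infinity;
    double maximum = -double.infinity;
    double mean = 0;
    private double m2 = 0; // running sum of squared deviations from the mean

    /// Feed one sample (Welford's online update).
    void put(double x)
    {
        ++count;
        minimum = min(minimum, x);
        maximum = max(maximum, x);
        immutable delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    /// Sample variance; NaN until at least two samples have been seen.
    double variance() const
    {
        return count > 1 ? m2 / (count - 1) : double.nan;
    }

    double stdDev() const
    {
        return sqrt(variance());
    }
}

void main()
{
    RunningStats a, b; // one accumulator per data set
    foreach (i; 0 .. 5)
    {
        a.put(i);     // samples 0, 1, 2, 3, 4
        b.put(i * i); // samples 0, 1, 4, 9, 16
    }
    writeln(a.minimum, " ", a.maximum, " ", a.mean, " ", a.stdDev);
    writeln(b.minimum, " ", b.maximum, " ", b.mean, " ", b.stdDev);
}
```

Because each accumulator is independent, several data sets can be tracked in the same loop with a single pass over the data.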

If I were to convert my algorithm to a range, then maybe I could 
do something similar to what you are saying, but I would still 
need more than min and max (such as avg, std, and others).
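
For the range-based route, a sketch of how `fold` might be extended beyond min and max, assuming the data is available as a `double[]`: the sum and the sum of squares are folded in the same pass, and the mean and standard deviation are derived afterwards. The sample values are made up for illustration.

```d
import std.algorithm.comparison : max, min;
import std.algorithm.iteration : fold;
import std.math : sqrt;
import std.stdio : writeln;

void main()
{
    double[] data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0];

    // One seed per function, in the same order as the functions.
    // Explicit seeds matter here: without them the first element would
    // seed the sum of squares unsquared.
    auto r = data.fold!(max, min, "a + b", "a + b * b")
                       (-double.infinity, double.infinity, 0.0, 0.0);

    immutable n = data.length;
    immutable mean = r[2] / n;
    // Population variance via E[x^2] - E[x]^2. Simple, but it can lose
    // precision when the mean is large relative to the spread.
    immutable variance = r[3] / n - mean * mean;

    writeln("max=", r[0], " min=", r[1],
            " mean=", mean, " std=", sqrt(variance));
}
```

This still visits each element only once, at the cost of the less numerically robust variance formula.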

It may be viable, but I'll have to think about it. I tend to find 
myself writing the same abstract code to compute the same 
statistics quite often (sometimes it deals with a history and 
sometimes not; e.g., I might want to compute the average and keep 
the last 5, or the 5 largest).
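
The history case can be sketched along the same lines. The example below assumes a fixed `k` of 5, keeps the most recent samples in a small ring buffer, and keeps the largest samples in a `BinaryHeap` used as a min-heap; the variable names and sample values are made up for illustration.

```d
import std.array : array;
import std.container.binaryheap : heapify;
import std.stdio : writeln;

void main()
{
    enum k = 5;
    double[k] recent; // ring buffer holding the last k samples
    size_t seen;

    // Min-heap over a fixed k-element store, initially empty. Its front
    // is the smallest of the values kept so far.
    auto largest = heapify!"a > b"(new double[k], 0);

    double sum = 0;
    foreach (x; [3.0, 9.0, 1.0, 7.0, 8.0, 2.0, 6.0, 5.0])
    {
        sum += x;
        recent[seen++ % k] = x; // overwrite the oldest slot
        // Inserts while there is room; once full, replaces the front
        // only when x is larger than it.
        largest.conditionalInsert(x);
    }

    writeln("average: ", sum / seen);
    writeln("last ", k, " (in buffer order): ", recent[]);
    writeln(k, " largest: ", largest.dup.array);
}
```

The heap is copied with `dup` before being read out so that the original stays usable if more samples arrive later.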

More information about the Digitalmars-d-learn mailing list