Statistics library

Don nospam at nospam.com.au
Sat Oct 25 06:00:00 PDT 2008


dsimcha wrote:
> == Quote from Don (nospam at nospam.com.au)'s article
>>> Binomial, hypergeometric, normal, Poisson, Kolmogorov CDFs, hypergeometric,
>>> Poisson, binomial PDFs.  Inverse normal distribution,
>> Most of these are in Tango (not Kolmogorov). Are yours different in some
>> way?
> 
> They calculate the exact log factorial using a caching scheme.  Not sure how much
> accuracy this actually buys, though it costs some memory.  I should probably
> change the logFactorial function to a gamma approximation at least for large N.

That shouldn't be necessary. If logGamma() isn't giving an accurate 
factorial (within a couple of bits of precision), that's a problem with 
logGamma. Please generate a bug report.

> Also, Tango doesn't have hypergeometric.

You're right. It's still on my hard disk; I wasn't quite happy with it.

> 
>>> A struct to generate all possible permutations of a sequence.
>>>
>>> Correlation (Pearson, Spearman rho, Kendall tau).  Note that the Kendall
>>> tau correlation is a very efficient O(N log N) version.
>>>
>>> Mean, standard deviation, variance, kurtosis, percent variance for arrays of
>>> numeric values.
>>>
>>> Shannon entropy, mutual information.
>>>
>>> Kolmogorov-Smirnov tests
>> Sounds good.
> 
> This is more the part that I thought might be useful.
> 
> 
>>> On the other hand, I'm a scientist, not a full-time programmer,
>> Me too!
>>> Is there any interest in this from others in the D community?  Do other people
>>> think that D would benefit from having a decent statistics library?
>> Yes. Which is why I put the existing stuff into Tango.
> 
> BCS has offered me Scrapple access, I'll post the code there under a permissive
> license.  From there, Tango and Phobos devs can look at it and do as they see fit.
Cool!


More information about the Digitalmars-d-announce mailing list