Statistics library

dsimcha dsimcha at yahoo.com
Fri Oct 24 06:50:31 PDT 2008


== Quote from Don (nospam at nospam.com.au)'s article
> > Binomial, hypergeometric, normal, Poisson, Kolmogorov CDFs, hypergeometric,
> > Poisson, binomial PDFs.  Inverse normal distribution,
> Most of these are in Tango (not Kolmogorov). Are yours different in some
> way?

Mine calculate the exact log factorial using a caching scheme.  I'm not sure how
much accuracy this actually buys, and it costs some memory.  I should probably
change the logFactorial function to a gamma approximation, at least for large N.

Also, Tango doesn't have hypergeometric.
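
To make the trade-off concrete, here is a minimal sketch in Python (not the D
code from the library).  The names LogFactorialCache, log_choose and
hypergeometric_pmf, and the cutoff value, are assumptions made up for this
illustration.  Small N uses a cached running sum of log(k); large N falls back
to math.lgamma, so the table stops growing.  The hypergeometric PMF is then
built from log binomial coefficients.

```python
import math


class LogFactorialCache:
    """log(n!) from a cached running sum; lgamma above a cutoff (hypothetical design)."""

    def __init__(self, cutoff=10_000):
        self.cutoff = cutoff      # assumed threshold, not taken from the original library
        self._table = [0.0]       # _table[n] holds log(n!), starting with log(0!) = 0

    def __call__(self, n):
        if n < 0:
            raise ValueError("n must be non-negative")
        if n > self.cutoff:
            # Gamma-function route: log(n!) = lgamma(n + 1), no memory cost.
            return math.lgamma(n + 1.0)
        # Extend the cache on demand: log(k!) = log((k-1)!) + log(k).
        while len(self._table) <= n:
            k = len(self._table)
            self._table.append(self._table[-1] + math.log(k))
        return self._table[n]


def log_choose(lf, n, k):
    """log of the binomial coefficient C(n, k), given a log-factorial callable lf."""
    return lf(n) - lf(k) - lf(n - k)


def hypergeometric_pmf(lf, k, N, K, n):
    """P(X = k) when drawing n items without replacement from N, of which K are successes."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0  # outside the support
    log_p = log_choose(lf, K, k) + log_choose(lf, N - K, n - k) - log_choose(lf, N, n)
    return math.exp(log_p)


lf = LogFactorialCache()
p = hypergeometric_pmf(lf, 1, 4, 2, 2)   # C(2,1) * C(2,1) / C(4,2)
```

Comparing lf(n) against math.lgamma(n + 1) over the range of N you care about
is a quick way to see how much accuracy the cache actually buys.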

> > A struct to generate all possible permutations of a sequence.
> >
> > Correlation (Pearson, Spearman rho, Kendall tau).  Note that the Kendall
> > tau correlation is a very efficient O(N log N) version.
> >
> > Mean, standard deviation, variance, kurtosis, percent variance for arrays
> > of numeric values.
> >
> > Shannon entropy, mutual information.
> >
> > Kolmogorov-Smirnov tests
> Sounds good.

This is more the part that I thought might be useful.
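
For anyone wondering how Kendall tau gets down to O(N log N): one common
approach is to sort the pairs by x and count inversions in the resulting y
sequence with a merge sort, since each inversion is one discordant pair.  The
sketch below is in Python, not D, and is not necessarily how the library does
it.  It assumes there are no ties in either x or y; the function names are
made up for this illustration.

```python
def count_inversions(values):
    """Return (sorted copy, number of inversions) using merge sort."""
    n = len(values)
    if n <= 1:
        return list(values), 0
    mid = n // 2
    left, inv_left = count_inversions(values[:mid])
    right, inv_right = count_inversions(values[mid:])
    merged = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left.
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


def kendall_tau_no_ties(x, y):
    """Kendall tau for equal-length sequences, ASSUMING no tied values in x or y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    order = sorted(range(n), key=lambda i: x[i])   # O(N log N) sort by x
    y_by_x = [y[i] for i in order]
    _, discordant = count_inversions(y_by_x)       # O(N log N) inversion count
    total_pairs = n * (n - 1) // 2
    # tau = (concordant - discordant) / total = 1 - 2 * discordant / total
    return 1.0 - 2.0 * discordant / total_pairs


tau = kendall_tau_no_ties([1, 2, 3, 4], [1, 3, 2, 4])   # one discordant pair out of six
```

Handling ties takes extra bookkeeping (counting tied groups in x, in y, and in
both), which this sketch leaves out.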


> > On the other hand, I'm a scientist, not a full-time programmer,
> Me too!
> > Is there any interest in this from others in the D community?  Do other people
> > think that D would benefit from having a decent statistics library?
> Yes. Which is why I put the existing stuff into Tango.

BCS has offered me Scrapple access, so I'll post the code there under a permissive
license.  From there, Tango and Phobos devs can look at it and do as they see fit.




