Implementing Half Floats in D

Thu Jan 31 07:38:03 PST 2013

On Thursday, 31 January 2013 at 13:41:13 UTC, Andrei Alexandrescu wrote:
> On 1/31/13 5:18 AM, Don wrote:
>> std.numeric is not superficially flawed, it's fundamentally flawed.
>> What is it for? What is its theme? The problem is, std.numeric is
>> one of the few good names which are left as a possible package
>> name, after C insulted the mathematical community by creating a
>> module called 'math'.
>
> Guilty as charged. I've put stuff in std.numeric as I was 
> working on my thesis. I recall you added some stuff there too. 
> As I'm sure you remember the state of D in 2007 was rather 
> different than that of today. Overall no need to get agitated 
> here, we're all on the same boat and aiming for the same shore.

Sorry if that came across as agitated, it wasn't intended to be.
As you noted, I have code in there as well.
It's just one of those old modules that needs to be cleaned up, 
though it reveals a deeper issue - see below.

> Let's see what we have there:
>
> entropy
> CustomFloat
> kullbackLeiblerDivergence
> Fft
> gapWeightedSimilarityIncremental
> gapWeightedSimilarity
> gapWeightedSimilarityNormalized
> FPTemporary
> findRoot
> euclideanDistance
> dotProduct
> cosineSimilarity
> gcd
> jensenShannonDivergence
> normalize
> secantMethod
>
> The general theme is obvious - numeric algorithms and data 
> structures. Many are obvious and with obvious utility to one 
> interested in numerics: entropy, various distance and 
> similarity measures. I think you wrote findRoot.

Yes.
The basic problem is that there are hundreds of potential numeric 
algorithms and data structures that are just as important as these. 
In fact, mathematical algorithms probably make up a substantial 
fraction of all the algorithms in computer science!

Even a module that contained only FFT could be quite large 
once it included all the important related transforms.

> The gapWeightedSimilarity algorithms are string kernels. They 
> are somewhat niche but quite powerful to anyone interested in 
> string similarity (technically they are string edit distance on 
> steroids). They might belong in std.string but I figured they 
> have enough numeric algorithm flavor to put them in there.
>
> So let's itemize the grievances and see how we can sort this 
> out.

I'm not sure that we can solve this without addressing the 
high-level question: What is the scope of Phobos?

How big will it eventually get? Twice its current size? Ten 
times? A hundred times?

Both SmallPhobos and LargePhobos are reasonable, but we do have 
to pick one. Currently we have aspects of both approaches, but 
they aren't compatible.

The current approach of putting everything directly into a single 
level in std doesn't scale very far -- it will look very clumsy 
once it grows to more than (say) three times its current size. 
This argues for SmallPhobos.

But unless it grows to at least ten times its current size, some 
of this niche stuff shouldn't be in there; those functions belong 
in LargePhobos. If we go with SmallPhobos, then we need to move 
the niche stuff somewhere else.