Implement the "unum" representation in D ?

deadalnix via Digitalmars-d digitalmars-d at puremagic.com
Wed Sep 16 12:21:57 PDT 2015


On Wednesday, 16 September 2015 at 14:11:04 UTC, Ola Fosheim 
Grøstad wrote:
> On Wednesday, 16 September 2015 at 08:38:25 UTC, deadalnix 
> wrote:
>> The energy comparison is bullshit. As long as you haven't 
>> loaded the data, you don't know how wide they are. Meaning you 
>> need either to go pessimistic and load for the worst-case 
>> scenario or do two round trips to memory.
>
> That really depends on memory layout and algorithm. A likely 
> implementation would be a co-processor that would take a unum 
> stream and then pipe it through a network of cores (tile based 
> co-processor). The internal busses between cores are very very 
> fast and with 256+ cores you get tremendous throughput. But you 
> need a good compiler/libraries and software support.
>

No, you don't, because the streamer still needs to load the unums 
one by one. Maybe two by two with a fair amount of hardware 
speculation (which means you are already trading energy for 
performance, so the energy argument is weak). There is no way 
you can feed 256+ cores that way.
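
To make the "one by one" point concrete, here is a minimal sketch in Python. It assumes a toy encoding (a 1-byte length header followed by that many payload bytes), which is NOT the real unum bit layout; the names fixed_offset and variable_offsets are made up for illustration. It only shows that the start of value N cannot be computed until the headers of all earlier values have been read, whereas for fixed-width values it is a plain multiplication.

```python
# Toy model, NOT the real unum layout (assumption for illustration):
# each value is a 1-byte header holding its payload length in bytes,
# followed by that many payload bytes.

def fixed_offset(index, width):
    """Fixed-width values: the address is known without reading any data."""
    return index * width

def variable_offsets(stream):
    """Variable-width values: each start is only known after reading the
    previous value's header, so the walk is inherently sequential."""
    offsets = []
    pos = 0
    while pos < len(stream):
        offsets.append(pos)
        payload_len = stream[pos]  # must load this byte before moving on
        pos += 1 + payload_len
    return offsets

stream = bytes([
    2, 0xAA, 0xBB,            # value 0: 2 payload bytes
    1, 0xCC,                  # value 1: 1 payload byte
    3, 0x01, 0x02, 0x03,      # value 2: 3 payload bytes
])

offsets = variable_offsets(stream)
print(offsets)             # [0, 3, 5]
print(fixed_offset(2, 4))  # 8
```

A hardware streamer can speculate on several candidate start positions at once, but that is the extra work (and energy) I am talking about.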

To give you a similar example, x86 decoding is often the 
bottleneck on an x86 CPU. The number of ALUs in x86 over the past 
decade decreased rather than increased, because you simply can't 
decode fast enough to feed them. Yet x86 CPUs have 64-way 
speculative decoding as a first stage.

>> The hardware is likely to be slower as you'll need way more 
>> wiring than for regular floats, and wire is not only cost, but 
>> also time.
>
> You need more transistors per ALU, but slower does not matter 
> if the algorithm needs bounded accuracy or if it converges more 
> quickly with unums. The key challenge for him is to create a 
> market, meaning getting the semantics into scientific software 
> and getting initial workable implementations out to scientists.
>
> If there is a market demand, then there will be products. But 
> you need to create the market first. Hence he wrote an 
> easy-to-read book on the topic and supports people who want to 
> implement it.

The problem is not transistors, it is wire. Because the damn thing 
is variable-width in every way, pretty much any input bit can end 
up anywhere in the functional unit. That is a LOT of wire.


