Differences in results when using the same function in CTFE and Runtime

Quirin Schroll qs.il.paperinik at gmail.com
Tue Aug 13 11:02:07 UTC 2024


On Monday, 12 August 2024 at 11:06:15 UTC, IchorDev wrote:
> On Monday, 12 August 2024 at 10:22:52 UTC, Quirin Schroll wrote:
>> On almost all non-embedded CPUs, doing non-vector calculations 
>> in `float` is more costly than doing them in `double` or 
>> `real`, because scalar `float` arguments are converted 
>> to `double` or `real` anyway. I consider `float` a type for 
>> storing values in arrays that don’t need the precision, 
>> saving me half the RAM.
>
> I don’t care. Only one family of CPU architectures supports 
> ‘extended precision’ floats (because it’s a waste of time), so 
> I would like to know a way to always perform calculations with 
> double precision for better cross-platform consistency. Imagine 
> trying to implement JRE without being able to do native double 
> precision maths.

I honestly don’t know how the JRE implemented `double` operations 
on, e.g., the Intel 80486, but if I compile with `gcc -mfpmath=387 
-O3` and add some `double` values, intermediate results are 
stored to memory and loaded back to round them down to 64 bits. 
Very inefficient. If I use `long double`, that does not happen.

The assertion that only one CPU family supports extended-precision 
floats is objectively wrong. You probably mean x86 with its 80-bit 
format, which is still noteworthy, as x86 is very, very common. 
However, at least the POWER9 family supports 128-bit IEEE-754 
quadruple-precision floats in hardware. IIUC, RISC-V also supports 
them (the Q extension).
