Differences in results when using the same function in CTFE and Runtime
Quirin Schroll
qs.il.paperinik at gmail.com
Sat Aug 17 16:33:15 UTC 2024
On Thursday, 8 August 2024 at 10:31:32 UTC, Carsten Schlote wrote:
> Hi
>
> I'm playing with CTFE in D. This feature allows for a lot of
> funny things, e.g. initialisation of immutable data at compile
> time with the result of some other function (template).
>
> As a result I get immutable result blobs compiled into the
> binary. But none of the generating code, because it was already
> executed by CTFE.
>
> This worked nicely for several other use cases as well. Until
> now, the results of CTFE and RT were always the same. As expected.
>
> However, yesterday a unit test started to report that the
> results created by the same code with the same parameters differ
> when run in CTFE mode or at runtime.
>
> ```D
> static immutable ubyte[] burningShipImageCTFE =
>     generateBurningShipImage(twidth, theight, maxIter);
> immutable ubyte[] burningShipImageRT =
>     generateBurningShipImage(twidth, theight, maxIter);
> assert(burningShipImageCTFE == burningShipImageRT,
>     "Same results expected.");
> ```
> I diffed the pictures, and indeed some of the pixels in the more
> complex areas of the Burning Ship fractal were clearly and
> noticeably different.
>
> OK, the fractal code uses `double` floats, which are by their
> very nature limited in precision. But assuming that the math
> emulation of CTFE works identically to what the CPU does at
> runtime, the outcome should be identical.
>
> Or not, in some cases ;-) E.g. with a fractal equation, where
> the smallest changes can result in big differences.
>
> And it opens up some questions:
>
> - Can CTFE be used under all circumstances when floating-point
> numbers of any precision are involved?
> - Or is this some kind of expected behaviour whenever floats
> are involved?
> - Does the D CTFE documentation fully cover such possible
> issues?
>
> I can imagine that bugs caused by such subtle differences might
> be very difficult to fix.
>
>
> Any experiences or thoughts on this?
Experience: little, as I don't do floating-point work
professionally. But I know my stuff, because in the past, for
some years, I did the lecture assistance for a numerical
programming course.
The normal use case for floating-point doesn't demand perfectly
reproducible results across different optimization levels.
However, differences between CTFE and RT are indeed unacceptable
for core-language operations; those are bugs. Of course, results
in user code can differ between CTFE and RT due to using `__ctfe`
incorrectly, as in the sketch below. It might be noteworthy that
C++ (at least up to and including C++17) does not allow
floating-point types as non-type template parameters, even though
it does allow floating-point arithmetic in `constexpr` execution,
and my suspicion is that reproducibility is the reason.
Maybe the solution is the same: remove floating-point operations
from CTFE, or at least the ones that could differ from RT. It
would be awkward, at least in some people's opinion, because that
would mean that in CTFE, only `real` is available, despite it
being implementation-defined (it's not just platform-dependent,
it's also how expressions are interpreted), while `double` and
`float` seem exactly defined. The reason is that `real` can't be
optimized to higher precision, as it is by definition the
highest-precision format supported by the platform, whereas for
the smaller formats, RT results may differ between optimization
levels. What the compiler could do, however, is replace
`a + b * c` by a fused multiply-add. As I read the spec, if it
does that consistently across CTFE and RT, it would be allowed
to, even for `real`.
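A sketch of the difference such contraction makes, using the C
runtime's `fma` from `core.stdc.math`, which is genuinely fused
on common platforms:

```D
import core.stdc.math : fma;
import std.stdio : writefln;

void main()
{
    double x = 1.0 + 0x1p-27;        // 1 + 2^-27, exactly representable
    double separate = x * x - 1.0;   // x * x is rounded first; low bits lost
    double fused = fma(x, x, -1.0);  // one rounding at the end; low bits kept
    writefln("separate: %a", separate); // 0x1p-26
    writefln("fused:    %a", fused);    // 0x1.0000001p-26 -- differs
}
```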
The reason D specifies floating-point operations less precisely
is to allow for optimizations. Generally speaking, optimizations
require some leeway in the spec. Optimizations are also required
not to change observable behavior, but what counts as observable
is again up to the spec. In C++, the compiler is allowed to
optimize away copies, even if those would invoke a copy
constructor that has observable side effects. As I see it (not my
opinion, just what I see and conclude), D's spec treats
differences in floating-point results that stem from carrying out
operations in higher-than-required precision as not an observable
side effect, i.e. not one that the optimizer must preserve, even
if you can practically observe a difference.
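A contrived sketch of such a practically observable difference;
it assumes x86, where `real` is the 80-bit format, and depending
on optimization, even the `narrow` line may legally be computed
in higher precision, which is exactly the leeway in question:

```D
import std.stdio : writefln;

void main()
{
    double a = 1.0e308, b = 1.0e308;

    // Rounded to double after each step: a * b overflows to infinity.
    double narrow = a * b / 1.0e308;

    // Kept in 80-bit real: the intermediate 1e616 fits, and the
    // division brings the value back into double range.
    double wide = cast(double)(cast(real) a * b / 1.0e308);

    writefln("narrow: %g", narrow); // inf
    writefln("wide:   %g", wide);   // 1e+308
}
```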
The reason for that is probably that Walter didn't like how other
languages nailed down floating-point operations so that you'd get
both less precise results *and* worse performance. That would,
for example, be the case on an 80387 coprocessor, and (here's
where my knowledge ends) it's probably also true on basically all
hardware today if you consider `float` specifically. I know of no
hardware that supports single precision but not double precision.
Giving you double precision instead of single is at least
basically free, possibly even a performance boost, while also
giving you more precision.
An algorithm like Kahan summation must be implemented in a way
that takes those optimizations into account; see the sketch below.
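For reference, a minimal Kahan summation; the comments mark the
steps an optimizer could legally break by re-associating or
widening:

```D
double kahanSum(const(double)[] values)
{
    double sum = 0.0;
    double c = 0.0;           // running compensation for lost low-order bits
    foreach (v; values)
    {
        double y = v - c;     // corrected next term
        double t = sum + y;   // low-order bits of y are lost here...
        c = (t - sum) - y;    // ...and recovered algebraically; a compiler
                              // that re-associates folds this to zero
        sum = t;
    }
    return sum;
}
```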
This is exactly like in C++, where signed integer overflow is
undefined not because it's undefined on the hardware, but because
it allows for optimizations. In Zig, all integer overflow is
undefined for that reason, and for wrap-around or saturating
arithmetic, there are separate operators. D could easily add
specific functions to `core.math` that specify operations as
strictly IEEE-754 conforming. Using those, Phobos could give you
types that are specified to produce results exactly as specified
by IEEE-754, with no interference by the optimizer; a
hypothetical sketch follows.
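In the sketch, `ieeeAdd` and `ieeeMul` are stand-ins for
compiler-recognized intrinsics that do not exist today, with stub
bodies only to make the sketch compile:

```D
// Stand-in stubs: the real intrinsics would guarantee one correctly
// rounded IEEE-754 operation with no widening or fusing.
double ieeeAdd(double a, double b) { return a + b; }
double ieeeMul(double a, double b) { return a * b; }

struct IeeeDouble
{
    double value;

    IeeeDouble opBinary(string op : "+")(IeeeDouble rhs) const
    {
        return IeeeDouble(ieeeAdd(value, rhs.value));
    }

    IeeeDouble opBinary(string op : "*")(IeeeDouble rhs) const
    {
        return IeeeDouble(ieeeMul(value, rhs.value));
    }
}
```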
You can't actually do the reverse, i.e. provide a type in Phobos
that allows for optimizations of that sort while the
core-language types are guaranteed to be unoptimized. Such a type
would have to be compiler-recognized, i.e. it would end up being
a built-in type.