[OT] The Usual Arithmetic Confusions
Siarhei Siamashka
siarhei.siamashka at gmail.com
Sat Jan 29 12:23:28 UTC 2022
On Saturday, 29 January 2022 at 10:39:10 UTC, Ola Fosheim Grøstad
wrote:
> As for practical problems: In C++ you can still enable signed
> overflow checks. In D you cannot even do that, because in D all
> integer operations are defined to be modular.
Yes, this is a problem and C++ is doing a somewhat better job
here.
Imagine an alternative reality, where processors supported
efficient modular indexing in arrays and had special instructions
for this (specialized DSP processors might have this even now in
some form). So that `a[i % a.length]` has no performance
disadvantages compared to `a[i]`. My guess is that the D language
design in this alternative reality would define arrays indexing
as a modular operation and consider the "memory safety" goal
perfectly achieved (no out of bounds accesses, yay!). Ignoring
the fact that landing on a wrong array index is a source of all
kind of hard to debug problems, leading to wrong computations or
even security vulnerabilities.
> Interestingly, the main complaint about modular arithmetics in
> C++ is not about correctness however, but about poor
> optimizations.
I'm not sure if I can fully agree with that. Correctness is a
major concern in C++.
> As a result, thoughtfully designed C++ libraries avoid using
> unsigned integers in interfaces.
My understanding is that the primary source of unsigned types in
applications is (or at least used to be) the `size_t` type. Which
exists, because a memory buffer may technically span over more
than half of the address space, at least on a 32-bit system. API
functions, which process memory buffers, have to deal with
`size_t` and this is a headache. But now with 64-bit processors,
this is less of an issue and buffer sizes can become signed with
no practical loss of functionality.
> Why are modular arithmetics performing worse than regular
> arithmetics? It is often much more difficult (in some cases
> impossible) to optimize computations that are mapped onto a
> circle than computations that are mapped onto a straight line!
Yes, things are a bit tricky here. Modular arithmetics has some
really nice math properties:
https://math.stackexchange.com/questions/27336/associativity-commutativity-and-distributivity-of-modulo-arithmetic
When doing modular arithmetics, an expression `a + b - c` can be
always safely transformed into `a - c + b`. But if we are dealing
with potential integer overflow traps, then reordering operations
in an expression can't be done safely anymore (will `a + b`
overflow and cause an exception? or will `a - c` underflow and
cause an exception?).
> This is something that D should fix!!
Do you have a good suggestion?
>> I think that it's only a matter of time until processors start
>> adding the missing instructions to make this fast.
>
> No. It isn't crazy slow because of the check. It is crazy slow
> because it prevents optimizations in complex expressions.
>
> In theory you could compute the overflow as a separate
> expression and do speculative computations, then switch to a
> slow path on overflow, but that would be more of a high level
> approach than a system level approach. In low level programming
> the programmer wants the code to map to machine language
> instructions without blowing up the code size in ways that are
> hard to predict. You want some transparency in how the code you
> write maps to the hardware instructions.
We lost this transparency a long time ago. Compilers are allowed
to optimize out big parts of expressions. Integer divisions by a
constant are replaced by multiplications and shifts, etc.
Functions are inlined, loops are unrolled and/or vectorized.
> To make overflow checks really cast you need a much more
> advanced type system with constraints, so that the compiler can
> know what values an integer picked up from the heap can have.
Well, the reality is that this is not just a theoretical
discussion anymore. Trapping of arithmetic overflows already
exists in the existing programming languages. And programming
languages will keep evolving to handle it even better in the
future.
More information about the Digitalmars-d
mailing list