[OT] The Usual Arithmetic Confusions

Ola Fosheim Grøstad ola.fosheim.grostad at gmail.com
Sun Jan 30 10:51:28 UTC 2022


On Saturday, 29 January 2022 at 12:23:28 UTC, Siarhei Siamashka 
wrote:
> performance disadvantages compared to `a[i]`. My guess is that 
> the D language design in this alternative reality would define 
> arrays indexing as a modular operation and consider the "memory 
> safety" goal perfectly achieved (no out of bounds accesses, 
> yay!).

Yes, the focus on memory safety is too narrow and can be 
counterproductive.


> I'm not sure if I can fully agree with that. Correctness is a 
> major concern in C++.

It is a concern, but the driving force for switching from 
unsigned int to signed int appears to be more about enabling 
optimization. At least that is my impression.

Another issue is that having multiple integer types can lead to 
multiple instantiations of the same templates, which is kinda 
pointless.


> My understanding is that the primary source of unsigned types 
> in applications is (or at least used to be) the `size_t` type. 
> Which exists, because a memory buffer may technically span over 
> more than half of the address space, at least on a 32-bit 
> system.

Yes, *sizeof* returns *size_t*, which is unsigned. I think that 
has more to do with history than with practical programming. But 
there is no reason for modern containers to return unsigned (or 
rather, modular) integers. Well, other than being consistent with 
STL, but I am not sure why that would be important.
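
Just to illustrate the point (a minimal sketch, nothing 
authoritative), the classic pitfall with a modular `.length` 
looks like this in D:

```d
import std.stdio : writeln;

void main()
{
    int[] a = [];
    // a.length is size_t (unsigned/modular), so this wraps to
    // size_t.max instead of the -1 a signed length would give:
    writeln(a.length - 1);  // 18446744073709551615 on 64-bit

    // the classic bug: this condition holds for an empty array
    // if (i < a.length - 1) ...
}
```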

>> This is something that D should fix!!
>
> Do you have a good suggestion?

How many programs rely on signed wrap-around? Probably none. You 
could just make a breaking change and provide a compiler flag for 
getting the old behaviour, plus another compiler flag for 
trapping on overflow.
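
For what it is worth, druntime already ships checked primitives 
that such a trapping mode could build on. A small sketch of what 
explicit checking looks like today:

```d
import core.checkedint : adds, muls;
import std.stdio : writeln;

void main()
{
    bool overflow = false;  // sticky: set on overflow, never cleared
    int x = adds(int.max, 1, overflow);  // wraps, but flags it
    writeln(x, " overflowed: ", overflow);

    overflow = false;
    int y = muls(100_000, 100_000, overflow);
    writeln(y, " overflowed: ", overflow);
}
```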

> We lost this transparency a long time ago. Compilers are 
> allowed to optimize out big parts of expressions. Integer 
> divisions by a constant are replaced by multiplications and 
> shifts, etc. Functions are inlined, loops are unrolled and/or 
> vectorized.

But you can control most of those with hints in the source code 
or with compilation flags.
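
For example (just a sketch; the exact flag names differ between 
dmd, ldc2 and gdc), inlining can already be steered per function 
from the source:

```d
// pragma(inline, ...) is a per-function hint in standard D
pragma(inline, true)  int hot(int x)  { return x * 2; }  // please inline
pragma(inline, false) int cold(int x) { return x * 2; }  // please don't

void main()
{
    // the coarse-grained equivalents are compilation flags,
    // e.g. dmd -O -inline, or ldc2 -O2
    auto r = hot(21) + cold(21);
}
```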

There is a big difference between optimizations that lead to 
faster code (or at least code with consistent performance) and 
code generation that leads to uneven performance. Sometimes 
hardware is bad at consistent performance too, e.g. computations 
with float values near zero (denormal numbers), which is very 
unpopular among realtime/audio programmers. You don't want the 
compiler to add more such issues.
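
A tiny sketch of how such values arise (the actual slowdown is 
hardware dependent):

```d
import std.math : isSubnormal;
import std.stdio : writeln;

void main()
{
    // halving the smallest normal double gives a subnormal
    // (denormal) value; arithmetic on these is much slower on
    // many CPUs
    double tiny = double.min_normal / 2;
    writeln(tiny, " subnormal: ", isSubnormal(tiny));
}
```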


> Well, the reality is that this is not just a theoretical 
> discussion anymore. Trapping of arithmetic overflows already 
> exists in the existing programming languages. And programming 
> languages will keep evolving to handle it even better in the 
> future.

Yes, but trapping of overflows has always existed in languages 
geared towards higher-level programming. C and its descendants 
are the outliers.

It is true, though, that processor speed and branch prediction 
have made it more attractive also for those who aim at 
lower-level programming.

The best solution for a modern language is probably to:

1. Improve the type system so that the compiler can more often 
prove that overflow can never happen for an expression. This can 
also lead to better optimizations.

2. Make signed overflow checks the default, but provide an inline 
annotation to disable it.
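
Point 2 already has a rough library-level approximation in 
`std.checkedint` (formerly std.experimental.checkedint); a 
sketch, not a claim that this is the right language design:

```d
import std.checkedint : checked, Checked, Throw;
import std.stdio : writeln;

void main()
{
    auto a = checked(int.max);  // default hook Abort: overflow aborts
    // a += 1;                  // would abort at runtime, not wrap
    writeln(a.get);

    auto b = Checked!(int, Throw)(int.max);
    try {
        b += 1;                 // Throw hook raises on overflow
    } catch (Exception e) {
        writeln("caught: ", e.msg);
    }
}
```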

I think that optimizing all code paths for performance is in 
general kinda pointless. Usually the performance-critical parts 
are limited to a small set of functions.



