[OT] The Usual Arithmetic Confusions

Siarhei Siamashka siarhei.siamashka at gmail.com
Mon Feb 21 05:40:58 UTC 2022


On Friday, 18 February 2022 at 10:13:42 UTC, Timon Gehr wrote:
> On 18.02.22 09:05, Walter Bright wrote:
>> If you've got an array length longer than int.max,
>
> Seems I likely won't have that (compiled with -m32):
>
> ```d
> void main(){
>     import core.stdc.stdlib;
>     import std.stdio;
>     writeln(malloc(size_t(int.max)+1)); // null (int.max works)
>     auto a=new ubyte[](int.max); // out of memory error
> }
> ```

This is either an OS configuration issue (you are exceeding a 
per-process limit for one of the resources) or there may indeed 
be some bugs in the D library's handling of large arrays.

Squeezing as much memory as possible out of 32-bit systems is 
already ancient history. Server admins used to tweak various 
things, such as PAE or the 3.5G/0.5G user/kernel address space 
split, etc. But none of this really matters anymore, because all 
memory-hungry servers moved to 64-bit hardware a long time ago.

And before that, there were things such as EMS and HIMEM on 
ancient 16-bit MS-DOS systems, used to make as much memory as 
possible available to applications. But modern D compilers can't 
even generate 16-bit code, and nobody cares today. I guess some 
or even many people in this forum were born after this stuff 
became obsolete.

Personally, I'm not going to miss anything if array sizes change 
to a signed type and int.max (or ssize_t.max) becomes the 
official array size limit on 32-bit systems.

> Most of that is not too helpful as it's not exposed by the 
> language. (At least in D, signed arithmetic actually has 2-s 
> complement semantics, but the hardware has some features to 
> make dealing with 2-s complement convenient that are not really 
> exposed by the programming language.)

The hardware only provides the flags register, which can be 
checked for overflow after arithmetic operations. In D this 
functionality is exposed by 
https://dlang.org/phobos/core_checkedint.html, but I wouldn't 
call it convenient. The flag checks in assembly are not 
convenient either.
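
For illustration, here's a minimal sketch of what using 
core.checkedint looks like in practice; the sticky "overflow" 
flag has to be threaded through every operation by hand, which 
is exactly what makes it clunky:

```d
void main()
{
    import core.checkedint : adds, muls;
    import std.stdio : writeln;

    bool overflow = false;
    // adds() returns the wrapped sum and sets the flag on overflow
    int sum = adds(int.max, 1, overflow);
    writeln(sum, " overflow=", overflow); // -2147483648 overflow=true

    overflow = false;
    // muls() works the same way for multiplication
    int product = muls(100_000, 100_000, overflow);
    writeln(product, " overflow=", overflow); // wrapped value, true
}
```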

> In any case, I can get it right, the scenario I had in mind is 
> competent programmers having to spend time debugging a weird 
> issue and then ultimately fix some library dependency that 
> silently acquires funky behavior once the data gets a bit 
> bigger than what's in the unit tests because the library 
> authors blindly followed a `ptrdiff_t` recommendation they once 
> saw in the forums.

I still think that support for trapping arithmetic overflows at 
runtime is a reasonable solution. It can catch a lot of bugs 
that are very hard to track down by other means.

For example, there are not too many signed arithmetic overflows 
in phobos. Getting the phobos unit tests to pass with signed 
overflow trapping enabled (the "-ftrapv" option in GDC) only 
requires minor patches in a few places:

   https://github.com/ssvb/gcc/commits/gdc-ftrapv-phobos-20220209
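
As a sketch of what "-ftrapv" changes at runtime (assuming a GDC 
build that supports it, as in the branch above): the following 
program silently prints a wrapped value when compiled normally, 
but with "-ftrapv" the signed overflow is trapped and the 
program aborts instead of continuing with a garbage value.

```d
void main()
{
    import std.stdio : writeln;

    int x = int.max;
    x += 1; // signed overflow: wraps to int.min with default codegen
    writeln(x); // prints -2147483648 without -ftrapv; traps with it
}
```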

Most of the affected places in phobos are already marked with 
cautionary comments ("beware of negating int.min", "there was an 
overflow bug, here's a link to bugzilla", etc.). A history of 
blood and suffering is recorded there.
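
The "negating int.min" trap mentioned in those comments is easy 
to demonstrate: in 32-bit two's complement, int.min has no 
positive counterpart, so negating it silently produces int.min 
again.

```d
void main()
{
    import std.stdio : writeln;

    int x = int.min;
    writeln(-x);      // -2147483648 again, not 2147483648
    writeln(-x == x); // true: the negation wrapped around
}
```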

Some people may think that having (signed) arithmetic overflow 
defined to wrap around is a useful feature of the D language, 
and that some software relies on it to do something useful. But 
I don't see any real evidence of that. Most silent arithmetic 
overflows look like undesirable bugs that simply haven't been 
detected yet. The next step is probably to see how many changes 
are needed in the compiler frontend code to make it "-ftrapv" 
compatible too.

But again, the problem is not technical at all. The problem is 
that too many people are convinced that silent wraparound is 
good and that nothing needs to be changed or improved. Or that 
rudimentary value range propagation (VRP) checks at compile time 
are sufficient.

