[OT] The Usual Arithmetic Confusions
Adam D Ruppe
destructionator at gmail.com
Fri Feb 4 14:12:05 UTC 2022
On Friday, 4 February 2022 at 04:29:21 UTC, Walter Bright wrote:
> No, then the VRP will emit an error.
No, because you cast it away.
Consider the old code being:
---
struct Thing {
    short a;
}
// somewhere very different
Thing calculate(int a, int b) {
    return Thing(a + b);
}
---
The current rules would require that you put an explicit cast to
`short` in that constructor call. Then, later, the field in Thing
gets refactored into `int`. It will still compile, with the explicit
cast still there, now chopping off bits.
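
A minimal sketch of that scenario after the refactor, assuming the cast was written as `cast(short)` and the field has since become `int`. The input value 40_000 is only an illustration and does not come from the original code:

```d
struct Thing {
    int a; // was `short a;` before the refactor
}

// somewhere very different
Thing calculate(int a, int b) {
    // This cast was required while the field was `short`.
    // Nothing forces anyone to remove it now that the field is `int`.
    return Thing(cast(short)(a + b));
}

void main() {
    auto t = calculate(40_000, 0);
    // 40_000 fits easily in an int, but the stale cast wraps it to 16 bits.
    assert(t.a == -25_536);
}
```

The code compiles without any diagnostic, because the explicit cast tells the compiler the truncation is intended.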
The problem with anything requiring explicit casts is once
they're written, they rarely get unwritten. I tell new users that
`cast` is a code smell - sometimes you need it, but it is
usually an indication that you're doing something wrong.
But then you do:
    short a;
    short b = a + 1;
And suddenly the language requires one.
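
As a sketch of what current D does with this snippet: `a + 1` is promoted to `int`, the implicit narrowing back to `short` is rejected, and the cast becomes mandatory. The helper name `inc` is hypothetical and only there for illustration:

```d
// Hypothetical helper: the cast is what the language currently demands.
short inc(short a) {
    return cast(short)(a + 1);
}

void main() {
    short a;
    // Integer promotion: short + int literal yields int.
    static assert(is(typeof(a + 1) == int));
    // Without a cast, the assignment does not compile.
    static assert(!__traits(compiles, { short x; short y = x + 1; }));
    // With the cast, the carry is silently dropped (two's-complement wrap).
    assert(inc(short.max) == short.min);
}
```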
Yes, I know, there's a carry bit that might get truncated. But
when you're using all `short`, there's probably an understanding
that this is how it works. It's not really that hard - it's about
two or three sentences, as long as one understands two's-complement
arithmetic.
On the other hand, there might be loss if there's an `int` in
there in some kinds of generic code.
I think a reasonable compromise would be to allow implicit
conversions down to the biggest type of the input. The VRP can
apply here on any literals present. Meaning:
    short a;
    short b = a + 1;
It checks the input:
    a = type short
    1 = VRP'd down to byte (or bool even)
Biggest type there? short. So it allows implicit conversion down
to short. Then VRP can run to further make it smaller:
    byte c = (a&0x7e) + 1; // ok, the VRP can see it still fits there,
                           // so it goes even smaller.
But since the biggest original input fits in a `short`, it allows
the output to go to `short`, even if there's a carry bit it might
lose.
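
For comparison, here is a sketch of what the VRP in current D accepts and rejects, checked with `__traits(compiles, ...)`. It shows today's behaviour only; the proposed "biggest input type" rule is not implemented anywhere, so it cannot be demonstrated directly. The function name `lowBitsPlusOne` is hypothetical:

```d
// Hypothetical helper: no cast is needed, because the VRP can prove
// that (a & 0x7e) + 1 always lies in 1 .. 127, which fits in a byte.
byte lowBitsPlusOne(short a) {
    byte c = (a & 0x7e) + 1;
    return c;
}

void main() {
    // Rejected today: the range of x + 1 exceeds short by one value.
    // Under the proposal this would be allowed.
    static assert(!__traits(compiles, { short x; short y = x + 1; }));
    // Rejected today, and still rejected under the proposal, since the
    // literals themselves do not fit in the target type.
    static assert(!__traits(compiles, { short x; ushort y = x + 65535 + 3; }));

    assert(lowBitsPlusOne(1000) == 105); // (1000 & 0x7e) == 104
}
```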
On the other hand:
    ushort b = a + 65535 + 3;
Nope, the compiler can constant fold that literal and VRP will
size it to `int` given its value, so explicit cast required there
to ensure none of the *actual* input is lost.
    short a;
    short b;
    short c = a * b;
I'd allow that. The input is a and b, they're both short, so let
the output truncate back to short implicitly too. Just like with
int, there's some understanding that yes, there is a high word
produced by the multiply, but it might not fit and I don't need
the compiler nagging me like I'm some kind of ignoramus.
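
A small sketch of the multiplication case in current D, with illustrative values that are not from the original post. The helper `mulTruncated` is hypothetical; it spells out the cast the language requires today, which the proposal would make implicit:

```d
// Hypothetical helper: keeps only the low 16 bits of the product.
short mulTruncated(short a, short b) {
    return cast(short)(a * b);
}

void main() {
    short a = 300, b = 300;
    // Both operands are promoted, so the product is computed as int.
    static assert(is(typeof(a * b) == int));
    // Assigning it back to short without a cast is rejected today.
    static assert(!__traits(compiles, { short x, y; short z = x * y; }));

    assert(a * b == 90_000);              // full product in int
    assert(mulTruncated(a, b) == 24_464); // 90_000 - 65_536: high word dropped
}
```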
I think this compromise would balance the legitimate safety
concerns about accidental loss or refactoring changing things (if
you refactor to ints, the input type grows and the compiler
can issue an error again) against the annoyance of casts almost
everywhere.
And by removing most of the casts, it makes the ones that remain
stand out more as the potential problems they are.