Casts, overflows and demonstrations
bearophile
bearophileHUGS at lycos.com
Tue Jun 5 16:12:49 PDT 2012
This is a reduced part of some D code:
import std.bigint, std.conv, std.algorithm, std.range;
void foo(BigInt number)
in {
    assert(number >= 0);
} body {
    ubyte[] digits = text(number + 1)
                     .retro()
                     .map!(c => cast(ubyte)(c - '0'))()
                     .array();
    // ...
}
void main() {}
The important line of code adds one to 'number', converts it to a
string, scans it starting from its end, and for each char (digit)
finds its value by subtracting the ASCII value of '0', then casts
the result to ubyte. Finally it converts the lazy range to an
array, a ubyte[].
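To make the data flow concrete, here is a small self-contained sketch of that line applied to one value. The helper name reversedDigitsOfSuccessor is hypothetical and only wraps the expression from foo():

```d
import std.algorithm : map;
import std.array : array;
import std.bigint : BigInt;
import std.conv : text;
import std.range : retro;

// Hypothetical helper: the "important line" of foo(), isolated.
ubyte[] reversedDigitsOfSuccessor(BigInt number) {
    return text(number + 1)                  // decimal string of number + 1
        .retro()                             // walk it from the last char
        .map!(c => cast(ubyte)(c - '0'))()   // char code -> digit value
        .array();                            // force the lazy range
}

void main() {
    // 129 + 1 = 130, read from the last digit to the first.
    ubyte[] expected = [0, 3, 1];
    assert(reversedDigitsOfSuccessor(BigInt(129)) == expected);
}
```

The least significant digit comes first in the result, which is usually what digit-by-digit arithmetic wants.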
The cast in the D code is needed because 'c' is a char. If you
subtract '0' from a char, in D the result is an int, and D doesn't
allow assigning that int to a ubyte, to avoid losing information
(I guess the compiler performs range analysis on the expression,
so it knows the result can be negative too).
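The narrowing rule can be checked at compile time. The sketch below uses a plain char variable (the function name digitValue is just for illustration) and shows one narrowing the compiler rejects and, as far as I know, one it accepts because the possible values provably fit:

```d
// Hypothetical helper used only to host the compile-time checks.
ubyte digitValue(char c) {
    // Subtracting two chars promotes both operands to int.
    static assert(is(typeof(c - '0') == int));

    // Rejected: the difference may fall outside 0 .. 255
    // (it can be negative), so no implicit narrowing to ubyte.
    static assert(!__traits(compiles, { ubyte d = c - '0'; }));

    // Accepted without a cast: masking bounds the value to 0 .. 15.
    ubyte low = c & 0x0F;
    assert(low <= 15);

    // The explicit cast always compiles, whatever c holds.
    return cast(ubyte)(c - '0');
}

void main() {
    assert(digitValue('7') == 7);
}
```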
Casts are dangerous, so it's better to avoid them where possible.
A cast looks kind of safe because you usually know what you are
doing while you write it. But when you later change other parts
of the code, the cast stays silent, and maybe it's no longer
casting from the type you think it is. Maybe that kind of bug can
be avoided by a templated function like this one, which makes both
the from and to types explicit (it doesn't compile if the from
type is wrong). This code is not fully correct, the __traits
check is not working well:
template Cast(From, To) if (__traits(compiles,
                                     cast(To)From.init)) {
    To Cast(T)(T x) if (is(T == From)) {
        return cast(To)x;
    }
}
void main() {
    int x = -100;
    ubyte y = Cast!(int, ubyte)(x);
    string s = "123";
    int y2 = Cast!(string, int)(s);
}
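One possible variation on the same idea, not the code above, is a single function template where the argument type is deduced and then compared with the declared source type. The name castFrom is hypothetical (nothing like it is in Phobos), and this is only a sketch:

```d
// Hypothetical helper: a cast that names both the source and the
// target type. T is deduced from the argument and must equal From.
To castFrom(From, To, T)(T x)
if (is(T == From) && __traits(compiles, cast(To) From.init))
{
    return cast(To) x;
}

void main() {
    int x = -100;
    ubyte y = castFrom!(int, ubyte)(x);
    assert(y == 156); // -100 wraps modulo 256

    // If x later becomes a long, the call stops compiling
    // instead of silently casting from the wrong type.
    long big = 5;
    static assert(!__traits(compiles, castFrom!(int, ubyte)(big)));

    // A pair of types with no valid cast is rejected too.
    string s = "123";
    static assert(!__traits(compiles, castFrom!(string, int)(s)));
}
```

It still loses information at run time, like any cast; it only pins down which types are involved.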
The following code is similar, but to!() performs a run-time test
that makes sure the subtraction result is representable in a
ubyte, and otherwise throws an exception:
ubyte[] digits = text(number + 1)
                 .retro()
                 .map!(c => to!ubyte(c - '0'))()
                 .array();
That code is safer than the cast, but it performs a run-time test
for each digit, which is not good.
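A minimal sketch of that run-time test on single chars. The wrapper checkedDigit is a hypothetical name; to!ubyte and ConvOverflowException come from std.conv:

```d
import std.conv : ConvOverflowException, to;
import std.exception : assertThrown;

// Hypothetical wrapper around the checked conversion.
ubyte checkedDigit(char c) {
    return to!ubyte(c - '0');
}

void main() {
    // A real digit converts fine.
    assert(checkedDigit('7') == 7);

    // '-' is below '0' in ASCII, so the difference is negative
    // and the conversion throws instead of wrapping around.
    assertThrown!ConvOverflowException(checkedDigit('-'));
}
```

Note that the test only catches values outside 0 .. 255; a char like ':' still converts (to 10), so it is an overflow check, not a digit check.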
In theory a smarter compiler (working on good enough code) is
able to do better: text() calls a BigInt method that returns the
textual representation of the value in base ten (today that
method is toString(), but maybe this situation will change and
improve). BigInt.toString() could have a post-condition like this:
string toString()
out(result) {
    size_t start = 0;
    if (this < 0) {
        assert(result[0] == '-');
        start = 1;
    }
    foreach (digit; result[start .. $])
        assert(digit >= '0' && digit <= '9');
    // If you want you can also assert that the first
    // digit is zero only if the bigint value is zero.
} body {
    // ...
}
Given that information, plus the foo pre-condition
in{assert(number >= 0);}, a smart compiler is able to infer (or
to ask the programmer to demonstrate) that text() returns an
array of just ['0',..,'9'] chars, and that retro() doesn't change
the contents of the range, so if you subtract '0' from each of
them you get a number in [0,..,9] that is always representable in
a ubyte. So no cast is needed.
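Today's compilers only check such a contract at run time, but it can at least be written down and executed. Here is a sketch on a free function, since I can't attach it to BigInt.toString() itself; decimalText is a hypothetical name, and the block uses `do`, which newer compilers prefer over `body`:

```d
import std.bigint : BigInt;
import std.conv : text;

// Hypothetical wrapper, not part of std.bigint: returns the decimal
// text of n and states what that text may contain.
string decimalText(BigInt n)
out (result) {
    size_t start = 0;
    if (n < 0) {
        assert(result[0] == '-');
        start = 1;
    }
    // Everything after the optional sign is a decimal digit.
    foreach (char digit; result[start .. $])
        assert(digit >= '0' && digit <= '9');
}
do {
    return text(n);
}

void main() {
    assert(decimalText(BigInt(1234)) == "1234");
    assert(decimalText(BigInt(-56)) == "-56");
}
```

The inference described above would amount to proving this post-condition once, statically, instead of re-checking every char each time.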
Now and then I take a look at the ongoing development and
refinement of the "Modern Eiffel" language (it's a kind of
Eiffel2, see
http://tecomp.sourceforge.net/index.php?file=doc/papers/lang/modern_eiffel.txt
), which is supposed to be able (or to become able) to perform
those inferences (or to use them if the programmer has
demonstrated them), so I think it will be able to spare both that
cast and the run-time tests on each char, avoiding overflow bugs.
According to Bertrand Meyer and others, in 20 years similar
things are going to become a part of the normal programming
experience.
Bye,
bearophile