A Value Range Propagation usage example, and more
bearophile via Digitalmars-d
digitalmars-d at puremagic.com
Wed Oct 8 05:17:35 PDT 2014
This is the first part of a function to convert to base 58 (some
letters are missing, like the upper case "I") used in the Bitcoin
protocol:
alias Address = ubyte[1 + 4 + RIPEMD160_digest_len];
char[] toBase58(ref Address a) pure nothrow @safe {
static immutable symbols = "123456789" ~
"ABCDEFGHJKLMNPQRSTUVWXYZ" ~
"abcdefghijkmnopqrstuvwxyz";
static assert(symbols.length == 58);
auto result = new typeof(return)(34);
foreach_reverse (ref ri; result) {
uint c = 0;
foreach (ref ai; a) {
c = c * 256 + ai;
ai = cast(ubyte)(c / symbols.length);
c %= symbols.length;
}
ri = symbols[c];
}
...
}
The D type system isn't smart enough to see that "ai" is always
fitting in an ubyte, so I have had to use a cast(ubyte). But
casts are dangerous and their usage should be minimized, and
to!ubyte is slow and makes the function not nothrow. So I've
rewritten the code like this with a bit of algebraic rewriting:
char[] toBase58(ref Address a) pure nothrow @safe {
static immutable symbols = "123456789" ~
"ABCDEFGHJKLMNPQRSTUVWXYZ" ~
"abcdefghijkmnopqrstuvwxyz";
static assert(symbols.length == 58);
auto result = new typeof(return)(34);
foreach_reverse (ref ri; result) {
uint c = 0;
foreach (ref ai; a) {
immutable d = (c % symbols.length) * 256 + ai;
ai = d / symbols.length;
c = d;
}
ri = symbols[c % symbols.length];
}
...
}
Now it can be a little slower because the integer division and
modulus has different divisors, so perhaps they can't be
implemented with a little more than a single division, as before
(I have not compared the assembly), but for the purposes of this
code the performance difference is not a problem. Now the D type
system is able to see that "ai" is always fitting in a ubyte, and
there's no need for a cast. The compiler puts a safe implicit
cast. This is awesome.
- - - - - - - - - - - - - -
But of course you often want more :-)
This is another case where the current D type system allows you
to avoid a cast:
void main() {
char['z' - 'a' + 1] arr;
foreach (immutable i, ref c; arr)
c = 'a' + i;
}
But if you want to use ranges and functional UFCS chains you
currently need the cast:
void main() {
import std.range, std.algorithm, std.array;
char[26] arr = 26
.iota
.map!(i => cast(char)('a' + i))
.array;
}
In theory this program has the same compile-time information as
the foreach case. In practice foreach is a built-in that enjoys
more semantics than a iota+map.
Currently iota(26) loses the compile-time information about the
range, so you can't do (note: the "max" attribute doesn't exists):
void main() {
import std.range: iota;
auto r = iota(26);
enum ubyte m = r.max;
}
Currently the only way to keep that compile-time information is
to use a template argument:
void main() {
import std.range: tIota;
auto r = tIota!26;
enum ubyte m = r.max; // OK
}
But even if you write such tIota range, the map!() will lose the
compile-time value range information. And even if you manage to
write a map!() able to do it with template arguments, you have
template bloat.
So there's a desire to manage the compile-time information (like
value range information) of (number) literals without causing
template bloat and without the need to use explicit template
arguments.
Bye,
bearophile
More information about the Digitalmars-d
mailing list