Casts and some suggestions to avoid them

Tue Apr 8 13:33:04 PDT 2014

On Tue, Apr 08, 2014 at 06:38:46PM +0000, bearophile wrote:
[...]
> I've done a little statistics on about 208 casts in code I have written.
[...]
> Of those casts about 73 casts are conversions from a floating point
> value to integral value, like:
> cast(uint)(x * 1.75)
> cast(int)sqrt(real(ns))
> 
> In some cases you can use the to! template instead of cast.

Which cases don't work? My impression is that to! should be preferred to
casts in this case, because it checks the value range at runtime and
throws a ConvOverflowException if, say, the float exceeds the range of
int. A cast performs no such check and silently yields a meaningless
value on overflow, leading to hard-to-find bugs.
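
To make the difference concrete, here is a minimal sketch. The helper
name fitsInInt is made up for illustration; only std.conv.to and
ConvOverflowException come from Phobos.

```d
import std.conv : to, ConvOverflowException;

/// Hypothetical helper: true if x converts to int without overflow.
bool fitsInInt(double x)
{
    try
    {
        cast(void) x.to!int;   // throws if x is outside int's range
        return true;
    }
    catch (ConvOverflowException)
    {
        return false;
    }
}

void main()
{
    assert((3.9).to!int == 3);   // in range: truncates, just like a cast
    assert(fitsInInt(3.9));
    assert(!fitsInInt(1e10));    // cast(int) 1e10 would report nothing
}
```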

[...]
> About 20 casts are for the return type of malloc/calloc/realloc/alloca,
> like:
> cast(ubyte*)alloca(ubyte.sizeof * x);
> cast(T*)malloc(typeof(T).sizeof * 10);
> 
> A set of 3 little wrappers around those functions in Phobos can remove
> those casts (this can't be done with alloca), they are safer than
> using the raw C functions:
> cMalloc!T(n)
> cCalloc!T(n)
> cRealloc(ptr, n)

Hopefully this will be addressed when Andrei finalizes his allocators.
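
In the meantime, a wrapper along the lines of the proposed cMalloc!T(n)
is easy to sketch. This is only my guess at what it could look like, not
an existing Phobos function; returning a slice rather than a bare
pointer is my own assumption.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical wrapper: the single cast lives here, and the returned
/// slice carries its own length. Returns null on failure or overflow.
T[] cMalloc(T)(size_t n)
{
    // Guard against T.sizeof * n overflowing size_t.
    if (n > size_t.max / T.sizeof)
        return null;
    auto p = cast(T*) malloc(T.sizeof * n);
    return p is null ? null : p[0 .. n];
}

void main()
{
    int[] a = cMalloc!int(10);
    assert(a.length == 10);
    a[] = 0;                 // memory from malloc is uninitialized
    free(a.ptr);
}
```

The caller still has to free() the memory by hand, so this only removes
the cast, not the manual memory management.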

[...]
> About 14 are reinterpret casts, sometimes to see a uint as a sequence
> of ubytes, array casts, etc:
> cast(ubyte*)&x;
> cast(ubyte[4]*)&data;
> cast(uint[])text.to!(dchar[])
> cast(ubyte[3])[x % 256, y % 256, x % 256]

Reinterpret casts are probably irreplaceable, because often they are
used when you want to directly access the raw representation of some
piece of data (e.g., to transmit a struct over the network, or serialize
it to file, etc.). D does give some useful tools to do this with minimal
risks (e.g., .sizeof), but still, this kind of cast is inherently
dangerous and prone to breakage when you redefine your types.
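
As an illustration of the .sizeof point, here is a small sketch. The
Header struct and the rawBytes helper are invented for the example.

```d
struct Header
{
    ushort kind;
    ushort flags;
    uint length;
}

/// Hypothetical helper: view a value's raw bytes. Using T.sizeof means
/// the slice length follows the type if its fields change.
const(ubyte)[] rawBytes(T)(ref const T value)
{
    return (cast(const(ubyte)*) &value)[0 .. T.sizeof];
}

void main()
{
    auto h = Header(1, 0, 42);
    auto bytes = rawBytes(h);
    assert(bytes.length == Header.sizeof);
}
```

The byte order and padding of what comes out still depend on the
machine and the struct layout, which is exactly why this kind of cast
stays dangerous.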

[...]
> About 6 casts are used to convert an array of enums to an array of the
> underlying type, like:
> 
> enum C : char { A='a', B='b' }
> C[50] arr;
> cast(char[])arr
> 
> Keeping 'arr' as an array of C is handy for safety or for other
> reasons, but perhaps you need to print arr compactly or you need the
> char[] for other reasons.
> 
> I think you can't use to! in this case.

I think to! can probably be extended to perform this conversion.
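
Until then, the cast can at least be confined to one helper. A rough
sketch, where toBase is a made-up name and std.traits.OriginalType
supplies the underlying type:

```d
import std.traits : OriginalType;

enum C : char { A = 'a', B = 'b' }

/// Hypothetical helper: reinterpret an array of enum values as an
/// array of the enum's base type. No copy is made.
OriginalType!E[] toBase(E)(E[] arr)
    if (is(E == enum))
{
    return cast(OriginalType!E[]) arr;
}

void main()
{
    C[3] arr = [C.A, C.B, C.A];
    char[] s = toBase(arr[]);
    assert(s == "aba");
}
```

Note that the result aliases the original array, so writing arbitrary
chars through it can put invalid values into the C array.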

> About 5 casts are used to convert the result of std.file.read to a
> usable array type (because in some cases readText is not the right
> function to use), like:
> cast(char[])"data1.txt".read
> cast(ubyte[])"data2.txt".read
> 
> The cast can be avoided with a similar function that accepts a template
> type (there are perhaps ways to do this with already present Phobos
> functions, suggestions are welcome):
> read!(char[])("data1.txt")

Agreed.
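
Something like the following would do, I think. The name readAs is
invented here (to avoid clashing with std.file.read); it is not an
existing Phobos function.

```d
import std.file : read;

/// Hypothetical typed wrapper: std.file.read returns void[], so the
/// one unavoidable cast is kept in this function.
T readAs(T)(string path)
    if (is(T : const(void)[]))
{
    return cast(T) read(path);
}

void main()
{
    import std.file : write, remove, tempDir;
    import std.path : buildPath;

    auto path = buildPath(tempDir, "readAs_example.txt");
    write(path, "hello");
    scope(exit) remove(path);

    assert(readAs!(char[])(path) == "hello");
    assert(readAs!(ubyte[])(path).length == 5);
}
```

Unlike readText, this does no UTF validation on the char[] result.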

[...]
> About 4 casts are used by hex strings, like:
> 
> ubyte[] data = cast(ubyte[])x"00 11 22 33 AB";
> 
> I think hex strings should be implicitly castable to ubyte[], avoiding
> the need for a cast, or if you don't like implicit casts then I think
> they should be of type ubyte[], because in about 100% of the cases I
> don't want a char[].

Agreed, I can't think of any common use case where you'd want a hex
string to be char[] instead of ubyte[]. The only case I can think of
(which is not common at all) is when you want to explicitly construct
test cases for UTF strings with specific code point sequences (e.g.,
invalid sequences to test UTF error-catching code).

[...]
> In about 4 cases I have used a cast to take part of a number, like
> taking the lower 32 bits of a ulong, and so on.
> 
> In some cases you can remove such casts using a union (like a union of
> one ulong and a uint[2]).

Using a union here is not a good idea, because the results depend on the
endianness of the machine! For the ulong example, it's better to just
use (a & 0xFFFF_FFFF) for the lower 32 bits or (a >> 32) for the upper
32 bits instead.
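
For the lower-32-bits-of-a-ulong case from the quoted text, a mask and
shift could look like the sketch below (lowHalf and highHalf are names I
made up). I believe the compiler's value range propagation accepts these
without a cast, since the masked result provably fits in a uint; if
yours complains, wrapping the expression in cast(uint) is harmless here.

```d
/// Lower 32 bits of a 64-bit value, independent of byte order.
uint lowHalf(ulong a)
{
    return a & 0xFFFF_FFFF;
}

/// Upper 32 bits of a 64-bit value, independent of byte order.
uint highHalf(ulong a)
{
    return (a >> 32) & 0xFFFF_FFFF;
}

void main()
{
    ulong x = 0x1122_3344_5566_7788;
    assert(lowHalf(x) == 0x5566_7788);
    assert(highHalf(x) == 0x1122_3344);
}
```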

[...]
> In 2 cases I have had to cast to convert an array length to type uint
> to allow the code to compile on both a 32 and 64 bit system, to assign
> such length to some uint value.

This is inherently unsafe, since it risks silent truncation of the
length of a very large array. On a 32-bit machine nothing can be lost,
since .length is already 32 bits there, but on a 64-bit machine any
length above uint.max would be truncated. I think a cast is justified
here, as a warning sign that the code may have fragile behaviour.
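
If you'd rather have the truncation reported than merely flagged, the
to! approach mentioned earlier works here too. A sketch, with
checkedLength being a made-up name:

```d
import std.conv : to;

/// Hypothetical helper: array length as uint. On a 64-bit target, to!
/// throws ConvOverflowException if the length exceeds uint.max; on a
/// 32-bit target the conversion is a no-op.
uint checkedLength(T)(const T[] arr)
{
    return arr.length.to!uint;
}

void main()
{
    int[] a = [1, 2, 3];
    uint n = checkedLength(a);
    assert(n == 3);
}
```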

[...]
> In 1 case I've had to use a dynamic cast on class instances. In theory
> in Phobos you can add specialized upcasts, downcasts, etc, that are
> more explicit and safer.

In OO, explicit downcasting is usually frowned upon as a sign of bad
design (due to the Liskov Substitution Principle). Nevertheless, AFAIK,
downcasting in D is actually safe:

	BaseClass b;
	auto d = cast(DerivedClass) b;
	if (d is null)
	{
		// b was not an instance of DerivedClass
	}
	else
	{
		// d is safe to use
	}

So I don't think this case counts. The cast operator was explicitly
designed to handle this case (among other cases).
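
For completeness, here is a self-contained version of the same idea that
you can compile and run. The class names and valueOrDefault are invented
for the example.

```d
class Base {}
class Derived : Base { int value = 7; }
class Other : Base {}

/// Returns d.value if b is a Derived, otherwise -1.
int valueOrDefault(Base b)
{
    // The cast yields null when b is not a Derived (or is null itself),
    // so the declaration doubles as the check.
    if (auto d = cast(Derived) b)
        return d.value;
    return -1;
}

void main()
{
    assert(valueOrDefault(new Derived) == 7);
    assert(valueOrDefault(new Other) == -1);
    assert(valueOrDefault(null) == -1);
}
```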

T

-- 
If creativity is stifled by rigid discipline, then it is not true creativity.