Integer conversions too pedantic in 64-bit
Rainer Schuetze
r.sagitario at gmx.de
Tue Feb 15 11:15:06 PST 2011
I think David has raised a good point here that seems to have been lost
in the discussion about naming.
Please note that in C, the machine-word integer was usually just
"int". The C standard only specifies a minimum bit-size for the
different types (see for example
http://www.ericgiguere.com/articles/ansi-c-summary.html). Most current
C++ implementations have an identical "int" size, but "long" now
differs between them. This approach has failed and has caused many
headaches when porting software from one platform to another. D has
recognized this and has explicitly defined the bit-size of the various
integer types. That's good!
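For illustration, these are the sizes D guarantees (nothing new here,
just the language definition spelled out):

    // D's integer types have a fixed size on every platform.
    static assert(short.sizeof == 2);
    static assert(int.sizeof   == 4);
    static assert(long.sizeof  == 8);
    // Only size_t (and ptrdiff_t) follow the platform: they match the
    // pointer size, 4 bytes on 32-bit targets, 8 on 64-bit targets.
    static assert(size_t.sizeof == (void*).sizeof);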
Now, with size_t the distinction between platforms creeps back into the
language. It is everywhere across phobos, be it as the length of ranges
or the size of containers. This can become viral, as everything that
comes into contact with these values may have to stick to size_t as
well. Is this really desired?
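A rough sketch of what I mean by viral (the function names are made up,
and the exact error message depends on the compiler version):

    void process(int index) { /* placeholder */ }

    void scan(int[] arr)
    {
        // arr.length is size_t: 32 bits on a 32-bit target, 64 bits
        // on a 64-bit target.  There the narrowing below is rejected:
        //
        //     int n = arr.length;   // error: cannot implicitly
        //                           // convert ulong to int
        //
        // So either a cast appears at every use ...
        int n = cast(int)arr.length;
        process(n);

        // ... or size_t spreads into every variable that touches a
        // length.
        for (size_t i = 0; i < arr.length; i++)
            process(cast(int)i);
    }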
Consider saving an array to disk and reading it back on another
platform. How many bits should be written for the size of that array?
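A serializer has to pick a fixed on-disk width instead of writing
size_t directly; a sketch (the helper names and the layout are just
made up for the example):

    import std.stdio;

    // Write the length with a fixed 64-bit width, regardless of the
    // platform's size_t, so that 32-bit and 64-bit builds agree on
    // the file format.
    void writeArray(File f, const(int)[] arr)
    {
        ulong len = arr.length;        // size_t -> ulong always widens
        f.rawWrite((&len)[0 .. 1]);
        if (len)
            f.rawWrite(arr);
    }

    int[] readArray(File f)
    {
        ulong[1] lenBuf;
        f.rawRead(lenBuf[]);
        // On a 32-bit reader the stored length may not fit into
        // size_t; that case has to be checked explicitly.
        auto arr = new int[cast(size_t)lenBuf[0]];
        if (arr.length)
            f.rawRead(arr);
        return arr;
    }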
Consider a range that maps the contents of a file. The file can be
larger than 4 GB, yet many of the ranges that wrap the file-mapping
range will truncate the length to 32 bits on 32-bit platforms.
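Something like this hypothetical wrapper shows where the truncation
sneaks in:

    // A hypothetical range over the bytes of a mapped file.  The real
    // file size is a 64-bit quantity even on a 32-bit platform ...
    struct FileByteRange
    {
        ulong size;   // length of the underlying file
        ulong pos;

        @property bool empty() const { return pos >= size; }
        @property ubyte front() const { return 0; /* would read the byte at pos */ }
        void popFront() { ++pos; }

        // ... but a range length is expected to be size_t, so a file
        // beyond 4 GB silently loses its upper bits on a 32-bit build.
        @property size_t length() const { return cast(size_t)(size - pos); }
    }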
I don't have a perfect solution, but maybe builtin arrays could be
limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of the endless
signed/unsigned conversions), so that the normal type to use is still
"int". Ranges should adopt the type sizes of the underlying objects.
Agreed, a type for the machine-word integer must exist, and I don't
care what it is called, but I would like to see its usage restricted to
rare cases.
Rainer
dsimcha wrote:
> Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
> of code to compile in 64-bit mode. Frankly, the compiler is way too freakin'
> pedantic when it comes to implicit conversions (or lack thereof) of
> array.length. 99.999% of the time it's safe to assume an array is not going
> to be over 4 billion elements long. I'd rather have a bug the 0.001% of the
> time than deal with the pedantic errors the rest of the time, because I think
> it would be less total time and effort invested. To force me to either put
> casts in my code everywhere or change my entire codebase to use wider integers
> (with ripple effects just about everywhere) strikes me as purity winning out
> over practicality.