Integer conversions too pedantic in 64-bit

Tue Feb 15 13:40:11 PST 2011

On 15.02.2011 20:15, Rainer Schuetze wrote:
> 
> I think David has raised a good point here that seems to have been lost in the
> discussion about naming.
> 
> Please note that the C name of the machine word integer was usually called
> "int". The C standard only specifies a minimum bit-size for the different types
> (see for example http://www.ericgiguere.com/articles/ansi-c-summary.html). Most
> of current C++ implementations have identical "int" sizes, but now "long" is
> different. This approach has failed and has caused many headaches when porting
> software from one platform to another. D has recognized this and has explicitly
> defined the bit-size of the various integer types. That's good!
> 
> Now, with size_t the distinction between platforms creeps back into the
> language. It is everywhere across phobos, be it as length of ranges or size of
> containers. This can get viral, as everything that gets in touch with these
> values might have to stick to size_t. Is this really desired?
> 
> Consider saving an array to disk, trying to read it on another platform. How
> many bits should be written for the size of that array?
> 

This can indeed be a problem, and one that actually exists in Phobos:
std.stream's OutputStream has a write(char[]) method (and similar methods for
wchar and dchar) that does exactly this: it writes a size_t first and then the
data. In many places they used uint instead of size_t, but in the one method
where this is a bad idea they used size_t ;-) (see also
http://d.puremagic.com/issues/show_bug.cgi?id=5001 )

In general I think that you just have to define how you serialize data to
disk/net/whatever (which endianness, which exact types) and you won't have
problems. Just dumping the data to disk isn't portable anyway.
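
To illustrate what "define the format" could look like, here is a minimal
sketch. It assumes a made-up format: a 64-bit little-endian length prefix
followed by the raw bytes. The helpers serialize and deserialize are
hypothetical names, not part of Phobos, and this is not how std.stream does it.

```d
// Hypothetical helpers for illustration only; not part of Phobos.
// Assumed on-disk format: 8-byte little-endian length, then the raw bytes.

ubyte[] serialize(const(char)[] s)
{
    ubyte[] buf;
    ulong len = s.length;                       // widen size_t to a fixed 64 bits
    foreach (i; 0 .. 8)
        buf ~= cast(ubyte)(len >> (8 * i));     // least significant byte first
    buf ~= cast(const(ubyte)[]) s;              // payload follows the prefix
    return buf;
}

char[] deserialize(const(ubyte)[] buf)
{
    if (buf.length < 8)
        throw new Exception("truncated length prefix");

    ulong len = 0;
    foreach (i; 0 .. 8)
        len |= cast(ulong) buf[i] << (8 * i);   // rebuild the 64-bit length

    // On a 32-bit target the stored length may not fit into size_t,
    // and in any case it must not exceed the bytes actually present.
    if (len > buf.length - 8)
        throw new Exception("length prefix exceeds available data");

    auto n = cast(size_t) len;
    return cast(char[]) buf[8 .. 8 + n].dup;
}
```

With this, the file layout is the same whether size_t is 32 or 64 bits wide;
the only platform-dependent step is the explicit check when reading back.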

> Consider a range that maps the contents of a file. The file can be larger than
> 4GB, though a lot of the ranges that wrap the file mapping range will truncate
> the length to 32 bit on 32-bit platforms.
> 
> I don't have a perfect solution, but maybe builtin arrays could be limited to
> 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless signed/unsigned
> conversions), so the normal type to be used is still "int". Ranges should adopt
> the type sizes of the underlying objects.
> 
> Agreed, a type for the machine word integer must exist, and I don't care how it
> is called, but I would like to see its usage restricted to rare cases.
> 
> Rainer
> 

Cheers,
- Daniel