Integer conversions too pedantic in 64-bit

Don nospam at nospam.com
Wed Feb 16 03:21:06 PST 2011


spir wrote:
> On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
>> On Tuesday, February 15, 2011 15:13:33 spir wrote:
>>> On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
>>>> Is there some low level reason why size_t should be signed or something
>>>> I'm completely missing?
>>>
>>> My personal issue with unsigned ints in general, as implemented in
>>> C-like languages, is that the range of non-negative signed integers
>>> is half the range of the corresponding unsigned integers (of the
>>> same size).
>>> * practically: known issues, and bugs if not checked by the language
>>> * conceptually: it contradicts the "obvious" idea that the unsigned
>>> integers (the naturals) are a subset of the signed ones (the integers)
>>
>> It's inevitable in any systems language. What are you going to do,
>> throw away a bit for unsigned integers? That's not acceptable for a
>> systems language. On some level, you must live with the fact that
>> you're running code on a specific machine with a specific set of
>> constraints. Trying to do otherwise will pretty much always harm
>> efficiency. True, there are common bugs that might be better
>> prevented, but part of it ultimately comes down to the programmer
>> having some clue as to what they're doing. On some level, we want to
>> prevent common bugs, but the programmer can't have their hand held
>> all the time either.
> 
> I cannot prove it, but I really think you're wrong on that.
> 
> First, the question of 1 bit. Consider this -- speaking of 64-bit sizes:
> * 99.999% of all uses of unsigned fit under 2^63.
> * To benefit from the last bit, you must need to store a value
> 2^63 <= v < 2^64.
> * Not only that: you must hit a case where /any/ possible value of v
> (depending on execution data) could be >= 2^63, yet /all/ possible
> values of v are guaranteed < 2^64.
> This can only be a very small fraction of the cases where your value
> does not fit in 63 bits, don't you think? Has it ever happened to you
> (even in 32 bits)? Something like: "what luck! this value would not
> (always) fit in 31 bits, but (due to this constraint) I can be sure it
> will always fit in 32 bits, whatever input data it depends on."
> In fact, n bits do the job because (1) nearly all unsigned values are
> very small, and (2) the word size in use covers the machine's address
> range at the same time.
> 
> On efficiency: if unsigned is not a subset of signed, then at a low
> level you may be forced to add checks in numerous utility routines --
> the kind constantly used, everywhere one type may mix with the other.
> I'm not sure where the gain is.
> On correctness: intuitively I guess (just a wild guess, indeed) that
> if unsigned values form a subset of signed ones, programmers will more
> easily reason correctly about them.
> 
> Now, I perfectly understand that the "sacrifice" of one bit sounds
> like sacrilege ;-)
> (*)
> 
> Denis
> 

> (*) But you know, when as a young guy you have coded for 8 & 16-bit 
> machines, having 63 or 64...

Exactly. It is NOT the same as the 8- and 16-bit case. The thing is, the 
fraction of cases where the MSB is important has been decreasing 
*exponentially* since the 8-bit days. Back then it really was necessary 
to use the entire address space (or even more, in the case of the 
segmented architecture of the 286! [1]) to measure the size of anything. 
D only supports 32-bit and higher, so it isn't hamstrung in the way that 
C is.

Yes, there are still cases where you need every bit. But they are very, 
very exceptional -- rare enough that I think the type could be called 
__uint, __ulong.

[1] What was size_t on the 286? Note that in the small memory model (all 
pointers 16 bits) it really was possible to have an object of size 
0xFFFF, because the code was in a different address space.


More information about the Digitalmars-d mailing list