Integer conversions too pedantic in 64-bit

Wed Feb 16 20:24:52 PST 2011

On 17.02.2011 05:19, Kevin Bealer wrote:
> == Quote from spir (denis.spir at gmail.com)'s article
>> On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
>>> On Tuesday, February 15, 2011 15:13:33 spir wrote:
>>>> On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
>>>>> Is there some low level reason why size_t should be signed or something
>>>>> I'm completely missing?
>>>>
>>>> My personal issue with unsigned ints in general as implemented in C-like
>>>> languages is that the range of non-negative signed integers is half of the
>>>> range of corresponding unsigned integers (for same size).
>>>> * practically: known issues, and bugs if not checked by the language
>>>> * conceptually: contradicts the "obvious" idea that unsigned (aka naturals)
>>>> is a subset of signed (aka integers)
>>>
>>> It's inevitable in any systems language. What are you going to do, throw away a
>>> bit for unsigned integers? That's not acceptable for a systems language. On some
>>> level, you must live with the fact that you're running code on a specific machine
>>> with a specific set of constraints. Trying to do otherwise will pretty much
>>> always harm efficiency. True, there are common bugs that might be better
>>> prevented, but part of it ultimately comes down to the programmer having some
>>> clue as to what they're doing. On some level, we want to prevent common bugs,
>>> but the programmer can't have their hand held all the time either.
>> I cannot prove it, but I really think you're wrong on that.
>> First, the question of 1 bit. Think about this -- speaking of 64-bit sizes:
>> * 99.999% of all uses of unsigned fit under 2^63
>> * To benefit from the last bit, you must have the need to store a value 2^63 <=
>> v < 2^64
>> * Not only this, you must hit a case where /any/ possible value for v
>> (depending on execution data) could be >= 2^63, but /all/ possible values for v
>> are guaranteed < 2^64
>> This can only be a very small fraction of the cases where your value does not
>> fit in 63 bits, don't you think? Has it ever happened to you (even in 32 bits)?
>> Something like: "What luck! This value would not (always) fit in 31 bits, but
>> (due to this constraint) I can be sure it will fit in 32 bits (always,
>> whatever input data it depends on)."
>> In fact, n bits do the job because (1) nearly all unsigned values are very
>> small (2) the size used at a time covers the memory range at the same time.
>> As for efficiency, if unsigned is not a subset of signed, then at a low level you
>> may be forced to add checks in numerous utility routines, the kind constantly
>> used, everywhere one type may mix with the other. I'm not sure where the gain is.
>> As for correctness, intuitively I guess (just a wild guess indeed) that if unsigned
>> values form a subset of signed ones, programmers will more easily reason
>> correctly about them.
>> Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-)
>> (*)
>> Denis
>> (*) But you know, when as a young guy you have coded for 8 & 16-bit machines,
>> having 63 or 64...
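
A minimal sketch, in C, of the kind of "known issue" with mixing signed and
unsigned that is mentioned above: in a mixed comparison the signed operand is
converted to unsigned first. The function names are made up for illustration,
and the cast is written out only to make the implicit conversion visible.

```c
#include <stdbool.h>

/* What a mixed comparison `a < b` effectively does in C: the signed
   operand is converted to unsigned first, so -1 becomes UINT_MAX. */
bool naive_less(int a, unsigned b)
{
    return (unsigned)a < b;
}

/* A checked version: any negative value is below any unsigned value. */
bool checked_less(int a, unsigned b)
{
    return a < 0 || (unsigned)a < b;
}
```

With these, naive_less(-1, 1u) is false while checked_less(-1, 1u) is true.
The extra test in checked_less is the sort of check in utility routines that
the efficiency argument above refers to.
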
> 
> If you write low level code, it happens all the time.  For example, you can copy
> memory areas quickly on some machines by treating them as arrays of "long" and
> copying the values -- which requires the upper bit to be preserved.
> 
> Or you compute a 64 bit hash value using an algorithm that is part of some
> standard protocol.  Oops -- requires an unsigned 64 bit number, the signed version
> would produce the wrong result.  And since the standard expects normal behaving
> int64's you are stuck -- you'd have to write a little class to simulate unsigned
> 64 bit math.  E.g. a library that computes md5 sums.
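
To illustrate the hash case with a sketch in C: this is 64-bit FNV-1a, which I
picked only because it is short (it is not md5, and not necessarily what any
particular protocol uses). It relies on unsigned arithmetic wrapping modulo
2^64, and its results regularly have the top bit set.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over a byte buffer. */
uint64_t fnv1a_64(const unsigned char *data, size_t len)
{
    uint64_t hash = UINT64_C(0xcbf29ce484222325); /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];                    /* mix in one byte */
        hash *= UINT64_C(0x100000001b3);    /* FNV prime; wraps mod 2^64 */
    }
    return hash;
}
```

Hashing the single byte 'a' gives 0xaf63dc4c8601ec8c, which is above INT64_MAX,
so the value does not fit in 63 bits.
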
> 
> Not to mention all the code that uses 64 bit numbers as bit fields where the
> different bits or sets of bits are really subfields of the total range of values.
> 
> What you are saying is true of high level code that models real life -- if the
> value is someone's salary or the number of toasters they are buying from a store
> you are probably fine -- but a lot of low level software (ipv4 stacks, video
> encoders, databases, etc) is based on designs that require numbers to behave a
> certain way, and losing a bit is going to be a pain.
> 
> I've run into this with Java, which lacks unsigned types, and once you run into a
> case that needs that extra bit it gets annoying right quick.
> 
> Kevin

The proposal was not to alter ulong (64-bit), but only a size_t equivalent. ;)
And I agree that not having unsigned types (like in Java) just sucks.
Wasn't Java even advertised as a programming language for network stuff? Quite
ridiculous without unsigned types...
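
A small sketch in C of why network code wants unsigned types: an IPv4 address
packed into 32 bits. The helper name pack_ipv4 is made up for this example.

```c
#include <stdint.h>

/* Pack four octets into one 32-bit value, most significant octet first. */
uint32_t pack_ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16)
         | ((uint32_t)c << 8)  |  (uint32_t)d;
}
```

pack_ipv4(192, 168, 0, 1) is 0xC0A80001, i.e. 3232235521, which is larger than
INT32_MAX (2147483647). Every address with a first octet of 128 or more needs
that top bit.
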

Cheers,
- Daniel