Treating the abusive unsigned syndrome

Thu Nov 27 09:15:21 PST 2008

Denis Koroskin wrote:
> On 27.11.08 at 03:46, Sean Kelly wrote:
> 
>> Andrei Alexandrescu wrote:
>>> Sean Kelly wrote:
>>>> Don wrote:
>>>>>
>>>>> Although it would be nice to have a type which was range-limited, 
>>>>> 'uint' doesn't do it. Instead, it guarantees the number is between 
>>>>> 0 and int.max*2+1 inclusive. Allowing mixed operations encourages 
>>>>> programmers to focus on the benefit of 'the lower bound is zero!' 
>>>>> while forgetting that there is an enormous downside ('I'm saying 
>>>>> that this could be larger than int.max!')
>>>>
>>>> This inspired me to think about where I use uint and I realized that 
>>>> I don't.  I use size_t for size/length representations (largely 
>>>> because sizes can theoretically be >2GB on a 32-bit system), and 
>>>> ubyte for bit-level stuff, but that's it.
>>>
>>> For the record, I use unsigned types wherever there's a non-negative 
>>> number involved (e.g. a count). So I'd be helped by better unsigned 
>>> operations.
>>
>> To be fair, I generally use unsigned numbers for values that are 
>> logically always positive.  These just tend to be sizes and counts in 
>> my code.
>>
>>> I wonder how often these super-large arrays do occur on 32-bit 
>>> systems. I do have programs that try to allocate as large a 
>>> contiguous matrix as possible, but never sat down and tested whether 
>>> a >2GB chunk was allocated on the Linux cluster I work on. I'm quite 
>>> annoyed by this >2GB issue because it's a very practical and very 
>>> rare issue in a weird contrast with a very principled issue (modeling 
>>> natural numbers).
>>
>> Yeah, I have no idea how common they are, though my guess would be 
>> that they are rather uncommon.  As a library programmer, I simply must 
>> assume that they are in use, which is why I use size_t as a matter of 
>> course.
> 
> If they can be more than 2GB, why can't they be more than 4GB? It is
> dangerous to assume that they won't be; that's why uint is dangerous. You
> trade safety for one additional bit of information, and that is wrong.

Bigger than 4GB on a 32-bit system?  Files perhaps, but I'm talking 
about memory ranges here.

> Soon enough we won't use uints, the same way we don't use ushorts (I
> should have asked first whether anyone uses ushort these days, but there
> is so little gain in using ushort as opposed to short or int that I
> consider it impractical). The 64-bit era will give us 64-bit pointers and
> 64-bit counters. Do you think you will prefer ulong over long for an
> additional bit? You really shouldn't.

long vs. ulong for sizes is less of an issue, because we're a long way
away from running up against the limitations of a 63-bit size value.  The
point of size_t to me, however, is that it scales automatically, so if I
write array operations using size_t then I can be sure they will work on
both 32-bit and 64-bit systems.
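
As an illustration of the size_t point (not part of the original post), here is a minimal sketch in C rather than D. The function names sum_ints and last_index_of are hypothetical. The same source compiles unchanged for 32-bit and 64-bit targets because size_t follows the platform's object-size limit. The backward loop also shows one unsigned pitfall: a condition like "i >= 0" can never be false for a size_t, so the test has to happen before the decrement.

```c
#include <stddef.h>

/* Forward pass: size_t can hold the size of any object on the target,
   so the index type scales with the platform. */
long sum_ints(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i)
        total += a[i];
    return total;
}

/* Backward pass: "i >= 0" is always true for an unsigned i, so compare
   first and decrement afterwards. Returns n when value is absent. */
size_t last_index_of(const int *a, size_t n, int value)
{
    for (size_t i = n; i-- > 0; )
        if (a[i] == value)
            return i;
    return n;
}
```

Returning n as the "not found" marker keeps the result unsigned and avoids a -1 sentinel, which would reintroduce the signed/unsigned mixing under discussion.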

I do like Don's point about unsigned really meaning "unsigned" however, 
rather than "positive."  I clearly use unsigned numbers for both, even 
if I flag the "positive" uses via a type alias such as size_t.  In C/C++ I 
rely on compiler warnings to trap the sort of mistakes we're talking 
about here, but I'd love a more logically sound solution if one could be 
found.
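
For readers who want to see the kind of mistake meant here, the following is a small sketch in C (not from the original post; naive_less and safe_less are hypothetical names). In a mixed comparison between int and unsigned, C's usual arithmetic conversions turn the int into an unsigned value first, so a negative number compares as a very large one. GCC and Clang can report this with -Wsign-compare.

```c
#include <stdbool.h>

/* Mixed comparison: i is implicitly converted to unsigned before the
   comparison, so a negative i wraps to a huge value. */
bool naive_less(int i, unsigned u)
{
    return i < u;
}

/* Sign-aware comparison: any negative i is below every unsigned value;
   otherwise the cast to unsigned is lossless. */
bool safe_less(int i, unsigned u)
{
    return i < 0 || (unsigned)i < u;
}
```

With these definitions, naive_less(-1, 1u) is false while safe_less(-1, 1u) is true; the two agree whenever i is non-negative.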

Sean