Treating the abusive unsigned syndrome

Wed Nov 26 07:12:12 PST 2008

Don wrote:
> Andrei Alexandrescu wrote:
>> D pursues compatibility with C and C++ in the following manner: if a 
>> code snippet compiles in both C and D or C++ and D, then it should 
>> have the same semantics.
>>
>> A classic problem with C and C++ integer arithmetic is that any 
>> operation involving at least one unsigned integral automatically 
>> receives an unsigned type, regardless of how silly that actually 
>> is semantically. About the only advantage of this rule is that it's 
>> simple. IMHO it only has disadvantages from then on.
>>
>> The following operations suffer from the "abusive unsigned syndrome" 
>> (u is an unsigned integral, i is a signed integral):
>>
>> (1) u + i, i + u
>> (2) u - i, i - u
>> (3) u - u
>> (4) u * i, i * u, u / i, i / u, u % i, i % u (compatibility with C 
>> requires that these all return unsigned, ouch)
>> (5) u < i, i < u, u <= i etc. (all ordering comparisons)
>> (6) -u
> 
> I think that most of these problems are caused by C enforcing a foolish 
> consistency between literals and variables.
> The idea that literals like '0' and '1' are of type int is absurd, and 
> has caused a torrent of problems. '0' is just '0'.
> 
> uint a = 1;
> does NOT contain an 'implicit conversion from int to uint', any more 
> than there are implicit conversions from naturals to integers in 
> mathematics. So I really like the polysemous types idea.

Yah, polysemy will take care of the constants. It's also rather easy to 
implement for them.

> For example, when is it reasonable to use -u?
> It's useful with literals like
> uint a = -1u; which is equivalent to uint a = 0xFFFF_FFFF.
> Anywhere else, it's probably a bug.

Maybe not even for constants, as all uses of -u can be easily converted 
to ~u + 1. I'd gladly agree to disallow -u entirely.
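
A minimal C sketch of that identity, assuming a 32-bit unsigned type 
(the helper names are made up for illustration and do not come from this 
thread):

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only. */

/* Unary minus on an unsigned value: well defined in C,
   the result is computed modulo 2^32. */
uint32_t neg_unsigned(uint32_t u) { return -u; }

/* The rewrite mentioned above: complement, then add one.
   It yields the same value modulo 2^32. */
uint32_t neg_via_complement(uint32_t u) { return ~u + 1u; }
```

Both helpers map 1 to 0xFFFFFFFF and 0 to 0, so banning -u would lose no 
expressiveness; the rewritten form merely makes the intent explicit.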

> My suspicion is that if you allowed all signed-unsigned operations when 
> at least one was a literal, and made everything else illegal, you'd fix 
> most of the problems. In particular, there'd be a big reduction in 
> people abusing 'uint' as a primitive range-limited int.

Well, part of my attempt is to transform that abuse into legit use. In 
other words, I do want to allow people to consider uint a reasonable 
model of natural numbers. It can't be perfect, but I believe we can make 
it reasonable.

Notice that the fact that one operand is a literal does not solve all of 
the problems I mentioned. There is for example no progress in typing u1 
- u2 appropriately.

> Although it would be nice to have a type which was range-limited, 'uint' 
> doesn't do it. Instead, it guarantees the number is between 0 and 
> int.max*2+1 inclusive. Allowing mixed operations encourages programmers 
> to focus on the benefit of 'the lower bound is zero!' while forgetting that 
> there is an enormous downside ('I'm saying that this could be larger 
> than int.max!')

I'm not sure I understand this part. To me, the larger problem is 
underflow, e.g. when subtracting two small uints results in a large uint.
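
To make the underflow concrete, here is a small C sketch. The function 
names are hypothetical, and the widened variant is only one possible way 
to type the difference, not a proposal from this thread:

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only. */

/* u1 - u2 typed as unsigned: wraps modulo 2^32 whenever u1 < u2. */
uint32_t diff_unsigned(uint32_t u1, uint32_t u2) { return u1 - u2; }

/* Assumed alternative: widen both operands to a signed 64-bit type
   first, so that a negative difference stays negative. */
int64_t diff_widened(uint32_t u1, uint32_t u2)
{
    return (int64_t)u1 - (int64_t)u2;
}
```

With u1 = 2 and u2 = 3, diff_unsigned yields 4294967295, while 
diff_widened yields -1.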

> Interestingly, none of these problems exist in assembly language 
> programming, where every arithmetic instruction affects the overflow 
> flag (for signed operations) as well as the carry flag (for unsigned).

They do exist. You need to use imul/idiv vs. mul/div depending on what 
signedness your operands have.
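
A rough C illustration of why the choice matters, assuming 32-bit 
two's-complement integers (the function names are hypothetical; on x86 a 
compiler would typically pick div for the first and idiv for the second, 
though it may substitute shifts or multiplications when the divisor is a 
known constant):

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only. */

/* Unsigned division: operands are interpreted as values in [0, 2^32). */
uint32_t div_unsigned(uint32_t a, uint32_t b) { return a / b; }

/* Signed division: the same bits are interpreted as two's complement. */
int32_t div_signed(int32_t a, int32_t b) { return a / b; }
```

The bit pattern 0xFFFFFFFE divided by 2 gives 0x7FFFFFFF when read as 
unsigned, but -1 when read as the signed value -2. So the programmer 
still has to pick the instruction that matches the intended signedness.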

Andrei