disabling unary "-" for unsigned types

Steven Schveighoffer schveiguy at yahoo.com
Thu Feb 18 04:36:30 PST 2010


On Tue, 16 Feb 2010 19:33:11 -0500, Walter Bright  
<newshound1 at digitalmars.com> wrote:

> Steven Schveighoffer wrote:
>> We're not working in Assembly here.  This is a high level language,  
>> designed to hide the complexities of the underlying processor.  The  
>> processor has no idea whether the data in its registers is signed or  
>> unsigned.  The high level language does.  Please use that knowledge to  
>> prevent stupid mistakes, or is that not one of the goals of the  
>> compiler?  I can't believe this is such a hard point to get across.
>
> It's not that I don't understand your point. I do, I just don't agree  
> with it. At this point, we are going in circles so I don't think there's  
> much value in me reiterating my opinions on it, except to say that  
> Andrei and I once spent a great deal of time trying to separate signed  
> from unsigned using the type system. The problem was that expressions  
> tend to legitimately mix up signed and unsigned types together. Trying  
> to tease out the "correct" sign of the result and what the programmer  
> might have intended turned out to be an inscrutable mess of complication  
> that we finally concluded would never work. It's a seductive idea, it  
> just doesn't work. That's why C, etc. allows for easy implicit  
> conversions between signed and unsigned, and why it has a set of  
> (indeed, arbitrary) rules for combining them. Even though arbitrary, at  
> least they are understandable and consistent.

As long as you clarify that you understand I'm not talking about the  
entire field of 2s complement math, but only the small case of negating an  
unsigned and assigning it to an unsigned, I will drop the argument.   
Because it is not clear from your arguments that you understand that.

>
> Back when ANSI C was first finalized, there was a raging debate for  
> years about whether C should use value-preserving or sign-preserving  
> integral promotion rules. There were passionate arguments on both sides,  
> both of which claimed the territory of intuitiveness and obviousness.
>
> The end result was both sides eventually realized there was no correct  
> answer, and that an arbitrary decision was required. It was made (value  
> preserving), and half of the compiler vendors changed their compilers to  
> match, and the rancor was forgotten.

D has made huge strides in creating designs that remove whole classes of  
errors from equivalent C code, just by making certain constructs illegal.   
I don't think that citing the past failures of C is a good way to argue  
why D can't do it.

I'm thankful every time the compiler fails to compile code like

if(x == 5);
    doThis();

>
> For example, let's take two indices into an array, i and j:
>
>      size_t i,j;
>
> size_t is, by convention, unsigned.
>
> Now, to get the distance between two indices:
>
>     auto delta = i - j;
>
> By C convention, delta is unsigned. If i is >= j, which may be an  
> invariant of my algorithm, all is well. If i < j, suddenly delta is a  
> very large value (but it still works, because of wrap around). The point  
> is, there is no correct rule for dealing with the types of i-j. This has  
> consequences:
>
> Now, if j happens instead to be a complicated loop invariant expression  
> (e) in a loop,
>
>      loop
> 	auto delta = i - (e);
>
> we may instead opt to hoist it out of a loop:
>
>      auto j = -(e);
>      loop
>            auto delta = i + j;
>
> and suddenly the compiler spits out error messages? Why can I subtract  
> an unsigned, but not negate one? Such rules are complicated and will  
> seem arbitrary to the user.

First, it would work under my rules.  j would be of type int.  Under my  
rules, negating an unsigned value equates to a signed version of that  
type.  It is the only operator which does so because it's more accurate  
50% of the time over an unsigned type (the other 50% it becomes the same  
value, so either is just as accurate).  In all other cases of operators on  
unsigned types, the result should be unsigned because there's no more  
accurate answer, and most of the time you wish to remain in the same  
type.  To re-iterate, negation of a positive value (as defined by the  
type) implies the user wants a negative value.

Second, I think it's way more clear to do:

auto j = (e);
loop
      auto delta = i - e;

Just looking at this one line throws up red flags for me:

auto delta = i + e; // huh?  Isn't a delta a difference?

The only other rule I proposed is that assigning a negated unsigned to an  
unsigned (or passing it to a function that requires unsigned) would be  
illegal to prevent obvious mistakes.

>>>> The case I'm talking about is the equivalent to doing:
>>>>  x = x / 0;
>>>
>>> Even mathematicians don't know what to do about divide by zero. But  
>>> 2's complement arithmetic is well defined. So the situations are not  
>>> comparable.
>>  Sure they do, the result is infinity.  It's well defined.
>
> I'm not a mathematician, but I believe it is not well defined, which one  
> finds out when doing branch cuts. Per IEEE 754 (and required by D),  
> floating point arithmetic divide by 0 resolves to infinity, but not all  
> FPU hardware conforms to this spec. There is no similar convention for  
> integer divide by 0. This is why the C standard leaves this as  
> "implementation defined" behavior.

I'm not a mathematician either, but I remember in school learning about  
working with infinity, and varying degrees of infinity.  For example, if  
series 1 approaches infinity twice as fast as series 2, then you could say  
series 1 divided by series 2 is equal to 2.  It was some interesting  
stuff, and I think dealing with infinity is mostly theoretical.  Applying  
this to computer programming is somewhat specialized, but my point was  
just as a comparison to something else that is mostly always an error.   
There are several other constructs I could use, interestingly enough, most  
are dealing with one specific literals, where negation of an unsigned  
results in an error for over 2 billion values.

-Steve



More information about the Digitalmars-d mailing list