Treating the abusive unsigned syndrome

bearophile bearophileHUGS at lycos.com
Tue Nov 25 08:38:02 PST 2008


A few general comments.

Andrei Alexandrescu:

> D pursues compatibility with C and C++ in the following manner: if a 
> code snippet compiles in both C and D or C++ and D, then it should have 
> the same semantics.

I didn't know about such "support" for C++ syntax too. Isn't it for C syntax only? D has very little in common with C++.

This rule is good because you can take a piece of C code and convert it to D with less work and fewer surprises. I have already translated large pieces of C code to D, so I appreciate this.

But in several respects C syntax and semantics are too error-prone or "wrong", so this rule can also become a significant disadvantage for a language like D that tries to be much less error-prone than C.

One solution is to "disable" some of the more error-prone syntax allowed in C, turning it into a compilation error. For example, I have seen newbies write bugs caused by typing & where && was needed. In such a case, just adopting "and" and making "&&" a syntax error solves the problem, and it doesn't lead to bugs when you convert C code to D (you just do a search & replace, replacing && with "and" in the code).
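
To make the & versus && mistake concrete, here is a minimal C sketch. The function names are hypothetical, chosen only for illustration. Both versions compile without complaint, but they disagree whenever the two operands are non-zero and share no set bits.

```c
#include <stdbool.h>

/* Hypothetical helper: the intended check, true when both values are non-zero. */
bool both_nonzero(int a, int b) {
    return a && b;   /* logical AND */
}

/* Hypothetical helper: the same check with the typo. Bitwise AND is legal
   here too, but it is true only when a and b have a set bit in common. */
bool both_nonzero_typo(int a, int b) {
    return a & b;    /* bitwise AND */
}
```

With a = 1 and b = 2, both_nonzero gives true while both_nonzero_typo gives false, because 1 & 2 is 0.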

In other situations it may be harder to find this kind of solution (that is, inventing an alternative syntax/semantics and making the C one a syntax error). In such cases I think it's better to discuss each situation independently. In some situations we can even break the standard way D pursues compatibility, for the sake of avoiding bugs and making the semantics better.


> The disadvantage is that it is more complex

It's not really more complex; it just makes visible some hidden complexity that is already present and inherent in the signed/unsigned nature of the numbers.
It also follows the Python Zen rule: "In the face of ambiguity, refuse the temptation to guess."


> and may surprise the novice 
> in its own way by refusing to compile code that looks legit.

A compile error is better than a potential runtime bug.


> Walter, as many good long-time C programmers, knows the abusive 
> unsigned rule so well he's not hurt by it and consequently has little 
> incentive to see it as a problem.

I'm not a newbie at programming, but in the last year I have put two bugs related to this into my code, so I suggest finding ways to avoid this silly situation. I think the first bug was something like:
if (arr.length > x) ...
where x was a signed int with value -5. (This specific bug can also be solved by making array length a signed value. What's the point of making it unsigned in the first place? I have found that in D it's safer to use signed values everywhere you don't strictly need an unsigned value, and length doesn't need to be unsigned.)
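
The same trap can be sketched in C, where the rule comes from. This is only an illustration: the function names are hypothetical, and size_t stands in for the unsigned array length. It assumes the usual case where size_t is at least as wide as int, so the signed operand is converted to unsigned before the comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: the buggy form. x is implicitly converted to size_t,
   so a negative x turns into a very large positive number. */
bool longer_than_naive(size_t length, int x) {
    return length > x;
}

/* Hypothetical helper: a sign-aware version. Any length is greater than a
   negative x, so that case is handled before the unsigned comparison. */
bool longer_than_checked(size_t length, int x) {
    if (x < 0) {
        return true;
    }
    return length > (size_t)x;
}
```

For a length of 3 and x = -5, longer_than_naive answers false, while longer_than_checked answers true.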

Besides the unsigned/signed problems discussed here, it may be useful to list some other situations where the C syntax/semantics may lead to bugs. For example, does D fix the C semantics of the % (modulo) operation?
Another example: both Pascal and Python3 have two different division operators, one for floating point and one for integers (in Pascal they are / and div, in Python3 they are / and //). So would it be useful for D too to define two different operators for this purpose?
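
Both points can be shown with a small C sketch. It assumes C99 or later, where integer division truncates toward zero and % takes the sign of the dividend. All function names here are hypothetical. floor_mod is one possible "fixed" modulo whose result takes the sign of the divisor, as Python's % does, and the last two functions mimic the split between an integer and a floating point division operator.

```c
/* Hypothetical helper: plain C remainder, the result has the sign of a. */
int trunc_mod(int a, int b) {
    return a % b;
}

/* Hypothetical helper: floored modulo, the result has the sign of b.
   b must not be zero. */
int floor_mod(int a, int b) {
    int r = a % b;
    if (r != 0 && ((r < 0) != (b < 0))) {
        r += b;   /* shift the remainder to the divisor's side of zero */
    }
    return r;
}

/* Hypothetical helper: integer division, like Pascal's div. */
int int_div(int a, int b) {
    return a / b;
}

/* Hypothetical helper: floating point division of the same operands. */
double real_div(int a, int b) {
    return (double)a / (double)b;
}
```

Here trunc_mod(-7, 3) is -1 while floor_mod(-7, 3) is 2, and int_div(7, 2) is 3 while real_div(7, 2) is 3.5. In C the same / spelling gives either division depending on the operand types, which is the ambiguity the two-operator design removes.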

Bye,
bearophile


