On C/C++ undefined behaviours (on the term "undefined behaviours")

Bruno Medeiros brunodomedeiros+spam at com.gmail
Wed Oct 6 08:34:15 PDT 2010


On 20/08/2010 17:38, bearophile wrote:
> Three good blog posts about undefined behaviour in C and C++:
> http://blog.regehr.org/archives/213
> http://blog.regehr.org/archives/226
> http://blog.regehr.org/archives/232
>
> In those posts (and elsewhere) the expert author gives several good bites to the ass of most compiler writers.
>
> Among other things in those three posts he talks about two programs as:
>
> import std.c.stdio: printf;
> void main() {
>      printf("%d\n", -int.min);
> }
>
> import std.stdio: writeln;
> void main() {
>      enum int N = (1L).sizeof * 8;
>      auto max = (1L<<  (N - 1)) - 1;
>      writeln(max);
> }
>
> I believe that D can't be considered a step forward in system language programming until it gives a much more serious consideration for integer-related overflows (and integer-related undefined behaviour).
>
> The good thing is that Java is a living example that even if you remove most integer-related undefined behaviours your Java code is still able to run as fast as C and sometimes faster (on normal desktops).
>
> Bye,
> bearophile

Interesting post.

There is an important related issue here. It should be noted that, even 
though the article and the C FAQ say:
"
The C FAQ defines “undefined behavior” like this:

     Anything at all can happen; the Standard imposes no requirements. 
The program may fail to compile, or it may execute incorrectly (either 
crashing or silently generating incorrect results), or it may 
fortuitously do exactly what the programmer intended.
"
this definition of "undefined behavior" is not used consistently by C 
programmers, or even by more official sources such as books, or even the 
C standards. A trivial example:

   foo(printf("Hello"), printf("World"));

Since the evaluation order of arguments is not defined in C, these two 
printfs can be executed in either of the two possible orders. The 
behavior is not specified; it is up to the implementation, the compiler 
switches, etc.
Many C programmers would say that such code has/is/produces undefined 
behavior; however, that is clearly not “undefined behavior” as per the 
definition above. A correct compiler cannot cause the code above to 
execute incorrectly, crash, calculate PI, format your hard disk, or 
whatever, as in the other cases. It has to do everything it is 
supposed to do, and the only "undefined" thing is the order of 
evaluation, but the code is not "invalid".
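
To make the point concrete, here is a minimal sketch in C. The names 
first, second, foo, call_log and run_demo are hypothetical, made up 
for this illustration. Both arguments are always evaluated and foo 
always receives the same values; only the recorded order of the two 
calls may differ between compilers or compiler switches.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helpers: each one records that it ran, then returns a tag. */
static int call_log[2];
static int call_count = 0;

static int first(void)  { call_log[call_count++] = 1; return 1; }
static int second(void) { call_log[call_count++] = 2; return 2; }

/* Combines its arguments; the result does not depend on evaluation order. */
static int foo(int a, int b) { return a * 10 + b; }

/* Calls foo with two side-effecting arguments and reports what happened. */
int run_demo(void) {
    call_count = 0;
    int r = foo(first(), second());  /* argument order: up to the compiler */
    assert(call_count == 2);         /* but both calls must have happened */
    printf("result=%d, evaluated %d then %d\n", r, call_log[0], call_log[1]);
    return r;
}
```

Calling run_demo() from main always yields 12 as the result, while 
the "evaluated ... then ..." part of the output may legitimately read 
either way round.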

I don't like this term "undefined behavior". It is an unfortunate C 
legacy that leads to unnecessary confusion and misunderstanding, not 
just in conversation, but often in coding as well. It would not be so 
bad if the programmers had the distinction clear at least in their 
minds, or in the context of their discussion. But that is often not the 
case.

I've called before for this term to be avoided in D vocabulary, mainly 
because Walter often (ab)used the term as per the usual C legacy.
The “undefined behavior” as per the C FAQ should be called something 
else, like "invalid behavior". Code that, when given valid inputs, 
causes invalid behavior should be called invalid code.
(BTW, this maps directly to the concept of contract violations.)
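
As a rough sketch of that mapping, in C and using the -int.min case 
from the quoted message: the hypothetical function negate below states 
its precondition as an assert. This assumes the usual two's-complement 
int, where -INT_MIN is not representable and negating INT_MIN is 
undefined behavior in the C FAQ sense.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: negation with its contract written down.
 * Precondition: x != INT_MIN, because -INT_MIN overflows int on
 * two's-complement implementations (an assumption stated above). */
int negate(int x) {
    assert(x != INT_MIN);  /* contract check; compiled out under NDEBUG */
    return -x;
}
```

A caller that passes INT_MIN violates the contract, so the fault lies 
with the caller; for every other input the function is fully defined.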


-- 
Bruno Medeiros - Software Engineer

