Null references (oh no, not again!)

Wed Mar 4 05:03:03 PST 2009

Walter Bright:

Thank you for your answer.
I think you have programmed plenty in Pascal-like languages that support arithmetic overflow checks, so all/some of the things I write below may sound obvious to you.
I base what I say on many years of experience programming in languages that let me switch on such arithmetic overflow checks, so your words aren't going to change my mind. Seeing how "new" languages like C#, Clojure, Python and so on have ways to avoid the arithmetic bugs I was talking about, I think I'm not alone.

>There is a SafeInt class built for C++. It should be quite doable for D without needing any particular language support. That kind of thing is precisely what operator overloading is for. The nice thing about it is anyone can write and use such a class - no need to convince anyone else of its merits.<

You have to modify the code in many places to use it.
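
To make the point concrete, here is a minimal C++ sketch. CheckedInt is a hypothetical wrapper written only for this illustration, it is not the API of the real SafeInt class. It shows that operator overloading does give you the check, but only for code whose declarations have been rewritten to use the wrapper type:

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical minimal wrapper (NOT the real SafeInt API).
struct CheckedInt {
    int value;
    explicit CheckedInt(int v) : value(v) {}
};

// Checked addition: test the operands before adding, because signed
// overflow is undefined behaviour in C++ and can't be detected afterwards.
CheckedInt operator+(CheckedInt a, CheckedInt b) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if ((b.value > 0 && a.value > max - b.value) ||
        (b.value < 0 && a.value < min - b.value))
        throw std::overflow_error("integer overflow in +");
    return CheckedInt(a.value + b.value);
}

// Existing code keeps using plain int, so it gets no check at all.
int plain_sum(int a, int b) { return a + b; }

// To get the check, the signature (and every caller) has to be edited.
CheckedInt checked_sum(CheckedInt a, CheckedInt b) { return a + b; }
```

Every function like plain_sum that is left untouched stays unchecked, which is the problem with an opt-in class.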

I'm sure there are Safe-something classes for array bounds too, but they can't prevent most out-of-bounds errors, because very few people use them everywhere in their programs: programmers are lazy.

So I think a SafeInt class isn't very useful. Most or all of the C++ programs I see around don't use it.
Google code search lists 347 usages of the word 'SafeInt' in C++ code:
http://www.google.com/codesearch?hl=en&sa=N&q=SafeInt++lang:c%2B%2B&ct=rr&cs_r=lang:c%2B%2B

>Or you could change the compiler to throw an exception on any integer arithmetic overflow. Sounds great, right? Consider that there's no hardware support for this, so the following would have to happen:<
>This is going to slow things down and bloat up the code generation. But wait, it gets worse. The x86 has a lot of complex addressing modes that are used for fast addition, such as:<
>None of these optimizations could be used if checking is desired.<

LLVM will have intrinsics to support such things. LDC may use them with a small amount of extra code in the front-end.

Some of those checks can be avoided, because sometimes you can infer that the operation can't overflow.

I have turned on such checks hundreds of times in Pascal, Turbo Pascal, Delphi, and Free Pascal programs, and they have let me spot bugs that are far worse than some slowdown during debugging. I have often written prototypes of programs in Python because it avoids such overflow bugs.

I like D also because it allows me to write fast programs, but in most programs most of the code isn't performance-critical, so a lot of code isn't hurt much by such checks. That's why a large percentage of programs can today be written in managed or scripting languages that are slower or way slower than good C/C++/D programs.

Note that such code bloat and slowdown can also be limited to debug builds only, by disabling the checks locally or globally in release versions, if the programmer wants.
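
A library can only approximate this, but a small C++ sketch gives the idea. DISABLE_OVERFLOW_CHECKS is a hypothetical macro invented for this example (for instance passed as -DDISABLE_OVERFLOW_CHECKS in a release build), and add is a made-up helper; a real compiler switch would apply to every arithmetic operation without any helper calls:

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical build switch: when DISABLE_OVERFLOW_CHECKS is defined,
// the check is compiled out and only the plain addition remains.
inline int add(int a, int b) {
#ifndef DISABLE_OVERFLOW_CHECKS
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if ((b > 0 && a > max - b) || (b < 0 && a < min - b))
        throw std::overflow_error("integer overflow in add");
#endif
    return a + b;
}
```

Defining the macro for a single source file disables the checks locally (say, for a performance-critical module); defining it for the whole build disables them globally.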

Regarding the slowdown, anyone can take an average C# program, compile it with and without such overflow checks, and time the performance difference. (Note that C# is not a low-performance language: its associative arrays are far faster than the current D ones, its GC is far more refined, it adapts itself to 32- and 64-bit CPUs, it can use multiple cores easily (see Parallel LINQ), etc.) If you want, I can run some benchmarks later.

>So, to keep the performance, you'll have to be able to select which one you want, either by a separate parallel set of integer types (doubling the number of types),<

A parallel set of integers doesn't solve the problem, it's just a way to make the situation more complex and messy. (A compiler switch could turn off arithmetic overflow checks for that second set of integral types too, but that sounds strange and not nice.)

>or by having special code blocks, such as:
     checked  // this is what C# does
     {
             x = a + b;
     }
I just don't see that being very popular. Code is full of arithmetic, and adding checked all over the place will not only uglify the code, chances are nearly certain that it will get omitted here and there for operations that might overflow.<

Most of the time you want to switch such checks on and off for the whole program, and maybe switch them off for high-performance modules or for some functions. This is quick and easy to do, and you don't risk omitting some operations.

In past posts I too have proposed a local syntax like the following, though it is less important than the more global switches:

safe(overflow, ...) {
   ...
}

(Like a "static if", it doesn't create a new scope.)

Bye,
bearophile