Mixing operations with signed and unsigned types
bearophile
bearophileHUGS at lycos.com
Mon Jul 5 17:59:04 PDT 2010
Stewart Gordon:
> having language features and APIs relying on signed types
> for semantically unsigned values is not just a way of not forcing you to
> use unsigned types - it's also potentially a way of forcing you not to
> use them, or to pepper your code with casts if you do. I guess you just
> can't please everybody.
It's better to limit the usage of unsigned values in APIs too. While you can't change the APIs of existing C libraries (which may use unsigned values), we can create an ecosystem of D modules that use unsigned values only when they are really necessary (that is, only in uncommon cases). This lets you get away with few signed->unsigned casts in your code (C# libraries are designed like this).
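Here is a minimal sketch of the idea (the function names are mine, invented for illustration): an API that returns a semantically non-negative count as a signed type spares the caller from casts and from unsigned wraparound.

import std.stdio;

// unsigned-flavoured API
uint itemCountU(const int[] data) { return cast(uint) data.length; }

// signed-flavoured API, as C# libraries usually do
int itemCountS(const int[] data) { return cast(int) data.length; }

void main()
{
    const int[] data = [1, 2, 3];
    int wanted = 2;

    // With the unsigned API the subtraction is done in unsigned arithmetic:
    // since the count exceeds 'wanted' the result wraps around to a huge
    // value, and storing it back into an int needs a cast anyway.
    int missing1 = cast(int)(wanted - itemCountU(data));  // -1, after a cast

    // With the signed API no cast is needed and the result is just negative:
    int missing2 = wanted - itemCountS(data);             // -1

    writeln(missing1, " ", missing2);
}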
> Not quite - an unsigned type has twice the range. It's true that this
> extra range isn't always used,
This happens, and it's one of the legitimate use cases of unsigned values, but it requires some conditions to hold:
- The maximum value of the signed type (for example 127, 32767, 2147483647 or 9223372036854775807) is not big enough. This happens.
- The range of the unsigned type is surely big enough in your code. This happens too, but sometimes what overflows the signed type also overflows the unsigned one; it's just one bit more (see the small example after this list).
- You can't use a bigger type (and cent/ucent are not available yet in D). This is less common. Some APIs give you a value of a fixed size/type that you can't change, but in many other situations you can use, for example, a long where an int isn't enough. There are situations where that is bad (for example you have a large array of such values in a performance-critical spot of your program, so using long instead of uint doubles the array size and increases the cache pressure), or you really have a compute-bound spot where doing lots of operations on unsigned 32-bit values gives better performance than using longs, but those situations are not so common. In many situations using a uint instead of a long is premature optimization :-)
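To make the second point concrete, a tiny (and admittedly trivial) D example of how little extra room the unsigned type buys compared to a wider signed type:

import std.stdio;

void main()
{
    writeln(int.max);   //           2147483647
    writeln(uint.max);  //           4294967295  (only one extra bit)
    writeln(long.max);  //  9223372036854775807
}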
> but in some apps/APIs there may be bits
> that use the extra range and bits that don't, and it is often simpler to
> use unsigned everywhere it's logical than to expect the user to remember
> which is which.
The D language has to give you the tools to cope with messy APIs too, but it's better to teach D programmers to create less messy D APIs, ones that use only (or mostly) signed values, etc. Array lengths and indexes are a good place to start improving the APIs of the language itself.
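As an illustration of why lengths and indexes are the sore spot (this is my own example, not something from the thread), here is the classic wraparound that unsigned lengths cause:

import std.stdio;

void main()
{
    int[] a;  // empty array

    // a.length has type size_t, which is unsigned, so a.length - 1 wraps
    // around to size_t.max instead of becoming -1:
    writeln(a.length - 1);  // 18446744073709551615 on a 64-bit system

    // A loop written as "for (size_t i = 0; i + 1 < a.length; i++)" is fine,
    // but the common "i < a.length - 1" form silently becomes an almost
    // endless loop (or an out-of-bounds access) when the array is empty.
}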
> Trouble is that it would add a lot of code to every integer arithmetic
> operation.
This is a quantitative question: in theory this feature can be implemented, and then we can measure how many bytes it adds to the binaries of a number of interesting benchmark programs. My experience with Delphi and C# (tried in small and medium size programs, for years) shows that for me this cost is tolerable in compile time, run time and binary size: compile time is about the same, and the run-time slowdown is usually smaller than the one caused by D's array bounds tests!
I have filed some related bugs for LLVM:
http://llvm.org/bugs/show_bug.cgi?id=4916
http://llvm.org/bugs/show_bug.cgi?id=4917
http://llvm.org/bugs/show_bug.cgi?id=4918
Those are enhancement requests asking the LLVM backend to produce optimal asm when it is fed C code that tests for specific overflows. They show that such overflow tests can add several instructions each if they are written in normal C/D code.
But that overhead can be reduced to about 3 instructions (one of them a jump that is usually not taken, so execution goes straight ahead and on modern CPUs the jump costs very little) if the compiler implements the tests more directly, from small 'templates' written by a human. In several cases the compiler can omit such tests entirely, for example for loop variables, simple operations with enums, etc., reducing both code size and performance loss. Testing overflows of bytes/shorts/ubytes/ushorts needs a little more code, because the CPU flags don't help you much there.
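For reference, this is the kind of portable test I mean, written here in D (my own sketch, not the code attached to those reports): the source-level version needs comparisons and branches before the addition, while the CPU already computes an overflow flag for free.

import std.stdio;

/// True if a + b would overflow a 32-bit signed int.
bool addOverflows(int a, int b)
{
    if (b > 0 && a > int.max - b) return true;  // would exceed int.max
    if (b < 0 && a < int.min - b) return true;  // would go below int.min
    return false;
}

void main()
{
    writeln(addOverflows(int.max, 1));                   // true
    writeln(addOverflows(2_000_000_000, 200_000_000));   // true
    writeln(addOverflows(10, 20));                       // false
}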
And in my programs many operations are done on floating point values, which have no overflow tests, so they incur no speed loss or code size increase.
> Of course, it could be omitted in release builds, but
> arithmetic is so frequent an activity that the extent it would slow down
> and bloat development builds would be annoying.
The overflow tests I am talking about are optional: if you don't want them you can disable them even in development builds. If you don't want them you don't pay for them and you can ignore them. They don't even slow down compilation (unless, by design, they are always switched on during compilation to watch for overflows among compile-time-known values).
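To show what 'optional' means in practice, here is a little sketch of my own (not a feature of D today): a checked addition whose test is compiled out unless you ask for it.

import std.stdio;

int checkedAdd(int a, int b)
{
    version (checkOverflow)  // enabled with: dmd -version=checkOverflow
    {
        // same portable pre-test as in the earlier example
        if ((b > 0 && a > int.max - b) || (b < 0 && a < int.min - b))
            assert(false, "integer overflow in checkedAdd");
    }
    return a + b;  // wraps around silently when the version is not set
}

void main()
{
    writeln(checkedAdd(10, 20));      // 30
    writeln(checkedAdd(int.max, 1));  // asserts with -version=checkOverflow
}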
> But D isn't designed to be fully source-compatible with C, hence the
> suggestion of making it illegal.
From what Andrei and Walter have said this will not happen (maybe because the language and Phobos force you to use too many unsigned values, and we are back to my original idea), so different solutions are needed.
-----------------
Some numbers, using GCC 4.5 and FreePascal 2.4 (fp), on Windows Vista.
Key:
C = C code stripped and max optimized.
fp = FreePascal code max optimized.
fp+r = FreePascal code max optimized + range tests + overflow tests.
Benchmarks:
benchmark       |  binary size, bytes        |  runtime, seconds
                |  C       fp       fp+r     |  C      fp      fp+r
----------------+----------------------------+----------------------------------
nbody           |  11_776  127_336  127_464  |  0.56    0.65    0.66  (N=1_000_000)
old fannkuch    |  12_288   66_011   66_651  |  4.97    5.14    9.54  (N=11)
old mandelbrot  |  11_264   66_661   66_723  |  1.88   10.35   11.89  (size=3000)
fasta           |  13_312   73_748   74_658  |  2.09    2.06    2.06  (N=5_000_000)
recursive       |  18_944   72_980   73_042  |  4.04   11.88   11.90  (N=5_000_000)
nbody is FP-heavy. fannkuch is mostly about small arrays of integers with some integer operations. mandelbrot is FP-heavy but contains bit-twiddling too. fasta contains arrays and integer operations. recursive contains both FP and integer-based operations.
Those are small programs, so neither the size nor the performance changes a lot. But better benchmarks are needed.
Bye,
bearophile