Different types of boolean (Was: Re: DMD 0.148 release)

Mon Feb 27 07:34:25 PST 2006

Georg Wrede skrev:
> Tom wrote:
>> Georg Wrede says...
>>> Ivan Senji wrote:
>>>> Georg Wrede wrote:
>>>>> Ivan Senji wrote:
>>>>>> Georg Wrede wrote:
>>>>>>> Derek Parnell wrote:
>>>>>>>> Georg Wrede wrote:

[snip]

> In this post alone, there's a half dozen references to "my kind of bool" 
> or "your kind of bool", by various people.
> 
> Probably somebody should start making a table of their properties! :-)

Lets try. First the two extremes:

Numerical bool
-------------
True is 1, false is 0. Math is computed in a modulo-2 algebra (the 
result of all operations is computed mod (%) 2).
This bool is an integer type, and should also be implicitly castable to 
int. Analogous to the unsigned integer types:

bool:   mod 2 algebra
ubyte:  mod 256 algebra
ushort: mod 65536 algebra
uint:   mod 4294967296 algebra
ulong:  mod 18446744073709551616 algebra

Boolean operators (&,|,^,~,...) works as expected and are implementable 
efficiently. Arithmetic operators have the following properties:

+ behaves like xor (1+1 = 0)
* behaves like and

This type can be implemented efficiently as an (1, 2, 4, ...)-byte 
integer by either a) always mask the result of any operation cast to a 
bool by by &1 or b) consider all bits but the least significant to be 
undefined (meaning mask &1 when cast to a longer type)

Downside: Even integers (2,4,...) would be cast into a boolean 0 (false).

This is the pure numeric bool type. CPU-friendly.

Logical bool
------------
True and false are the only allowed values of this boolean variable. The 
internal representation is irrelevant.

Boolean operators behaves as expected. Arithmetic operators should 
either be:

a) forbidden, or
b) defined as in boolean algebra (* == and, + == or, ~ == not).

This bool should not be implicitly convertible to anything and should 
not be allowed to take place in arithmetics with other types.

Is this less CPU-friendly than a numerical bool? See comments below.

C++ bool, C99 _Bool
-------------------
C++ and C99 defines its boolean somewhere in between the Numerical and 
the Logical boolean (i.e. gets none of them right). The bool type is 
defined to be a numeric type implicitly castable to int (and therefore 
promotes to int/uint). Casting anything (x) to a bool yields 1 unless x 
== 0. This makes this bool-type quite different from a pure numeric 
type:  (Consider x a boolean variable.)

There are no overflows:
++x, x++, x += 1 all sets x to 1.
But there are underflows:
--x, x--, x -= 1 flips the value of x.

Even though much caution is taken making sure a bool only ever has the 
value 1 or 0, the integer promotion breaks everything:

bool a = true;
bool b = a+a;
assert(b == true); // OK
assert(a+a == true); // FAILS
assert(a+true == true); // FAILS
assert(true+true == true); // FAILS

Hence the rule: Never compare anything to true. (Or never use arithmetic 
operations on a bool.)

Java boolean
------------
The Java boolean follows the definition of Logical bool above, and 
disables arithmetic operators. Bools are never promoted to ints. Bools 
can not even be cast to ints.

There are no caveats using the Java boolean type, but you occasionally 
need to write a bit more code to make the compiler happy.

D bool
------
I'm not fully aware how bools now are defined in D. The basic idea seems 
to be the same as C99/C++, but with one small difference:
- Can not implicitly convert integer literals other than 0 and 1 to bool.

This difference feels slightly inconsistent because you can implicitly 
convert a non literal integer to a bool.

(There is also another difference that is likely a bug (*) )

Can anyone please fill in what other differences there are?

Comments
--------
Integer promotion makes "== true" a trap just waiting to be sprung for a 
numerical bool. The advantages of being able to do arithmetics on a bool 
are cases such as:

bool addPadding = something();
int pos = 10 + addPadding * 5;

Compared to:
int pos = 10 + addPadding ? 5 : 0;

Could help a compiler avoid a branching instruction... But you get the 
branching when forcing the bool to {0,1} instead.

Are there any other advantages?

Could a logical bool not be defined as being true when !0 and false 
otherwise, without the need to clamp it to {0,1}? Would this not be more 
efficient than the current bool? This would mean that the argument 
against opEquals returning a bool would vanish for instance. The only 
thing that would be less efficient would be comparing two boolean 
variables with each other (a rare occurrence) and that would not be less 
efficient than today unless there are many more comparisons than 
assignments.

With dead horses in mind, my suggestion would be to make the D bool a 
_logical_ bool that is implicitly convertible _from_ any type that can 
be used as a condition expression (if(...)) but not _to_ any other type. 
Internal representation is irrelevant (could even be an ABI-thing).

/Oskar

---

*) The other difference in D bool that is likely a bug:
- Like C99/C++, D makes sure that anything (x) cast to a bool yields 1 
unless x == 0, but differs when arithmetics are done on the bool type 
itself:

bool a = 0;
--a;
assert(a == false || a == true); // FAILS(!) (a is set to 255)

bool t = true;
bool b = t+t;
assert(b == true); //OK
bool c = 0;
c++;
c++;
assert(c == true || c == false); // FAILS

But:

bool d = 0;
d += 2;
assert(d == true || d == false); // OK

The last case works due to promotion bool -> int -> bool but the failing 
cases do not take that route.

(being consistent, the definitions of post/predec of a bool should be to 
flip its value and post/preinc should set it to true)