Different types of boolean (Was: Re: DMD 0.148 release)
Oskar Linde
oskar.lindeREM at OVEgmail.com
Mon Feb 27 07:34:25 PST 2006
Georg Wrede skrev:
> Tom wrote:
>> Georg Wrede says...
>>> Ivan Senji wrote:
>>>> Georg Wrede wrote:
>>>>> Ivan Senji wrote:
>>>>>> Georg Wrede wrote:
>>>>>>> Derek Parnell wrote:
>>>>>>>> Georg Wrede wrote:
[snip]
> In this post alone, there's a half dozen references to "my kind of bool"
> or "your kind of bool", by various people.
>
> Probably somebody should start making a table of their properties! :-)
Lets try. First the two extremes:
Numerical bool
-------------
True is 1, false is 0. Math is computed in a modulo-2 algebra (the
result of all operations is computed mod (%) 2).
This bool is an integer type, and should also be implicitly castable to
int. Analogous to the unsigned integer types:
bool: mod 2 algebra
ubyte: mod 256 algebra
ushort: mod 65536 algebra
uint: mod 4294967296 algebra
ulong: mod 18446744073709551616 algebra
Boolean operators (&,|,^,~,...) works as expected and are implementable
efficiently. Arithmetic operators have the following properties:
+ behaves like xor (1+1 = 0)
* behaves like and
This type can be implemented efficiently as an (1, 2, 4, ...)-byte
integer by either a) always mask the result of any operation cast to a
bool by by &1 or b) consider all bits but the least significant to be
undefined (meaning mask &1 when cast to a longer type)
Downside: Even integers (2,4,...) would be cast into a boolean 0 (false).
This is the pure numeric bool type. CPU-friendly.
Logical bool
------------
True and false are the only allowed values of this boolean variable. The
internal representation is irrelevant.
Boolean operators behaves as expected. Arithmetic operators should
either be:
a) forbidden, or
b) defined as in boolean algebra (* == and, + == or, ~ == not).
This bool should not be implicitly convertible to anything and should
not be allowed to take place in arithmetics with other types.
Is this less CPU-friendly than a numerical bool? See comments below.
C++ bool, C99 _Bool
-------------------
C++ and C99 defines its boolean somewhere in between the Numerical and
the Logical boolean (i.e. gets none of them right). The bool type is
defined to be a numeric type implicitly castable to int (and therefore
promotes to int/uint). Casting anything (x) to a bool yields 1 unless x
== 0. This makes this bool-type quite different from a pure numeric
type: (Consider x a boolean variable.)
There are no overflows:
++x, x++, x += 1 all sets x to 1.
But there are underflows:
--x, x--, x -= 1 flips the value of x.
Even though much caution is taken making sure a bool only ever has the
value 1 or 0, the integer promotion breaks everything:
bool a = true;
bool b = a+a;
assert(b == true); // OK
assert(a+a == true); // FAILS
assert(a+true == true); // FAILS
assert(true+true == true); // FAILS
Hence the rule: Never compare anything to true. (Or never use arithmetic
operations on a bool.)
Java boolean
------------
The Java boolean follows the definition of Logical bool above, and
disables arithmetic operators. Bools are never promoted to ints. Bools
can not even be cast to ints.
There are no caveats using the Java boolean type, but you occasionally
need to write a bit more code to make the compiler happy.
D bool
------
I'm not fully aware how bools now are defined in D. The basic idea seems
to be the same as C99/C++, but with one small difference:
- Can not implicitly convert integer literals other than 0 and 1 to bool.
This difference feels slightly inconsistent because you can implicitly
convert a non literal integer to a bool.
(There is also another difference that is likely a bug (*) )
Can anyone please fill in what other differences there are?
Comments
--------
Integer promotion makes "== true" a trap just waiting to be sprung for a
numerical bool. The advantages of being able to do arithmetics on a bool
are cases such as:
bool addPadding = something();
int pos = 10 + addPadding * 5;
Compared to:
int pos = 10 + addPadding ? 5 : 0;
Could help a compiler avoid a branching instruction... But you get the
branching when forcing the bool to {0,1} instead.
Are there any other advantages?
Could a logical bool not be defined as being true when !0 and false
otherwise, without the need to clamp it to {0,1}? Would this not be more
efficient than the current bool? This would mean that the argument
against opEquals returning a bool would vanish for instance. The only
thing that would be less efficient would be comparing two boolean
variables with each other (a rare occurrence) and that would not be less
efficient than today unless there are many more comparisons than
assignments.
With dead horses in mind, my suggestion would be to make the D bool a
_logical_ bool that is implicitly convertible _from_ any type that can
be used as a condition expression (if(...)) but not _to_ any other type.
Internal representation is irrelevant (could even be an ABI-thing).
/Oskar
---
*) The other difference in D bool that is likely a bug:
- Like C99/C++, D makes sure that anything (x) cast to a bool yields 1
unless x == 0, but differs when arithmetics are done on the bool type
itself:
bool a = 0;
--a;
assert(a == false || a == true); // FAILS(!) (a is set to 255)
bool t = true;
bool b = t+t;
assert(b == true); //OK
bool c = 0;
c++;
c++;
assert(c == true || c == false); // FAILS
But:
bool d = 0;
d += 2;
assert(d == true || d == false); // OK
The last case works due to promotion bool -> int -> bool but the failing
cases do not take that route.
(being consistent, the definitions of post/predec of a bool should be to
flip its value and post/preinc should set it to true)
More information about the Digitalmars-d-announce
mailing list