Shouldn't bool be initialized to 0xFF ?
nobody
nobody at mailinator.com
Wed Aug 16 16:16:19 PDT 2006
Lionello Lunesu wrote:
> I've been following those "why's char init'ed to -1?" / "why's float
> init'ed to NaN?" thread, and I'd have to agree with Walter: a crazy
> initialization sure makes it obvious where the problem lies.
>
> So: why isn't "bool" initialized to 0xFF too? In dmd v0.164, bool.init
> is 0, which is a valid value for bool. For byte/int/long I get it, since
> there is no invalid value for byte/int/long. But for bool there is, so
> the same reasoning as char/float applies.
>
> We could even name 0xFF "Not A Bool" ;)
>
> L.
I understand why this idea might seem appealing, but I am fairly certain it is
ultimately a bad idea.
The two examples you gave are different from bool.
In the case of the reals, so much care is taken to ensure that NaNs propagate
freely because (unlike any other basic type) the result of almost any
real-valued operation, even one with valid arguments, may not be representable
as a real number. Using NaN as a default initializer is merely an elegant side
effect. NaNs require special treatment, however. In particular note:
if( NaN <= 0 ) // always fails
if( NaN >= 0 ) // also always fails
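A minimal D sketch of the two comparisons above, relying on the fact that a
float declared without an initializer starts out as NaN. The helper name
comparesWithZero is made up for illustration:

```d
import std.math : isNaN;

/// Hypothetical helper: true if either ordered comparison against zero holds.
bool comparesWithZero(float x)
{
    return x <= 0 || x >= 0;
}

unittest
{
    float f;                        // no initializer: defaults to float.init, a NaN
    assert(isNaN(f));
    assert(!comparesWithZero(f));   // both `f <= 0` and `f >= 0` are false
    assert(f != f);                 // NaN is not even equal to itself
    assert(comparesWithZero(0.0f)); // an ordinary value compares as expected
}
```

Compiled with something like `dmd -unittest -main`, the assertions should pass
silently.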
Obviously the elegance of NaN as an initializer prompted the realization that
0xFF is illegal for any ubyte value in a UTF-8 encoded string. Because any
invalid UTF-8 encoded string should be caught when it is used, it is reasonable
to assume an invalid encoding would raise an Exception somewhere. There is a
subtle distinction between NaN and 0xFF, though. 0xFF by itself does not
represent an entire invalid sequence. A valid sequence can use up to 4 bytes to
encode a single Unicode code point, and other values in various positions can
also invalidate a sequence. 0xC0, 0xC1, 0xF5 and 0xFF are all invalid in any
position. In contrast to the semantics of NaN, in particular note:
if(0xF5) // always succeeds
if(0xFF) // always succeeds
if(0xFF >= 0) // always succeeds
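The contrast can be sketched in D as follows. An uninitialized char holds 0xFF
and still passes every ordinary condition. The bad byte only surfaces when
something validates the string, here via std.utf.validate from Phobos. The
helper names isValidUtf8 and takesThenBranch are made up for illustration:

```d
import std.utf : validate, UTFException;

/// Hypothetical helper: true if `s` passes Phobos' UTF-8 validation.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

/// Hypothetical helper: true if a condition on `c` takes the "then" branch.
bool takesThenBranch(char c)
{
    if (c)
        return true;
    return false;
}

unittest
{
    char c;                      // no initializer: defaults to char.init
    assert(c == 0xFF);
    assert(takesThenBranch(c));  // 0xFF is nonzero, so the condition succeeds
    assert(c >= 0);              // ordered comparisons behave normally too

    char[] s = [c];
    assert(!isValidUtf8(s));     // the lone 0xFF is caught only on validation
    assert(isValidUtf8("ok"));
}
```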
The biggest problem with your suggestion is that, given the existing semantics
of conditions, just changing the initializer would not ordinarily reveal a
problem when a bool value is used:
http://www.digitalmars.com/d/statement.html#if
/Expression/ is evaluated and must have a type that can be converted to a
boolean. If it's true the /ThenStatement/ is transferred to, else the
/ElseStatement/ is transferred to.
So the 'undefined' bool value 0xFF will be converted just like a char or ubyte
and will evaluate as true! Certainly this behavior is even worse than having a
default of false?
One might imagine it should be possible to change the way if expressions work.
But you would then have to change the equivalent for, foreach and while
conditionals as well. The worst part is that, in the general case, conditional
expressions are in fact testing relations on non-bool values, so most of the
time you would be testing whether to throw an Exception for bool's 'undefined'
value in cases where it cannot even occur.
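To make the cost concrete, here is a rough sketch of the check that every
bool-valued condition would need under such a scheme. Nothing here exists in D:
the sentinel notABool and the function checkedCondition are hypothetical, and
the raw value is modelled as a ubyte purely for illustration.

```d
/// Hypothetical sentinel for an "undefined" bool; not part of D.
enum ubyte notABool = 0xFF;

/// Hypothetical check that would have to run before each condition:
/// throw on the sentinel, otherwise apply the usual nonzero test.
bool checkedCondition(ubyte raw)
{
    if (raw == notABool)
        throw new Exception("use of uninitialized bool");
    return raw != 0;
}

unittest
{
    import std.exception : assertThrown;

    assert(checkedCondition(1));
    assert(!checkedCondition(0));
    assertThrown(checkedCondition(notABool)); // the sentinel is rejected
}
```

Every if, for and while would pay for the extra comparison, even when the
tested expression is a relation that can never produce the sentinel.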
A further problem I see comes from my (possibly wrong?) belief that bit[] has
become bool[]. If bool[32] occupies 4 bytes, as I believe, then there is no
room for bool's 'undefined' value. So one must ask: what should bool's default
be in this case?
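The 4-byte assumption is easy to check. The sketch below assumes a current D
compiler rather than dmd 0.164, which has not been tested here. On current
compilers bool is one byte and static arrays are not bit-packed, so bool[32]
takes 32 bytes and each element would in principle have room for a sentinel.
Bit-packed storage is provided separately by std.bitmanip.BitArray in Phobos.
The name boolArrayBytes is made up for illustration:

```d
/// Size in bytes of a 32-element static bool array on the compiler at hand.
enum size_t boolArrayBytes = (bool[32]).sizeof;

unittest
{
    assert(bool.sizeof == 1);
    assert(boolArrayBytes == 32); // one byte per element, not one bit
    assert(bool.init == false);   // the default under discussion
}
```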