The Nullity Of strings and Its Meaning
ag0aep6g via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Sat Jul 8 11:39:47 PDT 2017
On 07/08/2017 07:16 PM, kdevel wrote:
> The assertion in line 6 fails. This failure gave rise to a more general
> investigation on strings. After some research I found that one
> "cannot implicitly convert expression (s) of type string to bool" as in
[...]
> Nonetheless in certain boolean contexts strings convert to bool as here:
>
> void main ()
> {
>     import std.stdio;
>     string s; // equivalent to s = null
>     writeln (s ? true : false);
>     s = "";
>     writeln (s ? true : false);
> }
Yeah, that's considered an "explicit" conversion. The same happens with `if (s)`.
> The code prints
>
> false
> true
>
> to the console. This led me to the insight that in D there are two
> distinct kinds of empty strings: Those having a ptr which is null and
> the other. It seems that this ptr nullity not only determines whether
> the string compares equal to null in an IdentityExpression [1] but also
> the result of the above mentioned conversion in the boolean context.
Yup. Though I'd say the distinction is null vs every other array, not
null vs other empty arrays.
null is one specific array. It happens to be empty, but that doesn't
really matter. `foo is null` compares with the null array. It doesn't
check for emptiness. Conversion to bool also compares with null. The
concept of emptiness is unrelated.
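To make that concrete, here's a minimal sketch. `asBool` is a hypothetical helper I made up for illustration; it just wraps the same `s ? true : false` construct from your snippet.

```d
// Hypothetical helper, only here to make the conversion visible.
bool asBool(string s)
{
    return s ? true : false; // same construct as in the quoted snippet
}

void main()
{
    string a;      // default-initialized, i.e. null
    string b = ""; // empty, but not null

    // Identity with null distinguishes the two ...
    assert(a is null);
    assert(b !is null);

    // ... even though both are empty.
    assert(a.length == 0 && b.length == 0);

    // Conversion to bool follows the null check, not emptiness.
    assert(!asBool(a));
    assert(asBool(b));
}
```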
Maybe detecting empty arrays would be more useful. As far as I know,
there's no killer argument either way. Changing it now would break code,
of course.
Personally, I wouldn't mind if those conversions to bool just went away.
It's not obvious what exactly is being checked, and it's not hard to be
explicit about it with .ptr and/or .length. But as Timon notes, that has
been attempted, and it broke code. So it was reverted, and that's that.
> I wonder if this distinction is meaningful and---if not---why it is
> exposed to the application programmer so prominently.
"Prominently"? It only shows up when you convert to bool. You only get
surprised if you expect that to check for emptiness (or something else
entirely). And you don't really have a reason to expect that. You can
easily avoid the issue by being more explicit in your code (`arr.ptr is
null`, `arr.length == 0`/`arr.empty`).
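A small sketch of what I mean by being explicit. The helper names `isEmpty` and `hasNullPtr` are hypothetical, not anything from Phobos; `empty` for arrays is imported from std.range.primitives.

```d
import std.range.primitives : empty;

// Hypothetical helpers that say exactly what they check.
bool isEmpty(string s)    { return s.length == 0; }
bool hasNullPtr(string s) { return s.ptr is null; }

void main()
{
    string a = null;
    string b = "";
    string c = "x";

    // Emptiness: null and "" are both empty.
    assert(isEmpty(a) && isEmpty(b) && !isEmpty(c));
    assert(a.empty && b.empty && !c.empty);

    // Null pointer: only the null string has one.
    assert(hasNullPtr(a));
    assert(!hasNullPtr(b));
    assert(!hasNullPtr(c));
}
```

Either check reads unambiguously at the call site, which is the whole point.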
> Then today I found this piece of code
>
> void main ()
> {
>     string s = null;
>     string t = "";
>     assert (s is t);
> }
>
> which, according to the wording in [1]
>
> "For static and dynamic arrays, identity is defined as referring to
> the same array elements and the same number of elements."
>
> shall succeed but its assertion fails [2]. I anticipate the
> implementation compares the ptrs even in the case of zero elements.
The spec isn't very clear there. What does "the same array elements"
mean for empty arrays? Can two arrays refer to "the same array elements"
but have different lengths? It seems like "referring to the same array
elements" is supposed to mean "having the same value in .ptr" without
mentioning .ptr.
The implementation obviously compares .ptr and .length.
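Here's a rough sketch of that behavior with slices. `samePtrAndLength` is a hypothetical helper that spells out my reading of what `is` does for arrays; it's not how the compiler is necessarily implemented.

```d
// Hypothetical helper: my reading of what `a is b` checks for arrays.
bool samePtrAndLength(const(int)[] a, const(int)[] b)
{
    return a.ptr is b.ptr && a.length == b.length;
}

void main()
{
    int[] arr = [1, 2, 3];
    int[] copy = arr.dup;           // same contents, separate memory
    int[] none = null;
    int[] emptySlice = arr[0 .. 0]; // empty, but .ptr points into arr

    // Same start and same length: identical.
    assert(arr[0 .. 2] is arr[0 .. 2]);
    // Same start, different length: not identical.
    assert(arr[0 .. 2] !is arr[0 .. 3]);
    // Equal contents elsewhere in memory: equal, not identical.
    assert(copy == arr && copy !is arr);
    // Empty non-null slice vs. null: equal, not identical.
    assert(emptySlice == none && emptySlice !is none);

    // The helper agrees with `is` in these cases.
    assert(samePtrAndLength(arr[0 .. 2], arr[0 .. 2]));
    assert(!samePtrAndLength(arr[0 .. 2], arr[0 .. 3]));
    assert(!samePtrAndLength(copy, arr));
    assert(!samePtrAndLength(emptySlice, none));
}
```

The last pair is your `s is t` case in miniature: both arrays are empty, but only one of them has a null .ptr.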
> A last example of 'deviant behavior' I found is this:
>
> import std.stdio;
> import std.file;
> void main ()
> {
>     string s = null;
>     try
>         mkdir (s);
>     catch (Exception e)
>         e.msg.writeln;
>
>     s = "";
>     try
>         mkdir (s);
>     catch (Exception e)
>         e.msg.writeln;
> }
>
> Using DMD v2.073.2 the first expression terminates the program with a
> segmentation fault. With 2.074.1 the program prints
>
> : Bad address
> : No such file or directory
>
> I find that a bit confusing.
That looks like a bug/oddity in mkdir. null is as valid a string as "".
It shouldn't give a worse exception message.
But the message for `""` isn't exactly good, either. Of course the
directory doesn't exist yet; I'm trying to create it!