Checking if a string is null

Regan Heath regan at netmail.co.nz
Wed Jul 25 06:29:47 PDT 2007


>> I'm in the 'distinguishable' camp.  I can see the merit.  At the very 
>> least it should be consistent!
> 
> They *are* distinguishable. That's why above code returns different 
> results for the 'is' comparison...

True.  I guess what I meant to say was I'm in the '3 distinct states' 
camp (which may be a camp of 1 for all I know).  See my reply to 
digitalmars.D for a definition of the 3 states.

> I for one am perfectly fine with "cast(char[]) null" meaning ".length == 
> 0 && .ptr == null" 

Same here.

> and with comparisons of arrays using == and friends
> only inspecting the contents (not location) of the data.

I don't think an empty string (non-null, length == 0) should compare 
equal to a non-existent string (null, length == 0).  And vice-versa.

The only thing that should compare equal to null is null.  Likewise an 
empty array should only compare equal to another empty array.

My reasoning for this is consistency; see the end of this post.

Aside: If the location and length are identical you can short-circuit 
the compare, returning true and ignoring the content.  This could save a 
bit of time on comparisons of large arrays.
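
To make the aside concrete, here is a minimal sketch of such a 
short-circuit.  The name equalsWithShortCircuit is hypothetical, it is 
written in current D syntax, and it is not what the runtime does:

```d
// Hypothetical helper for illustration only, not the runtime's code.
bool equalsWithShortCircuit(T)(T[] u, T[] v)
{
    if (u.length != v.length) return false;
    // Same start address and same length: both slices cover the same
    // memory, so the element loop can be skipped.
    if (u.ptr is v.ptr) return true;
    foreach (i; 0 .. u.length)
    {
        if (u[i] != v[i]) return false;
    }
    return true;
}

void main()
{
    auto big = new int[](1_000_000);
    auto sameSlice = big;      // shares pointer and length with big
    auto copy = big.dup;       // same contents, different memory

    assert(equalsWithShortCircuit(big, sameSlice));      // fast path
    assert(equalsWithShortCircuit(big, copy));           // element loop
    assert(!equalsWithShortCircuit(big, big[0 .. 10]));  // lengths differ
}
```

One caveat with this sketch: for element types where a value can be 
unequal to itself (a floating point NaN, say), the fast path returns true 
where the element loop would return false.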

> Now, about comparisons: array comparisons basically operate like this:
> ---
> int opEquals(T)(T[] u, T[] v) {              // bah to int return type
>     if (u.length != v.length) return false;
>     for (size_t i = 0; i < u.length; i++) {
>         if (u[i] != v[i]) return false;
>     }
>     return true;
> }
> 
> int opCmp(T)(T[] u, T[] v) {
>     size_t len = min(u.length, v.length);
>     for (size_t i = 0; i < len; i++) {
>         if (auto diff = u[i].opCmp(v[i])) {
>             return diff;
>         }
>     }
>     return cast(int)u.length - cast(int)v.length;
> }
> ---
> (Taken from object.TypeInfo_Array and converted to templates instead of 
> void*s + casting + element TypeInfo.{equals/compare} for readability)

Thanks.

> Since both the null string and "" have .length == 0, that means they 
> compare equal using those methods (having no contents to compare and 
> equal length)

This is the bit I don't like.

> This is all perfectly consistent (and even useful) to me...

It's not consistent with other reference types, types which can 
represent 'non-existent', eg.

   char *p = null;  //non-existent

   if (p == null) writefln("p == null");
   if (p == "") writefln("p == \"\"");

Output:
   p == null

Compare that to:

   char[] p = null;

   if (p == null) writefln("p == null");
   if (p == "") writefln("p == \"\"");

Output:
   p == null
   p == ""
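
For anyone who wants to check the array case, here is a small sketch in 
current D syntax (an assumption on my part; the snippets above are older 
style).  It uses asserts instead of writefln and also shows the 'is' 
comparison mentioned at the top of this post:

```d
void main()
{
    char[] p = null;

    // '==' looks at length and contents only, so a null array and ""
    // cannot be told apart with it.
    assert(p == null);
    assert(p == "");

    // 'is' compares pointer and length, so it does see the difference.
    assert(p is null);
    assert(!("" is null));   // the literal "" has a non-null pointer
}
```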

All that I would like changed is for the compare, in the case of length 
== 0, to check the data pointers, eg.

 > int opEquals(T)(T[] u, T[] v) {
 >     if (u.length != v.length) return false;
       if (u.length == 0) return (u.ptr == v.ptr);
 >     for (size_t i = 0; i < u.length; i++) {
 >         if (u[i] != v[i]) return false;
 >     }
 >     return true;
 > }

This should mean "" == "" but not "" == null, likewise null == null but 
not null == "".
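
A rough, runnable sketch of the behaviour I'm after, as a free function 
in current D syntax.  The name distinctEquals is hypothetical.  Note one 
assumption that differs from the line added above: for empty arrays the 
sketch compares null-ness rather than the raw pointers, so that two 
empty arrays at different addresses still compare equal:

```d
// Hypothetical free function; the built-in '==' does not behave this way.
bool distinctEquals(T)(T[] u, T[] v)
{
    if (u.length != v.length) return false;
    // Assumption: empty arrays are equal when both are null or both are
    // non-null, whatever address the non-null ones point at.
    if (u.length == 0) return (u.ptr is null) == (v.ptr is null);
    foreach (i; 0 .. u.length)
    {
        if (u[i] != v[i]) return false;
    }
    return true;
}

void main()
{
    string abc = "abc";
    string nothing = null;          // null pointer, length 0
    string empty = "";              // non-null pointer, length 0
    string emptySlice = abc[0 .. 0]; // non-null pointer into abc, length 0

    assert(distinctEquals(nothing, nothing));   // null == null
    assert(distinctEquals(empty, empty));       // "" == ""
    assert(distinctEquals(empty, emptySlice));  // empty == empty
    assert(!distinctEquals(empty, nothing));    // "" != null
    assert(!distinctEquals(nothing, empty));    // null != ""
    assert(distinctEquals(abc, "abc"));         // contents still compared
}
```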

Regan

