Checking if a string is null

Regan Heath regan at netmail.co.nz
Wed Jul 25 06:00:27 PDT 2007


>> So, "" is < and == null!?
>> and <=,== but not >=!?
>>
> 
> You didn't update all writefln's :)

<hangs head in shame> What can I say, I'm having a bad morning.

> Anyway, it feels like an undefined area in the language. Do the specs
> say anything about how exactly arrays/strings/delegates should compare
> to null? It seems to be more than comparing the pointer part of the
> structs.

Not that I can find.  The array page does say:

"Strings can be copied, compared, concatenated, and appended:"
..
"with the obvious semantics."

but not much more on the topic.  Under "Array Initialization" we see:

     * Pointers are initialized to null.
     ..
     * Dynamic arrays are initialized to having 0 elements.
     ..

Which does not state that an array will be initialised to "null" but 
rather to something with 0 elements.

To my mind something with 0 elements is 'empty' as opposed to being
'non-existent', which is typically represented by 'null' or a similar
value (like NaN for floats, 0xFF for char, etc).

So, it seems the spec is hinting/saying that arrays cannot be
non-existent, only empty (or not empty).

And yet in the current implementation there is clearly a difference 
between 'null' and "" when it comes to arrays.

I'm still firmly in favour of there being 3 distinct states for an array:
  * non-existent (null)
  * empty        ("", length == 0)
  * not empty    (length > 0)

That said, I'm also firmly in favour of not getting a seg-fault when I
have a reference to a non-existent array (we currently have this
behaviour and it's perfect).
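
To make the three states concrete, here is a minimal sketch. It assumes a current D2 compiler (hence `string` rather than `char[]` for literals), and the variable names are mine, purely for illustration:

```d
import std.stdio;

void main()
{
    string missing = null;   // non-existent: no data pointer, length 0
    string empty   = "";     // empty: length 0, but the array exists
    string full    = "abc";  // not empty: length 3

    // Reading .length from a null array is safe; there is no seg-fault.
    writefln("missing: length=%s, is null=%s", missing.length, missing is null);
    writefln("empty:   length=%s, is null=%s", empty.length, empty is null);
    writefln("full:    length=%s, is null=%s", full.length, full is null);
}
```

The 'is' test is the one that tells the first two states apart; length alone cannot, since it is 0 for both.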

All I think needs 'fixing' is this, going back to your initial test case:

char[] s = "";

if (s is null) writefln("s is null");
if (s == null) writefln("s == null");

Neither of these tests should evaluate to 'true'.

The fact that the latter does indicates to me that the array compare is 
first comparing length, seeing they're both 0 and assuming the arrays 
must be equal.

I think instead it should also check the data pointer because in the 
case of "" the data pointer is non-null.  The same is true for a zero 
length slice i.e. s[0..0], it exists (data pointer is non-null) but is 
empty (length is zero).
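
The pointer claim can be checked directly. This is only a sketch, again assuming a current D2 compiler; it inspects the `.ptr` property of an empty literal, a zero-length slice and a null array, then repeats the `==` comparison under discussion:

```d
import std.stdio;

void main()
{
    string s = "hello";
    string literal = "";         // empty string literal
    string slice   = s[0 .. 0];  // zero-length slice of existing data
    string nothing = null;       // no data at all

    // Only 'nothing' should report a null data pointer.
    writefln("literal: ptr is null? %s", literal.ptr is null);
    writefln("slice:   ptr is null? %s", slice.ptr is null);
    writefln("nothing: ptr is null? %s", nothing.ptr is null);

    // The comparison in question: == looks at length and contents only.
    writefln("literal == null? %s", literal == null);
    writefln("slice == null?   %s", slice == null);
}
```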

In short, the compare function should recognise the 3 states:
  * non-existent (data pointer is null)
  * empty        (data pointer is non-null, length is zero)
  * not empty    (length is greater than zero)

and never make the mistake of calling an array in one state equal to an 
array in another state.
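
As a rough sketch of what I mean (not how the runtime does or should implement it), a compare along these lines would keep the three states apart. The name `strictEquals` is hypothetical, and I have restricted it to character arrays to keep it short:

```d
import std.stdio;

/// Hypothetical helper: two arrays are equal only when they are in the
/// same state (non-existent, empty, or not empty) and, if not empty,
/// hold the same elements.
bool strictEquals(const(char)[] a, const(char)[] b)
{
    // A null data pointer marks the non-existent state; if exactly one
    // side is non-existent the arrays cannot be equal.
    if ((a.ptr is null) != (b.ptr is null))
        return false;

    // Otherwise fall back to the usual length and element-wise compare,
    // which already separates empty from not empty.
    return a == b;
}

void main()
{
    string s = "hello";
    writeln(strictEquals(null, null));     // both non-existent
    writeln(strictEquals("", null));       // empty vs non-existent
    writeln(strictEquals("", s[0 .. 0]));  // both empty
    writeln(strictEquals("abc", "abc"));   // both not empty, same contents
}
```

Only the second call should report false, since it is the only one comparing arrays in different states.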

Regan

p.s. I am cross-posting and setting followup to digitalmars.D as it has 
become more of a theory/discussion on D than a learning exercise :)

p.p.s. Plus, I figure if Manfred cannot recall a discussion on this topic
we probably need another one about now.

