null Vs [] return arrays

Tue Apr 5 10:46:06 PDT 2011

On Tue, 05 Apr 2011 13:24:49 -0400, Regan Heath <regan at netmail.co.nz>  
wrote:

> On Fri, 01 Apr 2011 18:23:28 +0100, Steven Schveighoffer  
> <schveiguy at yahoo.com> wrote:
>>
>> assert("" !is null); // works on D.  Try it.
>
> Yes, but that's because this is a string literal.  It's not useful where  
> you're getting your input from somewhere else.. like in the other 2 use  
> cases I mentioned.

But that isn't the same as [].  Basically, if you have an existing array,  
and you want to create a non-null empty array out of it, a slice of [0..0]  
always works.

I know you mention it, but I want to draw attention to the original  
problem, that [] returns a null array.  Other cases where you are not  
using [] or "" are a separate issue.
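
To make the two cases concrete, here is a minimal sketch.  The variable names are made up for illustration; the behavior of [] is the one this thread is discussing.

```d
void main()
{
    int[] data = [1, 2, 3];      // any existing, non-null array
    int[] none = data[0 .. 0];   // zero-length slice; keeps data's pointer

    assert(none.length == 0);
    assert(none.ptr is data.ptr);
    assert(none !is null);       // empty, but not null

    int[] lit = [];              // the empty array literal
    assert(lit.length == 0);
    assert(lit is null);         // the case under discussion: [] yields null
}
```

The slice allocates nothing.  It just reuses the pointer of the array you already had.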

All the cases you have brought up involve strings, for which "" already  
returns a non-null array.  I have yet to see a compelling use case for  
making [] return non-null.

> The other use case may be a little more problematic depending on the  
> method used to read the input from the keyboard, IIRC one of the methods  
> returns null for a blank line of input, which I would have to detect and  
> 'fix' using emptyArray if I wanted to pass it to something that cares  
> about the distinction.

That is up to the implementation of that function.  D provides ways to  
return an empty array that does not have a null pointer.

>> It's one thing to want an array with a non-null pointer, but it's  
>> another thing entirely to want an array with a non-null pointer which  
>> points to a valid heap address.
>
> I don't specifically want either of those things.  I just want _some  
> way_ to represent 'exists but is empty' and for it to be different to  
> 'does not exist'.  Currently D's arrays cannot do that, yet a plain old  
> pointer can.

Of course they can.  "is null" tells you whether the array is null, and  
".empty" tells you whether it has zero length.
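
A small sketch of the two checks side by side.  The variable names are only for illustration, and the import assumes a current Phobos, where empty for arrays lives in std.range.primitives.

```d
import std.range.primitives : empty;

void main()
{
    string missing = null;   // "does not exist"
    string blank = "";       // "exists but is empty"

    // "is null" tells the two states apart.
    assert(missing is null);
    assert(blank !is null);

    // .empty only looks at the length, so it is true for both.
    assert(missing.empty);
    assert(blank.empty);

    // == compares contents, so it cannot tell them apart either.
    assert(missing == blank);
}
```

So code that cares about the distinction has to use "is null"; anything that uses .empty or == treats the two states as the same.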

The issue you may have is that Phobos does not always care about  
preserving this distinction.  One example is dup.  It is pointless to dup  
an empty array (even if non-null) by creating a heap allocation, so it  
just returns null.
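
If you do care about the distinction across a dup, you can wrap it.  This is only a sketch, and dupPreservingEmpty is a made-up name, not anything in Phobos or druntime:

```d
// Hypothetical helper: like dup, but an empty array is handed back
// unchanged, so null stays null and empty-but-non-null stays non-null.
T[] dupPreservingEmpty(T)(T[] src)
{
    if (src.length == 0)
        return src;      // nothing to copy, keep the original pointer
    return src.dup;      // normal case: a fresh copy
}

void main()
{
    int[] data = [1, 2, 3];
    int[] none = null;

    assert(dupPreservingEmpty(none) is null);            // null in, null out
    assert(dupPreservingEmpty(data[0 .. 0]) !is null);   // empty stays non-null

    auto copy = dupPreservingEmpty(data);
    assert(copy == data);                // same contents
    assert(copy.ptr !is data.ptr);       // but a separate allocation
}
```

The empty case still does not allocate, which keeps the reasoning above intact: there is no point in a heap allocation for zero elements.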

>> In my opinion, [] means empty array.  I don't care what the pointer is,  
>> as long as the array is empty.  The implementation can put whatever  
>> value it wants for the pointer.  If it wants to put null, that is  
>> fine.  null means I want a null pointer.
>>
>> If I had it my way, all array literals would be immutable, and the
>> pointers would point to ROM (even empty ones).  We should not be
>> constructing array literals at runtime.  But my opinion is still that
>> you should not count on the pointer being anything because it's not
>> specified what it is.
>
> Sure, I agree with all that, but I still want some way of representing  
> both states and detecting both states and the problem is that if the  
> language cannot do it at a fundamental level, and requires some weird  
> hack or reliance on string literals then when I use any 3rd party  
> library, or phobos itself it will tell me null and I will have to guess  
> which state it actually means and 'fix' it manually.

The array can store whether it is null and empty or just empty.  You are  
expecting every function to care about that distinction, and most  
don't.

>>> That seems to work, but it's hideous syntax for something that is not  
>>> that uncommon IMO.
>> My opinion is that it is uncommon, but it can be abstracted:
>>
>> template emptyArray(T)
>> {
>>    enum emptyArray = (cast(T*)0)[1..1];
>> }
>>
>> rename as desired.
>
> Not useful if you're getting your input from somewhere else, vs trying  
> to create a new empty array.

Again, not relevant.  Getting an empty-but-not-null array from a  
non-null-non-empty array is trivial.  This whole thread is about [].

> That said, if I were to want this I'd use the literal instead as it  
> seems safer, eg.
>
> template emptyArray(T)
> {
>     enum emptyArray = cast(T[])""[0..0];
> }

Either way should be safe.  Nothing should use data outside the array  
bounds.

>> This code seems to disagree with your results for case 5 (dmd 2.052):
>>
>>      auto x = cast(char[])""[0..0];
>>      assert(x.ptr != null); // no failure
>
> Nope, my case 5 had a 'dup' which you're missing.  If I add a new case  
> returning a literal as you have there I get the same result as you.  I  
> was intentionally avoiding the literal because I knew it would be  
> non-null (I believe D null terminates literals) and because I want to be  
> able to detect these states on more than just empty string literals.

Quoting from your message previously (with added comment):

	case 4:
		return cast(char[])"".dup;
	case 5:
		return cast(char[])""[0..0]; // note lack of .dup
	}

-Steve