Empty VS null array?

Fri Oct 18 11:04:41 PDT 2013

On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:
> On Fri, Oct 18, 2013 at 01:32:58PM -0400, Jonathan M Davis wrote:
> > On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
> > > On 10/18/13 9:26 AM, Max Samukha wrote:
> > > > *That's* bad API design. readln should be symmetrical to writeln,
> > > > not write. And about preserving the exact representation of new
> > > > lines, readln/writeln shouldn't preserve that, pure and simple.
> > > 
> > > Fair point. I just gave one possible alternative out of many. Thing
> > > is, relying on client code to distinguish subtleties between empty
> > > and null strings is fraught with dangers.
> > 
> > Yeah, but the primary reason that it's bad design is the fact that D
> > tries to conflate null and empty instead of keeping them distinct
> > (which is essentially the complaint that was made). Whether that's
> > ultimately good or bad is up for debate, but the side effect is that
> > relying on the difference between null and empty ends up being very
> > bug-prone, whereas in other languages which don't conflate the two, it
> > isn't problematic in the same way, and it's much more reasonable to
> > have the API treat them differently.
> 
> [...]
> 
> IMO, distinguishing between null and empty arrays is bad abstraction. I
> agree with D's "conflation" of null with empty, actually. Conceptually
> speaking, an array is a sequence of values of non-negative length. An
> array with non-zero length contains at least one element, and is
> therefore non-empty, whereas an array with zero length is empty. Same
> thing goes with a slice. A slice is a view into zero or more array
> elements. A slice with zero length is empty, and a slice with non-zero
> length contains at least one element. There's nowhere in this conceptual
> scheme for such a thing as a "null array" that's distinct from an empty
> array. This distinction only crops up in implementation, and IMO leads
> to code smells because code should be operating based on the conceptual
> behaviour of arrays rather than on the implementation details.

In most languages, an array is a reference type, so there's the question of 
whether it's even _there_. There's a clear distinction between having a null 
reference to an array and having a reference to an empty array. This is 
particularly clear in C++ where an array is just a pointer, but it's true in 
plenty of other languages that don't treat arrays as pointers (e.g. Java).

The problem is that D put the length on the stack alongside the pointer, 
making it so that D arrays are sort of reference types and sort of not. The 
pointer is a reference type, but the length is a value type, making the 
dynamic array half and half. If it were fully a reference type, then there 
would be no problem with distinguishing between null and empty arrays. A null 
array is simply a null reference to an array. But since D arrays aren't quite 
reference types, that doesn't work.
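
To make the half-and-half behaviour concrete, here is a minimal sketch in D. 
The helper names (isNullArray, isEmptyArray, equalsNull, emptyButNotNull) are 
made up for illustration and are not part of druntime or Phobos.

```d
// Sketch only: these helper names are hypothetical.

// `is null` looks at the slice itself (pointer and length).
bool isNullArray(const(int)[] a) { return a is null; }

// Checking the length is the usual way to ask "is it empty?".
bool isEmptyArray(const(int)[] a) { return a.length == 0; }

// `== null` compares contents, so any zero-length slice matches.
bool equalsNull(const(int)[] a) { return a == null; }

// A zero-length slice of existing data: empty, but its pointer is set.
int[] emptyButNotNull()
{
    int[] backing = [1, 2, 3];
    return backing[0 .. 0];
}
```

With these, a null array reports true from all three checks, while the result 
of emptyButNotNull() is empty and equal to null under == but is not null under 
`is`. That gap between the two comparisons is where the bugs tend to come from.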

I see no problem in the abstraction of arrays with having null arrays, because 
a null array is simply a null reference to an array, which is exactly the same 
as having a null object or null pointer. It's the reference that's null, not 
what it points to. It's just D's implementation that's weird. It would be like 
taking some of the member variables of a class and putting them in the 
reference instead of in the object and then discussing how much a null object 
makes sense. It's just bizarre.

Now, D arrays end up working great overall in spite of their semantic 
weirdness, but it does mean that you can't really have proper null arrays in 
the same way that most languages with arrays can, forcing you to either be 
extremely careful when dealing with null and arrays or to waste space doing 
stuff to keep track of nullability separately from the array itself like 
Nullable does.
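
As a rough sketch of the second option, std.typecons.Nullable can carry the 
"is there a value at all" flag next to the array. The nextLine function and 
its pending queue are hypothetical, invented only to show the shape of such 
an API. They are not how readln works.

```d
import std.typecons : Nullable;

// Hypothetical reader: a null result means "no more lines", while a
// non-null result holding "" means "an empty line was read".
Nullable!string nextLine(ref string[] pending)
{
    if (pending.length == 0)
        return Nullable!string.init; // null state: nothing left to read

    string line = pending[0];
    pending = pending[1 .. $];       // consume the first entry
    return Nullable!string(line);    // non-null, even if line is empty
}
```

A caller checks isNull before calling get, so an empty line and the end of 
input can't be confused. The cost is the extra flag stored alongside the 
string.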

- Jonathan M Davis