is ==

Mon May 21 19:20:02 UTC 2018

On Monday, May 21, 2018 14:40:24 Steven Schveighoffer via Digitalmars-d-
learn wrote:
> On 5/21/18 2:05 PM, Jonathan M Davis wrote:
> > The core problem here is that no one reading a piece of code has any way
> > of knowing whether the programmer knew what they were doing or not when
> > using == null with an array, and the vast majority of newbies are not
> > going to have understood the semantics properly. If I know that someone
> > like you or Andrei wrote the code, then the odds are good that what the
> > code does is exactly what you intended. But for the average D
> > programmer? I don't think that it makes any sense to assume that,
> > especially since anyone coming from another language is going to assume
> > that == null is checking for null, when it's not.
>
> For me, the code smell is using arr is null (is it really necessary to
> check for a null pointer here?), for which I always have to look at more
> context to see if it's *really* right.

Really? I would never expect anyone to use is unless they really cared about
whether the array was null. I'd be concerned about whether the code in general
was right, because treating null as special gets tricky, but that particular
line wouldn't concern me.

> Even people who write == null may want to check for null thinking that
> it's how you check an array is empty, not realizing that it *doesn't*
> check for a null pointer, *AND* it still does exactly what they need it
> to do ;)

You honestly expect someone first coming to D to check whether an array is
empty by comparing it against null? That's a bizarre quirk of D that I have
never seen anywhere else. I would never expect anyone to purposefully use
== null to check for empty unless they were very familiar with D, and even
then, I'd normally expect them to ask what they really mean, which is
whether the array is empty.
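
To make the distinction concrete, here is a minimal sketch. The helper names
isNullArray and equalsNull are hypothetical and exist only for illustration.
It contrasts a default-initialized array with an empty slice of a non-empty
array:

```d
/// Hypothetical helper: true only when the pointer is null and the length is 0.
bool isNullArray(const(int)[] arr) { return arr is null; }

/// Hypothetical helper: true whenever the array has no elements, null or not.
bool equalsNull(const(int)[] arr) { return arr == null; }

void main()
{
    int[] unset;                     // default-initialized: null pointer, length 0
    int[] full = [1, 2, 3];
    int[] emptySlice = full[0 .. 0]; // length 0, but the pointer is not null

    // A null array is both "is null" and "== null".
    assert(isNullArray(unset) && equalsNull(unset));

    // An empty, non-null slice is "== null" but not "is null".
    assert(!isNullArray(emptySlice) && equalsNull(emptySlice));

    // Saying what you mean: check the length (or use std.range's empty).
    assert(unset.length == 0 && emptySlice.length == 0);

    // A non-empty array is neither.
    assert(!isNullArray(full) && !equalsNull(full));
}
```

So == null answers "is it empty?", whereas is null answers "is it actually
null?", which is why I'd rather see the length checked directly when empty is
what's meant.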

> > It's the same reason that
> >
> > if(arr)
> >
> > was temporarily out of the language.
>
> It's similar, but I consider it a different reason. While the intent of
> == null may not be crystal clear, 99% of people don't care about the
> pointer, they just care whether it's empty. So the default case is
> usually good enough, even if you don't know the true details.

I think that that's the key point of disagreement here. I would never
consider the intent of == null to be crystal clear based solely on the code,
because it is so common outside of D to use == null to actually check for
null, and there are better ways in D to check for empty if that's what you
really mean. My immediate expectation on seeing arr == null is that the
programmer does not properly understand arrays in D. If I knew that someone
like you wrote the code, I'd probably decide that you knew what you were
doing and didn't make a mistake, but I'm not going to assume that in
general, and honestly, I would consider it bad coding practice (though we
obviously disagree on that point).

I would consider the if(arr) and arr == null cases to be exactly the same.
They both are red flags that the person in question does not understand how
arrays in D work. Yes, someone who knows what they're doing may get it
right, but I'd consider both to be code smells and I wouldn't purposefully
do either in my own code. If I found either in my own code, I would expect
that I'd just found a careless bug.

> > At this point, I'm honestly inclined to think that we never should have
> > allowed null for arrays. We should have taken the abstraction a bit
> > further and disallowed using null to represent dynamic arrays. It would
> > then presumably still work to do arr.ptr is null, but arr is null
> > wouldn't work, because null wouldn't be an array, and arr == null
> > definitely wouldn't work. Then we could just use [] for empty arrays
> > everywhere, and there would be no confusion, leaving null for actual
> > pointers. And it would almost certainly kill off all of the cases where
> > null was treated as special for dynamic arrays except maybe for when
> > dealing with C code, but in that case, they'd have to use ptr directly.
> > However, at this point, I expect that that's all water under the
> > bridge, and we're stuck with it.
>
> If we never had null be the default value for an array, and used []
> instead, I would be actually OK with that. I also feel one of the
> confusing things for people coming to the language is that arrays are
> NOT exactly reference types, even though null can be used as a value for
> assignment or comparison.
>
> But it still wouldn't change what most people write or mean, they just
> would write == [] instead of == null. I don't see how this would solve
> any of your concerns.

It would solve the concern, because no one is going to write arr == [] to
check for null. They'd write it just like they'd write arr == "". They're
clearly checking for empty, not null. The whole problem here is that pretty
much everywhere other than D arrays, null and empty are two separate things,
and pretty much anyone coming from another language will expect them to be
different. It wouldn't surprise me at all to see a newbie D programmer doing
something like

if(arr != null && arr == arr2)
{...}

I would never expect anyone coming from another language to use arr == null
with the idea that it's actually checking for empty, and given how confusing
dynamic arrays are for many people, it wouldn't surprise me for someone who
has programmed in D for a while to not properly understand the situation. At
some point, they learn, but it's clearly one of those topics that confuses
pretty much everyone at first.
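
As a rough sketch of how that condition goes wrong (nonNullAndEqual is a
hypothetical wrapper around the if above, not something from real code),
consider two empty slices whose pointers are not null:

```d
/// Hypothetical wrapper around the newbie condition shown above.
bool nonNullAndEqual(const(int)[] arr, const(int)[] arr2)
{
    return arr != null && arr == arr2;
}

void main()
{
    int[] full = [1, 2, 3];
    int[] emptySlice = full[0 .. 0]; // empty, pointer is not null
    int[] other = full[3 .. 3];      // also empty, pointer is not null

    // Both arrays are non-null and compare equal...
    assert(emptySlice !is null && other !is null);
    assert(emptySlice == other);

    // ...yet the condition rejects them, because != null means "not empty".
    assert(!nonNullAndEqual(emptySlice, other));

    // It only passes for non-empty, equal arrays.
    assert(nonNullAndEqual(full, [1, 2, 3]));
}
```

The code compiles and mostly behaves, which is exactly why the
misunderstanding can go unnoticed for a long time.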

And out of those who do understand how D dynamic arrays work, a number of
them continue to distinguish between null and empty arrays in their code -
e.g. folks like Andrei and Vladimir who write code that uses

if(arr)

and means it the way the language means it. The core problem is that D
treats null arrays as empty. If it either treated them as actually null
(with all of the segfaults that go with that) or did not treat null as a dynamic
array, then that whole problem would go away. So, if null were not a dynamic
array in any shape or form, and you had to use [] to indicate an empty
array, then that would solve my main concerns with null and dynamic arrays.

Now, that then leaves the issue of folks accessing ptr and treating null as
special there, but if you had to actually access ptr to do that, I suspect
that the practice of treating null arrays as special would go away. But even
if it didn't, the cases where someone was trying to do that would then be
clear, because they'd have to access ptr directly, so it would almost
certainly be something that only folks who knew what they were doing would
do much with, whereas arr == null is something that pretty much any D newbie
is going to try and screw up.
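
For comparison, a small sketch of what the explicit form looks like today
(hasNullPtr is a hypothetical name used only here). Going through ptr leaves
no doubt that the pointer, and not emptiness, is what's being tested:

```d
/// Hypothetical helper: tests the pointer itself, ignoring the length.
bool hasNullPtr(const(int)[] arr) { return arr.ptr is null; }

void main()
{
    int[] unset;                     // null pointer, length 0
    int[] full = [1, 2, 3];
    int[] emptySlice = full[0 .. 0]; // length 0, pointer into full

    assert(hasNullPtr(unset));
    assert(!hasNullPtr(emptySlice)); // empty, but clearly not null
    assert(!hasNullPtr(full));
}
```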

- Jonathan M Davis