Just a friendly reminder about using arrays in boolean conditions

Mon Nov 18 23:25:05 UTC 2024

On Monday, November 18, 2024 8:39:38 AM MST Dom DiSc via Digitalmars-d wrote:
> On Monday, 18 November 2024 at 12:24:01 UTC, user1234 wrote:
> >> ```d
> >> void v(string s)
> >> {
> >>     if (s.length)           writeln("case length :`", s, "`");
> >>     else if (s is null)     writeln("case null :`", s,  "`");
> >>     else                    writeln("case not null but no length:`", s,  "`");
> >> }
> >> ```
>
> One should always first check for null and then for length. This
> should be immediately clear, as asking for a length doesn't make
> sense if something is null.
> Ok, an array has both a pointer and a length, but I would never
> expect the length to contain something useful, if the pointer is
> not assigned a legal address.

It's really the opposite. If the length is 0, there's no need to look at the
ptr member, and it could be anything. In the case of [] or null arrays, it's
going to be null, and that works perfectly fine in D, because accesses to the
array are bounds checked by default, so if the length is 0, you will never
dereference the pointer no matter what it is. This means that you really
don't need to worry at all about the pointer being null.
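
To make that concrete, here is a minimal sketch (hasElements is a hypothetical
helper I'm making up for illustration, not anything from Phobos). It treats
every zero-length array the same way, whatever ptr happens to hold:

```d
// Hypothetical helper: only the length decides whether there is
// anything to look at. ptr is never consulted.
bool hasElements(const(char)[] s)
{
    return s.length != 0;
}

void main()
{
    string a = null;             // ptr is null, length is 0
    string b = "";               // ptr is non-null, length is 0
    string c = "hello"[0 .. 0];  // ptr points into the literal, length is 0

    // `is null` distinguishes them by ptr...
    assert(a is null);
    assert(b !is null);
    assert(c !is null);

    // ...but for normal use they all behave as the same empty array.
    assert(a == b && b == c);
    assert(!hasElements(a) && !hasElements(b) && !hasElements(c));

    // Iterating a null array is fine: the loop body never runs,
    // so ptr is never dereferenced.
    foreach (ch; a)
        assert(false);

    assert(hasElements("x"));
}
```

Indexing any of the three (e.g. a[0]) would fail the bounds check rather than
touch the pointer, which is why the null case needs no special handling.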

Outside of cases where you're doing something like passing the ptr field to
an extern(C) function, you really shouldn't care at all whether ptr is null,
and it's a definite code smell if code does check for null. D's arrays have
essentially eliminated the need to worry about null at all, whereas
languages that use a pointer for arrays (e.g. C/C++) or use what's
essentially a class reference (e.g. Java) have to worry about null, because
if the array is null, they can't actually do anything with it. D does not
have that problem, because the length lives in the array value itself, right
next to the ptr field, rather than in the memory that ptr points to. That
approach also makes it possible to slice an array to get another
array, which can be really nice.
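
A small sketch of the slicing point (dropEnds is a hypothetical name used only
for this example). The slice is just a new ptr/length pair referring to the
same memory, so nothing gets copied:

```d
// Hypothetical helper: returns the array without its first and last
// elements, as a slice of the original memory.
int[] dropEnds(int[] a)
{
    if (a.length < 2)
        return a[0 .. 0];   // empty slice, valid even if a is null
    return a[1 .. $ - 1];
}

void main()
{
    int[] arr = [10, 20, 30, 40, 50];
    int[] mid = dropEnds(arr);

    assert(mid == [20, 30, 40]);
    assert(mid.ptr == arr.ptr + 1);   // same block, offset by one element

    // Writes through the slice are visible in the original array.
    mid[0] = 99;
    assert(arr[1] == 99);

    // A zero-length slice of a non-empty array has a non-null ptr,
    // and nobody needs to care.
    int[] empty = arr[2 .. 2];
    assert(empty.length == 0);
    assert(empty.ptr !is null);
}
```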

> If I were to implement arrays, I would use a simple pointer, and
> length would be the first element of the allocated block. For any
> object, I have always in mind that it could be implemented in
> this way, so would never access anything as long as the pointer
> is not checked first.

But why would you need to care what the pointer was if the length was 0?
There's no reason to ever dereference it in that case. It _really_ matters
whether the pointer is null when that pointer is your entire access to the
array, but that goes away when you have access to the length without needing
to dereference the pointer. At that point, the value of the ptr really only
matters when the length is greater than zero, because then you have elements
to access via the pointer. But if it's 0, there are no elements, and you can
entirely ignore the value of the pointer.

Also, allocating the length together with the elements would mean that you
couldn't slice arrays. If you wanted a subset of the elements, you'd be forced
to either copy the elements or use a wrapper type which had its own length or
pair of indices.

- Jonathan M Davis