Array length & allocation question

Oskar Linde oskar.lindeREM at OVEgmail.com
Tue Jun 13 09:53:03 PDT 2006


Bruno Medeiros skrev:
> Oskar Linde wrote:
>>
>> Like this:
>>
>> void foo(char[] arr) {
>>     if (!arr)
>>         writefln("Uninitialized array passed");
>>     else if (arr.length == 0)
>>         writefln("Zero length array received");
>> }
>>
>> /Oskar
> 
> This is not safe to do. Currently in D null arrays and zero-length 
> arrays are conceptually the same. It just so happens that sometimes the 
> arr.ptr is null and sometimes not, depending on the previous operations.
> The "A 'dup'ed empty string is now a null string." is an example of why 
> that is not safe. I thought you knew this already? This is nothing new.

Yeah, I knew about that. I did not mean to imply that D is flawless in 
this regard. The cases given were:

foo(""); and char[] s; foo(s);

And for those, the above function works. My only point, if I had one, 
was that there are differences between zero length arrays and null 
arrays in some cases in D.
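
A minimal sketch of that difference, for the two cases above only. The helper name isNullArray is made up for this example, and the check relies on an empty string literal having a non-null pointer:

```d
import std.stdio;

// Hypothetical helper: true only when the array reference itself is null.
bool isNullArray(in char[] arr)
{
    return arr is null;
}

void main()
{
    char[] s;                   // never assigned: null, length 0
    assert(isNullArray(s));
    assert(s.length == 0);

    assert(!isNullArray(""));   // empty literal: length 0, but not null
    assert("".length == 0);

    writefln("a null array and an empty literal are distinguishable");
}
```

As noted in the quoted reply, this says nothing about arrays that became empty through other operations such as dup.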

> BTW, I do find it (at first sight at least) unnatural that a null array 
> is the same as a zero-length array. It doesn't seem conceptually 
> right/consistent.

In my view, D's dynamic arrays are quite different from a conceptually 
ideal array.

Conceptually, I see an array as an ordered collection of elements. The 
elements belong to (or are part of) the array.

One could imagine such arrays as both value and reference types. For a 
reference type ideal array, there has to be a clear difference between 
null and zero length. A value type ideal array on the other hand would 
not need such a distinction.

Another conceptual entity apart from an array is an array view. An array 
view refers to a selection of indices of another array. For example, a 
range of indices (aka a slice). An array view may or may not remain 
valid when the referred array changes.

D's dynamic array is quite far from my ideal array, in both its reference 
and its value versions. A closer match is actually a by-value array slice.

Does it make sense for a by-value array slice type to discriminate 
between null and zero-length? I would say that it has its uses. For 
example, a regexp could match a zero length portion of a string. It is 
still important to know where in the string the match was made.
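
A small sketch of why the position matters. No regexp library is involved; the zero-length "match" is just a hand-made slice, and matchOffset is a hypothetical helper name:

```d
// Hypothetical helper: offset of a slice within the array it was taken from.
// Only meaningful when part really is a slice of whole.
size_t matchOffset(in char[] whole, in char[] part)
{
    return part.ptr - whole.ptr;
}

void main()
{
    auto subject = "hello";
    auto match = subject[2 .. 2];   // zero-length slice between "he" and "llo"

    assert(match.length == 0);
    assert(match !is null);                    // empty, but it still points into subject
    assert(matchOffset(subject, match) == 2);  // the position is recoverable
}
```

If the empty slice were collapsed to null, the offset would be lost.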

D's arrays have both the role of a non-reference array and of an array 
slice. In the role of a non-reference array, it makes sense that null 
is equivalent to zero-length. In the role of an array slice on the other 
hand, it does make sense to discriminate between zero length and null. 
There are other differences. Appending elements only makes sense for the 
array role, not the slice role. dup creates an array from a slice or an 
array. It therefore makes sense that dup returns null on zero length arrays.

The semantics of some operations depend on the role the array has. D 
has no way of knowing, so it guesses. Take that with a grain of salt, 
but operations on arrays depend on a runtime judgment by the gc.

Take the append operation. Appending elements to a D array that is in 
the array role makes sense and works like a charm. Appending elements to 
an array slice doesn't make any sense, but D will create a new array 
with copies of the elements the slice refers to and append the element 
to that array. The slice has been transformed into an array.

But how does D know when an array is in the slice role or the array 
role? It doesn't. Here is where the (educated) guess comes in. Any array 
that starts at the beginning of a gc chunk is assumed to be an array. 
Otherwise, it is assumed to be a slice. The implications are:

char[] mystr = "abcd".dup;
char[] slice1 = mystr[0..1];
char[] slice2 = mystr[1..2];
slice1 ~= "x"; // alters the original mystr
slice2 ~= "y"; // doesn't alter the original
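
A minimal sketch that checks only the slice2 half of this. Whether the append to slice1 writes into mystr is exactly the runtime guess described above, so the sketch makes no assertion about it:

```d
void main()
{
    char[] mystr = "abcd".dup;
    char[] slice2 = mystr[1 .. 2];   // "b", starts inside mystr's block

    slice2 ~= "y";                   // appending copies the slice into a new array

    assert(slice2 == "by");
    assert(mystr == "abcd");                 // the original is untouched
    assert(slice2.ptr != mystr.ptr + 1);     // slice2 no longer aliases mystr
}
```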

I've written too much nonsense now. Some condensed conclusions:

- D's arrays have a schizophrenic nature (slice vs array)
- The compiler is unable to tell the difference and can't protect you 
against mistakes
- D arrays are not self documenting:

char[] foo(); // <- returns an array or a slice of someone else's array?

/Oskar