Array length & allocation question

Wed Jun 14 09:51:04 PDT 2006

Oskar Linde wrote:
> Bruno Medeiros skrev:
>> Oskar Linde wrote:
>>>
>>> Like this:
>>>
>>> void foo(char[] arr) {
>>>     if (!arr)
>>>         writefln("Uninitialized array passed");
>>>     else if (arr.length == 0)
>>>         writefln("Zero length array received");
>>> }
>>>
>>> /Oskar
>>
>> This is not safe to do. Currently in D null arrays and zero-length 
>> arrays are conceptually the same. It just so happens that sometimes 
>> the arr.ptr is null and sometimes not, depending on the previous 
>> operations.
>> The "A 'dup'ed empty string is now a null string." is an example of 
>> why that is not safe. I thought you knew this already? This is nothing 
>> new.
> 
> Yeah, I knew about that. I did not mean to imply that D is flawless in 
> this regard. The cases given were:
> 
> foo(""); and char[] s; foo(s);
> 
> And for those, the above function works. My only point, if I had one, 
> was that there are differences between zero length arrays and null 
> arrays in some cases in D.
> 
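
For reference, the two cases quoted above can be checked with a minimal
sketch along these lines. The helper name hasNullPtr is made up for
illustration. It only covers these two cases; as noted, the pointer may
or may not be null after further operations such as dup.

```d
// Hypothetical helper (not from the original post): reports whether the
// array's pointer is null, independent of its length.
bool hasNullPtr(in char[] a)
{
    return a.ptr is null;
}

void main()
{
    char[] s;                 // default-initialized: null pointer, length 0

    assert(hasNullPtr(s));    // the "char[] s; foo(s);" case
    assert(!hasNullPtr(""));  // the foo("") case: the literal has storage
    assert(s.length == 0 && "".length == 0);
    assert(s == "");          // == compares contents, so the two are equal
}
```
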
>> BTW, I do find it (at first sight at least) unnatural that a null 
>> array is the same as a zero-length array. It doesn't seem 
>> conceptually right/consistent.
> 
> In my view, D's dynamic arrays are quite different from a conceptually 
> ideal array.
> 
> Conceptually, I see an array as an ordered collection of elements. The 
> elements belong to (or are part of) the array.
> 
> One could imagine such arrays as both value and reference types. For a 
> reference type ideal array, there has to be a clear difference between 
> null and zero length. A value type ideal array on the other hand would 
> not need such a distinction.
> 
> Another conceptual entity apart from an array is an array view. An array 
> view refers to a selection of indices of another array. For example, a 
> range of indices (aka a slice). An array view may or may not remain 
> valid when the referred array changes.
> 
> D's dynamic array is quite far from my ideal array, in both its 
> reference and its value version. A closer match is actually a by-value 
> array slice.
> 
> Does it make sense for a by-value array slice type to discriminate 
> between null and zero-length? I would say that it has its uses. For 
> example, a regexp could match a zero length portion of a string. It is 
> still important to know where in the string the match was made.
> 
> D's arrays have both the role of a non-reference array and of an array 
> slice. In the role of a non-reference array, it makes sense that null 
> is equivalent to zero-length. In the role of an array slice on the other 
> hand, it does make sense to discriminate between zero length and null. 
> There are other differences. Appending elements only makes sense to the 
> array role, not the slice role. dup creates an array from a slice or an 
> array. It therefore makes sense that dup returns null on zero length 
> arrays.
> 
> The semantics of some operations depends on the role the array has. D 
> has no way of knowing, so it guesses. Take that with a grain of salt, 
> but operations on arrays depend on a runtime judgment by the gc.
> 
> Take the append operation. Appending elements to a D array that is in 
> the array role makes sense and works like a charm. Appending elements to 
> an array slice doesn't make any sense, but D will create a new array 
> with copies of the elements the slice refers to and append the element 
> to that array. The slice has been transformed into an array.
> 
> But how does D know when an array is in the slice role or the array 
> role? It doesn't. Here is where the (educated) guess comes in. Any array 
> that starts at the beginning of a gc chunk is assumed to be an array. 
> Otherwise, it is assumed to be a slice. The implications are:
> 
> char[] mystr = "abcd".dup;
> char[] slice1 = mystr[0..1];
> char[] slice2 = mystr[1..2];
> slice1 ~= "x"; // alters the original mystr
> slice2 ~= "y"; // doesn't alter the original
> 

Well, the new things you mentioned are actually more closely related to 
ownership management and reference/object immutability than to arrays 
themselves.
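
One way to sidestep the question of who owns the storage is to copy
explicitly before appending, so the result never depends on whether the
array was in the "slice" or the "array" role. A minimal sketch, where
appendCopy is a made-up name and not anything from Phobos:

```d
// Hypothetical helper (not from the original post): always copies the
// viewed elements first, so the append can never write into storage
// owned by someone else.
char[] appendCopy(in char[] view, in char[] tail)
{
    char[] owned = view.dup;  // fresh array that the caller now owns
    owned ~= tail;            // may grow or reallocate; only affects owned
    return owned;
}

void main()
{
    char[] mystr = "abcd".dup;
    char[] r = appendCopy(mystr[0 .. 1], "x");

    assert(r == "ax");
    assert(mystr == "abcd");  // the original is untouched
}
```

The cost is an extra allocation on every call, which is the price of
not having to guess at ownership.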

> I've written too much nonsense now. Some condensed conclusions:
> 
> - D's arrays have a schizophrenic nature (slice vs array)
> - The compiler is unable to tell the difference and can't protect you 
> against mistakes
> - D arrays are not self documenting:
> 
> char[] foo(); // <- returns an array or a slice of someone else's array?
> 
> /Oskar

We have often mentioned the problems with arrays (both static and 
dynamic) before. They should eventually be brought up for discussion 
with the "general" D public (although I would prefer not soon, as I have 
other things to take care of).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D