RFC: naming for FrontTransversal and Transversal ranges

Robert Jacques sandford at jhu.edu
Thu Apr 30 17:04:43 PDT 2009


On Thu, 30 Apr 2009 15:57:17 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:
> Robert Jacques wrote:
>> On Wed, 29 Apr 2009 20:45:23 -0400, Andrei Alexandrescu  
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>
>>> It has become increasingly clear we'll need to support arrays in
>>> addition to slices.
>>  No, Andrei it hasn't. A detailed paragraph (or more) explaining we you  
>> think so should be included in the full RFC.
>
> There are several converging pieces of evidence. Each can be debated in  
> separation, yet together they make for a rather strong case.
>
> People have complained about the incredibly bad, unsafe, and slow  
> behavior of appending. Even before the considerable recent slowdown of  
> ~=, there were speed concerns. Worse than that, the completely weird  
> behavior of slices that grow to stomp on other slices cannot be defended.

Actually, this is a problem due to an optimization in the current GC.  
Fixing it is rather trivial: simply store the maximum requested array end  
at either the beginning or end of the memory block.

> The idiom:
>
> array.length = n;
> array.length = 0;
> // append to array up to n...
>
> is rather disingenuous and looks odd to the casual reader.

My view is that this is an argument for a reserve function:
array.reserve = n;
// append to array up to n...

> As predicted by the "range theory" which predicates that a range will  
> never grow, only shrink, slices should always shrink. Growing is a  
> foreign operation for a slice. This is a big problem that goes beyond a  
> need for purity or cleanliness - it's just terrible to have a slice  
> stomping around, and then uncontrollably being rebound elsewhere and so  
> on. People pass slices into functions, functions modify them and also  
> append to them, and depending on the phase of the moon, some updates are  
> being seen in the caller.

I agree this is a problem, but the bugs are bugs in the implementation,  
not design. (see above)
But the bigger [ (x ~ y) may or may not create a new array ] resulting in  
either unseen intended updates or seen unintended updates affects both  
array and slice types.  Note that with the bugs fixed, this would only  
occur when concatenating to the logically complete array, which is why I  
say it affects both an array and slice types equally. Value semantics  
might help though.
I understand where you're coming from with "range theory", though I don't  
personally see that slices are impure: I view (x~y) as creating a  
logically new array and returning a view of it.

> So people have defined various types (such as ArrayCreator or  
> ArrayAppender or whatnot) that take care of that aspect. But really what  
> we're looking at is an Array type.

An Array type doesn't address the needs of ArrayCreator/ArrayAppender. The  
reason ArrayAppender exists is to optimize finding the array capacity.  
While this can be added to an array type, it can just as easily be added  
to a slice type. The main reason this enhancement got shot down long ago,  
was that the D community thought making arrays 3 words long, instead of 2,  
would result in a net performance loss. Personally, I'd love to see some  
real numbers proving or dis-proving this.

> Another issue is that we can get (partly) away with disowned slices  
> because of the garbage collector: whenever a slice is to be rebound,  
> whatever chunk it was bound to is left behind and will be duly  
> collected. If, however, we want to support other programming disciplines  
> that exercise stricter control over memory, the "parent" arrays must  
> become an explicit type that code can keep around and manipulate  
> directly.

I'm a little sad to hear you say this, as both Daniel Keep and I posted  
sample pseudo code on how to do slice types properly in both reference  
counted and manual managed memory systems. To reiterate:

struct T[]{
     T* ptr;
     size_t length;
     MemInfo* memInfo;
}

struct MemInfo {
     size_t refCount; //not included if not reference counting
     size_t capacity;
}

With meminfo being located at the start of the array's memory block. (i.e.  
com style)

> A design that has one container type and several slice types that crawl  
> it in various ways is very compelling, and I think it will be a huge  
> selling point for D. It would be odd, not natural, to have arrays as a  
> singularly distinct container that is in fact its own range. We get away  
> with container == range for arrays partly because arrays are a simple  
> structure, but that only blurs thinking and understanding of more  
> complex containers. In fact, ranges do predict that any attempt to grow  
> a range will cause topological issues in the owning container; that's  
> why SList wisely (IMHO :o)) includes a template parameter that tells  
> whether its topology is fixed or flexible.

I'd disagree with regard to a selling point/compelling. My lab maintains  
an open source C++ array/slice type pair as part of a larger robotics  
package. It's laid outed with an ArrayBase containing all the  
implementation. The Array type simply extends ArrayBase and adds memory  
management. And Slice extends Arraybase, only adding binding capabilities.  
And save for memory allocation, they are identical, i.e. the container and  
range are interchangeable.

> (Last but not least, .NET has arrays and slices. D's slices that  
> actually have an array testis mesh real bad with .NET because in .NET  
> you can't have a disowned slice.)

I've been following the blog. I was surprised that Cristian didn't just  
make all D arrays .NET ArraySegments and be done with it, but there's  
probably issues I don't know about. I'd like to hear more on what exactly  
the issue was.



More information about the Digitalmars-d mailing list