RFC on range design for D2

Sergey Gromov snake.scaly at gmail.com
Tue Sep 9 03:20:22 PDT 2008


Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org> wrote:
> Sergey Gromov wrote:
> > - what's a collection?  How do you get a range out of there?  Collection 
> > should be a range itself, like an array.  But it shouldn't be destroyed 
> > after passing it to foreach().  How to achieve this if foreach() 
> > essentially uses getNext()?
> 
> These are great questions, I'm glad you asked. The way I see things, D2 
> ranges can be of two kinds: owned and unowned. For example D1's ranges 
> are all unowned:
> 
> [snip]
> 
> A better design would be to define collections that own their contents. 
> For example:
> 
> Array!(int) a(100);
> 
> This time a does own the underlying array. You'd be able to get ranges 
> all over it:
> 
> int[] b = a.all;

I really don't like to have basic language constructs implemented as 
templates.  It's like Tuple!() which is sorta basic type but requires 
template trickery to really work with it.

> So now we have two nice notions: Arrays own the data. Ranges walk over 
> that data. An array can have many ranges crawling over it. But two 
> arrays never overlap. The contents of the array will be destroyed (BUT 
> NOT DELETED) when a goes out of scope.

This invalidates the idea of safe manipulations with data no matter 
where you've got that data from.

> About the "collection should be a range itself" mantra, I've had a 
> micro-epiphany. Since D's slices so nicely model at the same time arrays 
> and their ranges, it is very seductive to think of carrying that to 
> other collection types. But I got disabused of that notion as soon as I 
> wanted to define a less simple data structure. Consider a matrix:
> 
> auto a = BlockMatrix!(float, 3)(100, 200, 300);
> 
> defines a block contiguous matrix of three dimensions with the 
> respective sizes. Now a should be the matrix AND its range at the same 
> time. But what's "the range" of a matrix? Oops. As soon as you start to 
> think of it, so many darn ranges come to mind.

If you cannot think of a natural default range for your collection---
well, it's your decision not to implement range interface for it.  But 
if it does have a natural iteration semantics, it should be possible to 
implement:

auto a = new File("name");
auto b = new TreeSet!(char);

foreach(ch; a)
  b.insert(ch);

foreach(ch; b)
    writeln("unique char ", ch);

Here is the problem.  First foreach() naturally and expectedly changes 
the state of an object a.  Second foreach() naturally and expectedly 
does not make changes to object b.

Solution:

File is an Input range in your notion.  It supports isEmpty() and 
getNext(), it is non-copyable (but, note, referenceable).

TreeSet is a Collection, which you don't discuss.  It implements opSlice
() without arguments, which is required and sufficient to define a 
collection.  opSlice() must return a range that, at least, supports 
input range operations.

foreach() checks if a passed object implements opSlice() so that it can 
iterate non-destructively. If no opSlice() is provided, it falls back to 
getNext().

> a) A matrix owns its stuff and is preoccupied with storage internals, 
> allocation, and the such.
> 
> b) The matrix defines as many range types as it wants.
> 
> c) Users use the ranges.

No problem.  The matrix lives as long as a range refers to it.  As 
expected.

> Inevitably naysayers will, well, naysay: D defined a built-in array, but 
> it also needs Array, so built-in arrays turned out to be useless. So how 
> is that better than C++ which has pointers and vector? Walter has long 
> feared such naysaying and opposed addition of user-defined array types 
> to Phobos. But now I am fully prepared to un-naysay the naysayers: 
> built-in slices are a superior building block to naked pointers. They 
> are in fact embodying a powerful concept, that of a range. With ranges 
> everything there is can be built efficiently and safely. Finally, 
> garbage collection helps by ensuring object ownership while preserving 
> well-definedness of incorrect code.

Slices need to implement random access range interface, that's all.


More information about the Digitalmars-d-announce mailing list