RFC on range design for D2
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Tue Sep 9 03:50:06 PDT 2008
Sergey Gromov wrote:
> Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org> wrote:
>> Sergey Gromov wrote:
>>> - what's a collection? How do you get a range out of there? Collection
>>> should be a range itself, like an array. But it shouldn't be destroyed
>>> after passing it to foreach(). How to achieve this if foreach()
>>> essentially uses getNext()?
>> These are great questions, I'm glad you asked. The way I see things, D2
>> ranges can be of two kinds: owned and unowned. For example D1's ranges
>> are all unowned:
>>
>> [snip]
>>
>> A better design would be to define collections that own their contents.
>> For example:
>>
>> Array!(int) a(100);
>>
>> This time a does own the underlying array. You'd be able to get ranges
>> all over it:
>>
>> int[] b = a.all;
>
> I really don't like to have basic language constructs implemented as
> templates. It's like Tuple!() which is sorta basic type but requires
> template trickery to really work with it.
Well I guess we disagree on a number of issues here. The problem with
"sorta basic" types is that the list could go on forever. I'd rather use
a language that allows creation of good types from a small core, instead
of one that tries to supplant all sorta basic types it could think of.
>> So now we have two nice notions: Arrays own the data. Ranges walk over
>> that data. An array can have many ranges crawling over it. But two
>> arrays never overlap. The contents of the array will be destroyed (BUT
>> NOT DELETED) when a goes out of scope.
>
> This invalidates the idea of safe manipulations with data no matter
> where you've got that data from.
Manipulation remains typesafe. The problem is that sometimes we want to
ensure timely termination of certain resources.
>> About the "collection should be a range itself" mantra, I've had a
>> micro-epiphany. Since D's slices so nicely model at the same time arrays
>> and their ranges, it is very seductive to think of carrying that to
>> other collection types. But I got disabused of that notion as soon as I
>> wanted to define a less simple data structure. Consider a matrix:
>>
>> auto a = BlockMatrix!(float, 3)(100, 200, 300);
>>
>> defines a block contiguous matrix of three dimensions with the
>> respective sizes. Now a should be the matrix AND its range at the same
>> time. But what's "the range" of a matrix? Oops. As soon as you start to
>> think of it, so many darn ranges come to mind.
>
> If you cannot think of a natural default range for your collection---
> well, it's your decision not to implement range interface for it. But
> if it does have a natural iteration semantics, it should be possible to
> implement:
It's not that I can't think of one. It's that I think of too many.
> auto a = new File("name");
> auto b = new TreeSet!(char);
>
> foreach(ch; a)
> b.insert(ch);
>
> foreach(ch; b)
> writeln("unique char ", ch);
>
> Here is the problem. First foreach() naturally and expectedly changes
> the state of an object a. Second foreach() naturally and expectedly
> does not make changes to object b.
>
> Solution:
>
> File is an Input range in your notion. It supports isEmpty() and
> getNext(), it is non-copyable (but, note, referenceable).
You left a crucial detail out. What does getNext() return?
In the new std.stdio design, a File is preoccupied with opening,
closing, and transferring data for the underlying file. On top of it
several input ranges can be constructed - that read lines, blocks, parse
text, format text, and so on. (One thing I want is to allow
std.algorithm to work with I/O easily and naturally.)
I fully understand there can be so many design choices in handling all
this stuff, it's not even funny. I can't get excited about an equivalent
solution that to me has no obvious advantages. I can even less get
excited about a solution that I have objections with. That doesn't mean
someone else can't get excited over it, and probably rightly so.
> TreeSet is a Collection, which you don't discuss. It implements opSlice
> () without arguments, which is required and sufficient to define a
> collection. opSlice() must return a range that, at least, supports
> input range operations.
>
> foreach() checks if a passed object implements opSlice() so that it can
> iterate non-destructively. If no opSlice() is provided, it falls back to
> getNext().
>
>> a) A matrix owns its stuff and is preoccupied with storage internals,
>> allocation, and the such.
>>
>> b) The matrix defines as many range types as it wants.
>>
>> c) Users use the ranges.
>
> No problem. The matrix lives as long as a range refers to it. As
> expected.
I, too, wish reference counting is a solution to everything.
Andrei
More information about the Digitalmars-d-announce
mailing list