RFC on range design for D2
Denis Koroskin
2korden at gmail.com
Mon Sep 8 16:16:18 PDT 2008
On Tue, 09 Sep 2008 01:50:54 +0400, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> wrote:
> Hello,
>
>
> Walter, Bartosz and myself have been hard at work trying to find the
> right abstraction for iteration. That abstraction would replace the
> infamous opApply and would allow for external iteration, thus paving the
> way to implementing real generic algorithms.
>
> We considered an STL-style container/iterator design. Containers would
> use the newfangled value semantics to enforce ownership of their
> contents. Iterators would span containers in various ways.
>
> The main problem with that approach was integrating built-in arrays into
> the design. STL's iterators are generalized pointers; D's built-in
> arrays are, however, not pointers, they are "pairs of pointers" that
> cover contiguous ranges in memory. Most people who've used D gained the
> intuition that slices are superior to pointers in many ways, such as
> easier checking for validity, higher-level compact primitives,
> streamlined and safe interface. However, if STL iterators are
> generalized pointers, what is the corresponding generalization of D's
> slices? Intuitively that generalization should also be superior to
> iterators.
>
> In a related development, the Boost C++ library has defined ranges as
> pairs of two iterators and implemented a series of wrappers that accept
> ranges and forward their iterators to STL functions. The main outcome of
> Boost ranges been to decrease the verboseness and perils of naked
> iterator manipulation (see
> http://www.boost.org/doc/libs/1_36_0/libs/range/doc/intro.html). So a
> C++ application using Boost could avail itself of containers, ranges,
> and iterators. The Boost notion of range is very close to a
> generalization of D's slice.
>
> We have considered that design too, but that raised a nagging question.
> In most slice-based D programming, using bare pointers is not necessary.
> Could then there be a way to use _only_ ranges and eliminate iterators
> altogether? A container/range design would be much simpler than one also
> exposing iterators.
>
> All these questions aside, there are several other imperfections in the
> STL, many caused by the underlying language. For example STL is
> incapable of distinguishing between input/output iterators and forward
> iterators. This is because C++ cannot reasonably implement a type with
> destructive copy semantics, which is what would be needed to make said
> distinction. We wanted the Phobos design to provide appropriate answers
> to such questions, too. This would be useful particularly because it
> would allow implementation of true and efficient I/O integrated with
> iteration. STL has made an attempt at that, but istream_iterator and
> ostream_iterator are, with all due respect, a joke that builds on
> another joke, the iostreams.
>
> After much thought and discussions among Walter, Bartosz and myself, I
> defined a range design and reimplemented all of std.algorithm and much
> of std.stdio in terms of ranges alone. This is quite a thorough test
> because the algorithms are diverse and stress-test the expressiveness
> and efficiency of the range design. Along the way I made the interesting
> realization that certain union/difference operations are needed as
> primitives for ranges. There are also a few bugs in the compiler and
> some needed language enhancements (e.g. returning a reference from a
> function); Walter is committed to implement them.
>
> I put together a short document for the range design. I definitely
> missed about a million things and have been imprecise about another
> million, so feedback would be highly appreciated. See:
>
> http://ssli.ee.washington.edu/~aalexand/d/tmp/std_range.html
>
>
> Andrei
1) There is a typo:
// Copies a range to another
void copy(R1, R2)(R1 src, R2 tgt)
{
while (!src.isEmpty)
{
tgt.putNext(r.getNext); // should be tgt.putNext(src.getNext);
}
}
2) R.next and R.pop could have better names. I mean, they represent
similar operations yet names are so different.
3) Walter mentioned that built-in array could be re-implemented using a
pair of pointers instead of ptr+length. Will it ever get a green light? It
fits range concept much better.
4) We need some way of supporting dollar notation in user containers. The
hack of using __dollar is bad (although it works).
5) I don't quite like names left and right! :) I think they should
represent limits (pointers to begin and end, in case of array) rather that
values. In this case, built-in arrays could be implemented as follows:
struct Array(T)
{
T* left;
T* right;
size_t length() { return right-left; }
ref T opIndex(size_t index) { return left[index]; }
// etc
}
The rationale behind having access to range limits is to allow operations
on them. For example,
R.left-=n;
could be used instead of
foreach(i; 0..n) {
R.pop();
}
which is more efficient in many cases.
Other that that - great, I like it.
More information about the Digitalmars-d-announce
mailing list