RFC on range design for D2
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Mon Sep 8 14:50:54 PDT 2008
Hello,
Walter, Bartosz and myself have been hard at work trying to find the
right abstraction for iteration. That abstraction would replace the
infamous opApply and would allow for external iteration, thus paving the
way to implementing real generic algorithms.
We considered an STL-style container/iterator design. Containers would
use the newfangled value semantics to enforce ownership of their
contents. Iterators would span containers in various ways.
The main problem with that approach was integrating built-in arrays into
the design. STL's iterators are generalized pointers; D's built-in
arrays are, however, not pointers, they are "pairs of pointers" that
cover contiguous ranges in memory. Most people who've used D gained the
intuition that slices are superior to pointers in many ways, such as
easier checking for validity, higher-level compact primitives,
streamlined and safe interface. However, if STL iterators are
generalized pointers, what is the corresponding generalization of D's
slices? Intuitively that generalization should also be superior to
iterators.
In a related development, the Boost C++ library has defined ranges as
pairs of two iterators and implemented a series of wrappers that accept
ranges and forward their iterators to STL functions. The main outcome of
Boost ranges been to decrease the verboseness and perils of naked
iterator manipulation (see
http://www.boost.org/doc/libs/1_36_0/libs/range/doc/intro.html). So a
C++ application using Boost could avail itself of containers, ranges,
and iterators. The Boost notion of range is very close to a
generalization of D's slice.
We have considered that design too, but that raised a nagging question.
In most slice-based D programming, using bare pointers is not necessary.
Could then there be a way to use _only_ ranges and eliminate iterators
altogether? A container/range design would be much simpler than one also
exposing iterators.
All these questions aside, there are several other imperfections in the
STL, many caused by the underlying language. For example STL is
incapable of distinguishing between input/output iterators and forward
iterators. This is because C++ cannot reasonably implement a type with
destructive copy semantics, which is what would be needed to make said
distinction. We wanted the Phobos design to provide appropriate answers
to such questions, too. This would be useful particularly because it
would allow implementation of true and efficient I/O integrated with
iteration. STL has made an attempt at that, but istream_iterator and
ostream_iterator are, with all due respect, a joke that builds on
another joke, the iostreams.
After much thought and discussions among Walter, Bartosz and myself, I
defined a range design and reimplemented all of std.algorithm and much
of std.stdio in terms of ranges alone. This is quite a thorough test
because the algorithms are diverse and stress-test the expressiveness
and efficiency of the range design. Along the way I made the interesting
realization that certain union/difference operations are needed as
primitives for ranges. There are also a few bugs in the compiler and
some needed language enhancements (e.g. returning a reference from a
function); Walter is committed to implement them.
I put together a short document for the range design. I definitely
missed about a million things and have been imprecise about another
million, so feedback would be highly appreciated. See:
http://ssli.ee.washington.edu/~aalexand/d/tmp/std_range.html
Andrei
More information about the Digitalmars-d-announce
mailing list