[phobos] Improving std.range.Zip

Andrei Alexandrescu andrei at erdani.com
Tue Aug 17 10:57:06 PDT 2010


Apologies for the late answer (the message is from 2009!). This is about 
David Simcha's thoughts on improving std.range.Zip. In spite of its 
lateness, the response is rather timely for the ongoing discussion of 
lockstep().

Zip was indeed a kludge because it required algorithms to be modified to 
recognize it. Now I think we have a general method for expressing types 
that should be assignable and swappable, yet don't expose references.

The problem
===========

We want to define read and write methods for e.g. the front of a range. 
In addition we also want to swap a front of a range with another object 
of the same type without incurring unnecessary copies. This is because 
copying objects can be arbitrarily expensive due to this(this).

If all ranges exposed references, that all would be simple because 
access to objects is immediate. Yet some ranges don't want to expose 
references to their internals. I define below such ranges as "sealed 
ranges". Example of non-sealed range:

struct NonSealed {
     @property bool empty();
     @property ref Widget front();
     void popFront();
}

struct Sealed {
     @property bool empty();
     @property Widget front();
     void popFront();
}

Solution
========

Sealed ranges must define additional members as follows:

struct Sealed {
     @property bool empty();
     @property Widget front();
     @property void front(Widget rhs);
     @property Widget moveFront(Widget rhs);
     void popFront();
}

Similarly, if a range offers back() it should also offer back(Widget) 
and moveBack(). Finally, if a range offers opIndex(), it should also 
offer opIndeAssign(), opIndexOpAssign(), and Widget moveAt(size_t).

These three routines are all that's needed for implementing a host of 
efficient algorithms, starting of course with swap. I've already changed 
std.algorithm.swap and many other functions in std.algorithm to support 
this new concept. (I think most of these changes are not yet committed.) 
I haven't yet updated e.g. Zip.

I have followed a very similar pattern in std.container, which I'll 
check in soon.

Any comments, please chime in.


Andrei

On 11/13/2009 11:08 AM, David Simcha wrote:
> I was thinking about std.range.Zip and ways to improve it.  It would be
> really nice if we could give Zip assignable elements because right now,
> sorting of it works by a bunch of special-case ad-hockery. Furthermore,
> Andrei has mentioned the need for a stable O(N log N) sort in Phobos.
> The most obvious (and possibly the only) algorithm is a merge sort, but
> this can only be done (AFAIK) if you have assignable elements, not just
> swappable elements. Using some magic of postblits and opAssign, I think
> we can get assignable elements to work with Zip. First of all, there
> should be two kinds of Proxy structs: A value one and a reference one.
> This information would be stored in a flag in each instance. The compile
> time type would be the same. The values and the pointers would be stored
> together in a union. It would look something like this:
>
> struct Proxy {
> union {
> staticMap!(pointers, R) ptrs;
> R vals;
> }
>
> bool isReference; // Always true if this is stored inside a Zip.
> this(this) {
> if(isReference) {
> // Overwrite pointers with values.
> isReference = false;
> }
> }
>
> void opAssign(ref Proxy rhs) {
> if(isReference) {
> // Assign the values of rhs.at!(0)...
> // rhs.at!(rhs.length) to the deref
> // ptrs.
> } else {
> // Assign rhs's values to vals.
> }
> }
>
> auto at(int index)() {
> if(isReference) {
> return *(ptrs[index]);
> } else {
> return vals[index];
> }
> }
> }
>
> Then, the standard swap algorithm would work. Assuming random access:
>
> auto temp = myZip[index1]; // Postblit makes this a value proxy.
> myZip[index1] = myZip[index2]; // Writes to dereferences in myZip[index1].
> myZip[index2] = temp; // Writes to dereferences in myZip[index2].
>
> Furthermore, you could store Proxy structs in an array as a buffer, etc.
> and it would (I think) just work. Yes, there would be some performance
> cost to all this branching and postblitting, but IMHO it's worth it to
> make Zip work the way that people who don't understand the
> implementation details would expect it to work.
>
> The only problem I see is uninitialized Proxy objects:
>
> Proxy proxy = void;
> proxy = myZip[5];


More information about the phobos mailing list