[phobos] Improving std.range.Zip

David Simcha dsimcha at gmail.com
Tue Aug 17 11:18:05 PDT 2010


Two points:

1.  I generally consider making copying arbitrarily expensive via this(this)
to be a *terrible* practice that we shouldn't bend over backwards to support
efficiently.  Using this(this) to give eager value semantics is a holdover
from C++, where objects must have a clear owner and aliasing must be
controlled due to lack of GC.  In a GC'd language containers should have
reference semantics with explicit duplication.  The proper use of this(this)
is for simple O(1) things like reference counting.  My general feeling is
that range definitions are getting too complex to support a corner case.

2.  What's the difference between moveFront(Widget rhs) and front(Widget
rhs)?  I understand the difference for the case of moveFront() vs. front(),
but not for the assignment case.

Ignoring the complexities of this(this)-related stuff, though, I definitely
like the work on sealed containers, etc.

--Dave

On Tue, Aug 17, 2010 at 1:57 PM, Andrei Alexandrescu <andrei at erdani.com>wrote:

> Apologies for the late answer (the message is from 2009!). This is about
> David Simcha's thoughts on improving std.range.Zip. In spite of its
> lateness, the response is rather timely for the ongoing discussion of
> lockstep().
>
> Zip was indeed a kludge because it required algorithms to be modified to
> recognize it. Now I think we have a general method for expressing types that
> should be assignable and swappable, yet don't expose references.
>
> The problem
> ===========
>
> We want to define read and write methods for e.g. the front of a range. In
> addition we also want to swap a front of a range with another object of the
> same type without incurring unnecessary copies. This is because copying
> objects can be arbitrarily expensive due to this(this).
>
> If all ranges exposed references, that all would be simple because access
> to objects is immediate. Yet some ranges don't want to expose references to
> their internals. I define below such ranges as "sealed ranges". Example of
> non-sealed range:
>
> struct NonSealed {
>    @property bool empty();
>    @property ref Widget front();
>    void popFront();
> }
>
> struct Sealed {
>    @property bool empty();
>    @property Widget front();
>    void popFront();
> }
>
> Solution
> ========
>
> Sealed ranges must define additional members as follows:
>
> struct Sealed {
>    @property bool empty();
>    @property Widget front();
>    @property void front(Widget rhs);
>    @property Widget moveFront(Widget rhs);
>    void popFront();
> }
>
> Similarly, if a range offers back() it should also offer back(Widget) and
> moveBack(). Finally, if a range offers opIndex(), it should also offer
> opIndeAssign(), opIndexOpAssign(), and Widget moveAt(size_t).
>
> These three routines are all that's needed for implementing a host of
> efficient algorithms, starting of course with swap. I've already changed
> std.algorithm.swap and many other functions in std.algorithm to support this
> new concept. (I think most of these changes are not yet committed.) I
> haven't yet updated e.g. Zip.
>
> I have followed a very similar pattern in std.container, which I'll check
> in soon.
>
> Any comments, please chime in.
>
>
> Andrei
>
>
> On 11/13/2009 11:08 AM, David Simcha wrote:
>
>> I was thinking about std.range.Zip and ways to improve it.  It would be
>> really nice if we could give Zip assignable elements because right now,
>> sorting of it works by a bunch of special-case ad-hockery. Furthermore,
>> Andrei has mentioned the need for a stable O(N log N) sort in Phobos.
>> The most obvious (and possibly the only) algorithm is a merge sort, but
>> this can only be done (AFAIK) if you have assignable elements, not just
>> swappable elements. Using some magic of postblits and opAssign, I think
>> we can get assignable elements to work with Zip. First of all, there
>> should be two kinds of Proxy structs: A value one and a reference one.
>> This information would be stored in a flag in each instance. The compile
>> time type would be the same. The values and the pointers would be stored
>> together in a union. It would look something like this:
>>
>> struct Proxy {
>> union {
>> staticMap!(pointers, R) ptrs;
>> R vals;
>> }
>>
>> bool isReference; // Always true if this is stored inside a Zip.
>> this(this) {
>> if(isReference) {
>> // Overwrite pointers with values.
>> isReference = false;
>> }
>> }
>>
>> void opAssign(ref Proxy rhs) {
>> if(isReference) {
>> // Assign the values of rhs.at!(0)...
>> // rhs.at!(rhs.length) to the deref
>> // ptrs.
>> } else {
>> // Assign rhs's values to vals.
>> }
>> }
>>
>> auto at(int index)() {
>> if(isReference) {
>> return *(ptrs[index]);
>> } else {
>> return vals[index];
>> }
>> }
>> }
>>
>> Then, the standard swap algorithm would work. Assuming random access:
>>
>> auto temp = myZip[index1]; // Postblit makes this a value proxy.
>> myZip[index1] = myZip[index2]; // Writes to dereferences in myZip[index1].
>> myZip[index2] = temp; // Writes to dereferences in myZip[index2].
>>
>> Furthermore, you could store Proxy structs in an array as a buffer, etc.
>> and it would (I think) just work. Yes, there would be some performance
>> cost to all this branching and postblitting, but IMHO it's worth it to
>> make Zip work the way that people who don't understand the
>> implementation details would expect it to work.
>>
>> The only problem I see is uninitialized Proxy objects:
>>
>> Proxy proxy = void;
>> proxy = myZip[5];
>>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100817/7d7a95b0/attachment.html>


More information about the phobos mailing list