output ranges: by ref or by value?
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Fri Jan 1 13:49:45 PST 2010
Jason House wrote:
> Andrei Alexandrescu Wrote:
>
>> Jason House wrote:
>>> Andrei Alexandrescu wrote:
>>>
>>>> Philippe Sigaud wrote:
>>>>> On Thu, Dec 31, 2009 at 16:47, Michel Fortin
>>>>> <michel.fortin at michelf.com> wrote:
>>>>>
>>>>> On 2009-12-31 09:58:06 -0500, Andrei Alexandrescu
>>>>> <SeeWebsiteForEmail at erdani.org> said:
>>>>>
>>>>> The question of this post is the following: should output
>>>>> ranges be passed by value or by reference? ArrayAppender uses
>>>>> an extra indirection to work properly when passed by value.
>>>>> But if we want to model built-in arrays' operator ~=, we'd
>>>>> need to request that all output ranges be passed by
>>>>> reference.
>>>>>
>>>>>
>>>>> I think modeling built-in arrays is the way to go as it means
>>>>> fewer things to learn. In fact, it makes it easier to learn
>>>>> ranges because you can begin by learning arrays, then
>>>>> transpose this knowledge to ranges which are more abstract
>>>>> and harder to grasp.
>>>>>
>>>>>
>>>>> I agree. And arrays may well be the most used range anyway.
>>>> Upon more thinking, I'm leaning the other way. ~= is a quirk of
>>>> arrays motivated by practical necessity. I don't want to
>>>> propagate that quirk into ranges. The best output range is one
>>>> that works properly when passed by value.
>>> I worry about a growing level of convention with ranges. Another
>>> recent range thread discussed the need to call consume after a
>>> successful call to startsWith. If I violated convention and had
>>> a range class, things would fail miserably. There would be no
>>> need to consume after a successful call to startsWith and the
>>> range would have a random number of elements removed on an
>>> unsuccessful call to startsWith. I'm pretty sure that early
>>> discussions of ranges claimed that they could be either structs
>>> or classes, but in practice that isn't the case.
>> I am implementing right now a change in the range interface
>> mentioned in
>> http://www.informit.com/articles/printerfriendly.aspx?p=1407357,
>> namely: add a function save() that saves the iteration state of a
>> range.
>>
>> With save() in tow, class ranges and struct ranges can be used the
>> same way. True, if someone forgets to say
>>
>> auto copy = r.save();
>>
>> and instead says:
>>
>> auto copy = r;
>>
>> the behavior will indeed be different for class ranges and struct
>> ranges.
>
> Or if they completely forgot that bit of convention and omit creating
> a variable called save... Also, doesn't use of save degrade
> performance for structs? Or does the inliner/optimizer remove the
> copy variable altogether?
It may be best to discuss this using an example:
/**
If $(D startsWith(r1, r2)), consume the corresponding elements off $(D
r1) and return $(D true). Otherwise, leave $(D r1) unchanged and
return $(D false).
*/
bool consume(R1, R2)(ref R1 r1, R2 r2)
    if (isForwardRange!R1 && isInputRange!R2)
{
    auto r = r1.save();
    while (!r2.empty && !r.empty && r.front == r2.front) {
        r.popFront();
        r2.popFront();
    }
    if (r2.empty) {
        r1 = r;
        return true;
    }
    return false;
}
For most structs, save() is very simple:

    auto save() { return this; }

For classes, save() entails creating a new object:

    auto save() { return new typeof(this)(this); }
If the implementor of consume() forgets to call save(), the situation is
unpleasant albeit not catastrophic: for most struct ranges things will
continue to work, but for class ranges the function will fail to perform
to spec. I don't know how to improve on that.
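
To make the failure mode concrete, here is a minimal sketch that runs consume() next to a variant that forgets save(). StructRange, ClassRange and consumeWithoutSave are hypothetical names invented for this illustration, and the imports assume a Phobos that provides std.range.primitives. On a failed match the struct range is left untouched either way, while the class range loses the elements that were compared.

```d
import std.range.primitives; // isForwardRange, isInputRange, array range primitives

// Hypothetical struct range: copying the struct copies the iteration state.
struct StructRange
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    @property StructRange save() { return this; }
}

// Hypothetical class range: copying the reference shares the iteration state.
final class ClassRange
{
    int[] data;
    this(int[] data) { this.data = data; }
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    @property ClassRange save() { return new ClassRange(data); }
}

// consume() as in the example above: iterate over a saved copy.
bool consume(R1, R2)(ref R1 r1, R2 r2)
    if (isForwardRange!R1 && isInputRange!R2)
{
    auto r = r1.save;
    while (!r2.empty && !r.empty && r.front == r2.front) {
        r.popFront();
        r2.popFront();
    }
    if (r2.empty) {
        r1 = r;
        return true;
    }
    return false;
}

// The same function with the call to save() forgotten (hypothetical).
bool consumeWithoutSave(R1, R2)(ref R1 r1, R2 r2)
    if (isForwardRange!R1 && isInputRange!R2)
{
    auto r = r1; // for a class range this aliases r1 instead of copying it
    while (!r2.empty && !r.empty && r.front == r2.front) {
        r.popFront();
        r2.popFront();
    }
    if (r2.empty) {
        r1 = r;
        return true;
    }
    return false;
}

void main()
{
    // [1, 2, 9] is not a prefix of [1, 2, 3, 4], so every call must fail.
    auto s1 = StructRange([1, 2, 3, 4]);
    auto c1 = new ClassRange([1, 2, 3, 4]);
    assert(!consume(s1, [1, 2, 9]) && s1.front == 1);
    assert(!consume(c1, [1, 2, 9]) && c1.front == 1);

    auto s2 = StructRange([1, 2, 3, 4]);
    auto c2 = new ClassRange([1, 2, 3, 4]);
    assert(!consumeWithoutSave(s2, [1, 2, 9]) && s2.front == 1); // struct: still intact
    assert(!consumeWithoutSave(c2, [1, 2, 9]) && c2.front == 3); // class: two elements gone
}
```
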
Anyway, it's not entirely a convention. I'll change isForwardRange to
require the existence of save().
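
As a rough sketch of what such a requirement could look like (isForwardRangeSketch, NoSave and WithSave are hypothetical names, and this is not necessarily how the actual isForwardRange will be written), the check below accepts a type only if it is an input range whose save yields the same type:

```d
import std.range.primitives; // isInputRange, plus range primitives for arrays

// Hypothetical trait: an input range R whose save returns an R.
enum bool isForwardRangeSketch(R) =
    isInputRange!R && is(typeof((R r) { return r.save; }(R.init)) == R);

// An input range that offers no save().
struct NoSave
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// The same range with save() added.
struct WithSave
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    @property WithSave save() { return this; }
}

static assert(isForwardRangeSketch!(int[]));   // arrays get save from std.range.primitives
static assert(!isForwardRangeSketch!NoSave);   // rejected at compile time
static assert(isForwardRangeSketch!WithSave);

void main() {}
```

With a constraint like this, passing a range that lacks save() to consume() is rejected at compile time. A function body that forgets to call save() would still compile, so that part remains a convention.
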
Andrei