Transience of .front in input vs. forward ranges

Wed Nov 7 12:31:13 PST 2012

Le 07/11/2012 19:24, H. S. Teoh a écrit :
> On Tue, Nov 06, 2012 at 10:03:56PM +0100, deadalnix wrote:
>> Le 06/11/2012 07:49, Andrei Alexandrescu a écrit :
>>> On 11/6/12 4:36 AM, H. S. Teoh wrote:
>>>> Hmm. Another idea just occurred to me. The basic problem here is
>>>> that we are conflating two kinds of values, transient and
>>>> persistent, under a single name .front. What if we explicitly name
>>>> them? Say, .copyFront for the non-transient value and .refFront for
>>>> the transient value (the names are unimportant right now, let's
>>>> consider the semantics of it).
>>>
>>> We could transfer that matter to the type of .front itself, i.e.
>>> define a function copy(x) that returns e.g. x for for string and
>>> x.dup for char[] etc. There would be problems on e.g. defining copy
>>> for structs with pointer and class reference fields etc.
>>>
>>> One quite simple approach would be to define (on the contrary)
>>> .peekFront, which means "yeah, I'd like to take a peek at the front
>>> but I don't plan to store it anywhere". That would entail we define
>>> eachLine etc. to return string from .front and char[] from
>>> .peekFront, and deprecate byLine.
> [...]
>> Is it possible to have the pro and cons of peekFront vs transient ?
>
> I'll give it a shot, since nobody else is replying:
>
> Both .peekFront and .transient require at least existing transient
> ranges to be modified, as well as algorithms that can take advantage of
> them.
>
> - For .peekFront, existing transient ranges' .front is just renamed to
>    .peekFront, and we add a new .front which returns the .dup (or
>    otherwise copy) of .peekFront.  For .transient, we need to write a
>    .transient method that returns a wrapper struct whose .front is
>    transient.
>     * So .peekFront is slightly simpler in this case.
>

Agreed !

> - For .peekFront, algorithms that can handle transience can simply have
>    all occurrences of .front replaced with .peekFront. Whereas for
>    .transient, they will need to modified to explicitly call .transient,
>    save that range, and use that instead of the passed-in range.
>     * So .peekFront is slightly simpler in this case.
>

.transient seems simpler to me in this case. Adding one line is easier 
than replacing all occurrences of something.

> - For .peekFront, algorithms that *can't* handle transience will have to
>    be rewritten. Same goes for .transient. The redeeming factor in both
>    cases is that algorithms that aren't rewritten yet will continue to
>    work as before, just a bit slower. They will also begin to work
>    correctly with transient ranges even _before_ being rewritten, whereas
>    right now they're broken.
>     * Here, .peekFront and .transient are equal in terms of effort
>       required.
>

Indeed.

> - For .peekFront, the .front of *any* range is guaranteed to be
>    non-transient, so we never have to worry about whether a particular
>    range's .front is transient or not. User code has no possibility of
>    passing a transient range into an algorithm that cannot handle it.
>    For .transient, the .front of ranges is non-transient by default, but
>    once you call .transient on it, you get a range whose .front is
>    transient. There is a small possibility of wrong user code that passes
>    a .transient range to an algorithm that can't handle it.
>     * So .peekFront is slightly safer on this point.
>
> These are the points that I can think of, off the top of my head. Are
> there any others?
>
> Either way, both .peekFront and .transient requires rewriting currently
> transient ranges (which AFAIK is only byLine) and any algorithms that
> should be able to handle transience (there are a few I can think of,
> like std.algorithm.joiner, std.array.join, and maybe NWayUnion and
> writeln&  co.). We can't do better than this if we're going to support
> transience at all.
>
> I note, though, that the list of algorithms that need to be updated is
> shorter (perhaps significantly so) than the list of currently-broken
> algorithms that I posted earlier, because many of the algorithms in that
> list *cannot* be implemented to handle transience. E.g. there is no way
> to implement uniq on a transient input range since there's no way to
> save the previous value seen, so there's no way to compare two adjacent
> values. Likewise, findAdjacent cannot be implemented for transient
> ranges for the same reason. Once these are eliminated from the list,
> there should only be a few algorithms left. (And I already have a
> version of joiner that works correctly with transient ranges.)
>

OK, overall, .peekFront seems like a good idea. Something I'm afraid 
with .peekFront is code duplication.

Let's take the joiner example. Joiner's transientness depend on its 
source transientness. This seems to me like a very common case for 
transformer ranges. If we choose the .peekFront, the naive thing to do 
is to implement the same algorithm twice, using front and peekFront, 
which is code duplication, and usually a bad idea.

Is a solution known here ? Could alias template parameter help ? I'm not 
sure what is the solution, but if one exist, I'm sold for .peekFront ( 
.transientFront seems like a better name as peek have other meaning in 
other languages ).