Transience of .front in input vs. forward ranges

Wed Nov 7 10:24:50 PST 2012

On Tue, Nov 06, 2012 at 10:03:56PM +0100, deadalnix wrote:
> Le 06/11/2012 07:49, Andrei Alexandrescu a écrit :
> >On 11/6/12 4:36 AM, H. S. Teoh wrote:
> >>Hmm. Another idea just occurred to me. The basic problem here is
> >>that we are conflating two kinds of values, transient and
> >>persistent, under a single name .front. What if we explicitly name
> >>them? Say, .copyFront for the non-transient value and .refFront for
> >>the transient value (the names are unimportant right now, let's
> >>consider the semantics of it).
> >
> >We could transfer that matter to the type of .front itself, i.e.
> >define a function copy(x) that returns e.g. x for for string and
> >x.dup for char[] etc. There would be problems on e.g. defining copy
> >for structs with pointer and class reference fields etc.
> >
> >One quite simple approach would be to define (on the contrary)
> >.peekFront, which means "yeah, I'd like to take a peek at the front
> >but I don't plan to store it anywhere". That would entail we define
> >eachLine etc. to return string from .front and char[] from
> >.peekFront, and deprecate byLine.
[...]
> Is it possible to have the pro and cons of peekFront vs transient ?

I'll give it a shot, since nobody else is replying:

Both .peekFront and .transient require at least existing transient
ranges to be modified, as well as algorithms that can take advantage of
them.

- For .peekFront, existing transient ranges' .front is just renamed to
  .peekFront, and we add a new .front which returns the .dup (or
  otherwise copy) of .peekFront.  For .transient, we need to write a
  .transient method that returns a wrapper struct whose .front is
  transient.
   * So .peekFront is slightly simpler in this case.

- For .peekFront, algorithms that can handle transience can simply have
  all occurrences of .front replaced with .peekFront. Whereas for
  .transient, they will need to modified to explicitly call .transient,
  save that range, and use that instead of the passed-in range.
   * So .peekFront is slightly simpler in this case.

- For .peekFront, algorithms that *can't* handle transience will have to
  be rewritten. Same goes for .transient. The redeeming factor in both
  cases is that algorithms that aren't rewritten yet will continue to
  work as before, just a bit slower. They will also begin to work
  correctly with transient ranges even _before_ being rewritten, whereas
  right now they're broken.
   * Here, .peekFront and .transient are equal in terms of effort
     required.

- For .peekFront, the .front of *any* range is guaranteed to be
  non-transient, so we never have to worry about whether a particular
  range's .front is transient or not. User code has no possibility of
  passing a transient range into an algorithm that cannot handle it.
  For .transient, the .front of ranges is non-transient by default, but
  once you call .transient on it, you get a range whose .front is
  transient. There is a small possibility of wrong user code that passes
  a .transient range to an algorithm that can't handle it.
   * So .peekFront is slightly safer on this point.

These are the points that I can think of, off the top of my head. Are
there any others?

Either way, both .peekFront and .transient requires rewriting currently
transient ranges (which AFAIK is only byLine) and any algorithms that
should be able to handle transience (there are a few I can think of,
like std.algorithm.joiner, std.array.join, and maybe NWayUnion and
writeln & co.). We can't do better than this if we're going to support
transience at all.

I note, though, that the list of algorithms that need to be updated is
shorter (perhaps significantly so) than the list of currently-broken
algorithms that I posted earlier, because many of the algorithms in that
list *cannot* be implemented to handle transience. E.g. there is no way
to implement uniq on a transient input range since there's no way to
save the previous value seen, so there's no way to compare two adjacent
values. Likewise, findAdjacent cannot be implemented for transient
ranges for the same reason. Once these are eliminated from the list,
there should only be a few algorithms left. (And I already have a
version of joiner that works correctly with transient ranges.)

T

-- 
Windows: the ultimate triumph of marketing over technology. -- Adrian von Bidder