Range of n lines from stdin

Fri Dec 27 12:30:51 PST 2013

On Friday, 27 December 2013 at 18:32:29 UTC, Jakob Ovrum wrote:
>> (1) I can do
>> n.iota.map!(_ => readln)
>> to get the next n lines from stdin.
>
> This has several issues:
>
>  * The result claims to have all kinds of range capabilities 
> that don't make sense at all. Attempting to actually use these 
> capabilities, likely indirectly through range algorithms, can 
> cause all kinds of havoc.

Hmm?  In my experience, attempting to use a range in the wrong 
way usually results in a compilation error.  For example, I can't 
do
n.iota.map!(_ => readln).sort()
since MapResult isn't a random access range with swappable 
elements.  I can instead do
n.iota.map!(_ => readln).array().sort()
and it allocates an array and works as expected.  So, how do I 
misuse that range?
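
A minimal sketch of both sides of this exchange, with a counting lambda standing in for readln so that it runs without any input (the names counter and r are made up for illustration). It checks that sort is rejected at compile time, but also that the mapped range is still accepted as random access and re-runs the lambda on every call to front:

```d
import std.algorithm : map, sort;
import std.array : array;
import std.range : iota, isRandomAccessRange;
import std.stdio : writeln;

void main()
{
    int counter = 0;
    // Hypothetical stand-in for readln: a lambda with a visible side effect.
    auto r = 3.iota.map!(_ => ++counter);

    // Rejected at compile time: the elements are neither swappable nor assignable.
    static assert(!__traits(compiles, r.sort()));

    // Still accepted as a random access range, because iota is one.
    static assert(isRandomAccessRange!(typeof(r)));

    // front is not cached, so reading it twice runs the lambda twice.
    auto first = r.front;
    auto second = r.front;
    assert(first != second);

    // Copying into an array first makes sorting possible.
    auto sorted = 3.iota.map!(_ => ++counter).array.sort();
    writeln(sorted);
}
```

So the compiler catches the sort case, but an algorithm that reads front more than once, or indexes the range, would silently trigger the side effect again.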

>  * It will allocate a new buffer for the read line every time 
> `front` is called, which is less granular than `byLine`'s 
> allocation behaviour.
>
>  * If `stdin` (or whatever file) only has `i` number of lines 
> left in it where `i < n`, the range will erroneously report `n 
> - i` number of empty lines at the end.
>
>  * It's not showing intent as clearly as it should be.

Thank you for pointing these out!  So it's not performant, not 
correct and not idiomatic.  I had understood only part of that 
when I asked for a better alternative, so these are more 
arguments for the same point.  And yes, stdin.byLine serves 
rather well in this particular case.
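
For reference, a sketch of what the byLine variant might look like. The helper name firstLines is hypothetical, and copying each line with idup is an assumption about what the caller wants (byLine reuses its buffer between lines):

```d
import std.algorithm : map;
import std.array : array;
import std.range : take;
import std.stdio : File, stdin, writeln;

/// Hypothetical helper: returns at most n lines read from f.
string[] firstLines(File f, size_t n)
{
    // byLine reuses its internal buffer, so each line is copied with idup
    // before it is stored in the result.
    return f.byLine.take(n).map!(line => line.idup).array;
}

void main()
{
    auto lines = stdin.firstLines(3);
    writeln(lines.length, " line(s) read");
}
```

Unlike (1), this yields fewer than n elements when the input runs out, instead of padding the result with empty lines.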

>> So, what I ask for is some non-caching repeat for functions 
>> with side effects.  More idiomatic than (1).  Is there 
>> something like that in Phobos?
>
> It's hard to generalize. For one, what is the empty condition?

Hmm.  For example, that could be a RNG emitting (a range of) 
random numbers, then "empty" is always false.  But we still want 
a new random number each time.  Something like
n.iota.map!(_ => uniform(0, 10))
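
A sketch comparing that with std.range.generate, assuming a Phobos release that ships it (it was added after this thread). generate calls its function once on construction and once per popFront, and the resulting range is never empty, so take supplies the length. The function names viaMap and viaGenerate are made up for illustration:

```d
import std.algorithm : all, map;
import std.array : array;
import std.random : uniform;
import std.range : generate, iota, take;
import std.stdio : writeln;

/// The workaround from the post: map over a throwaway index.
int[] viaMap(size_t n)
{
    return n.iota.map!(_ => uniform(0, 10)).array;
}

/// The same thing with generate: an infinite input range cut off by take.
int[] viaGenerate(size_t n)
{
    return generate!(() => uniform(0, 10)).take(n).array;
}

void main()
{
    auto a = viaMap(5);
    auto b = viaGenerate(5);
    // uniform(0, 10) yields values in [0, 10).
    assert(a.all!(x => 0 <= x && x < 10));
    assert(b.all!(x => 0 <= x && x < 10));
    writeln(a, " ", b);
}
```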

>> Is it an OK style to have an impure function in an UFCS chain?
>
> I assume by UFCS chain you mean range compositions in 
> particular.
>
> It's not really about purity; impure links in the chain are 
> fine (e.g. `byLine`). The issue is when the side effects are 
> the only result - I think that is very bad style, and should 
> either be rewritten in terms of return values, or rewritten to 
> use an imperative style.

So, something like
n.iota.map!(_ => readln).writeln;
is bad style but
writeln(n.iota.map!(_ => readln));
better shows what the main action is?  Makes sense to me.

>> If repeat could know whether its first argument is pure, it 
>> could then enable or disable front caching depending on 
>> purity... no way currently?
>
> `readln.repeat(n)` can also be written `repeat(readln(), n)`. 
> Maybe that makes it more obvious what it does - reads one line 
> from standard input and passes that to `repeat`, which returns 
> a range that returns that same line `n` times.

The confusion for me is this: does "repeat" mean "eagerly get a 
value once and then lazily repeat it n times" or "do what the 
first argument suggests (emit constant, call function, etc.) n 
times"?  I guess it depends on the defaults of the language.  
I had no strong preference for one definition over the other 
when I first saw the name.  Maybe I would indeed prefer the first 
definition if I knew D better; I don't know.

In the first definition, the "eagerly vs. lazily" contradiction 
in my mind is what scares me off from making it the default: if 
"repeat" is a lazy range by itself, why would it treat its 
argument eagerly?  What if the argument is a lazy range itself, 
having a new value each time repeat asks for it?

The first definition makes much more sense to me when I treat it 
this way: "repeat expects its first argument to be pure (not able 
to change between calls)".
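
A small sketch of the two readings, with a counting function in place of readln so the difference is visible without input (calls and next are hypothetical names):

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota, repeat;
import std.stdio : writeln;

int calls;

/// Hypothetical impure function: returns 1, 2, 3, ... on successive calls.
int next()
{
    return ++calls;
}

void main()
{
    // First reading: the argument is evaluated once, before repeat sees it,
    // and that single value is then repeated lazily.
    auto once = next().repeat(3).array;
    assert(once == [1, 1, 1]);

    // Second reading, spelled out with map: the function runs per element.
    auto each = 3.iota.map!(_ => next()).array;
    assert(each == [2, 3, 4]);

    writeln(once, " ", each);
}
```

Seen this way, repeat never receives the function at all: next() is an ordinary argument expression, evaluated before the call like any other.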

Perhaps there's a wholly different way of thinking about this in 
which the first definition makes much more sense than the second 
one from the start.  If so, please share it.

Ivan Kazmenko.