byChunk odd behavior?

cy via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Tue Mar 22 13:53:30 PDT 2016


On Tuesday, 22 March 2016 at 07:17:41 UTC, Hanh wrote:
> 	input.take(3).array;
> 	foreach (char c; input) {

Never use an input range twice. So, here's how to use it twice:

If it's a "forward range", you can use save() to get a copy to use 
later (but none of the std.stdio.* ranges implement that). You 
can also use "std.range.tee" to send the results to an "output 
range" (something implementing put(K)(K)) while iterating over 
them.
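
To make both options concrete, here's a rough sketch. iota just stands in for any forward range, and Collector and doubleAndRecord are names I made up for illustration; they aren't anything from Phobos.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.array : array;
import std.range : iota, take, tee;
import std.range.primitives : isForwardRange, isOutputRange;

// Hypothetical output range: anything with a suitable put() qualifies.
struct Collector
{
    int[]* sink;
    void put(int value) { *sink ~= value; }
}

// Doubles every element while tee copies the originals into `seen`.
int[] doubleAndRecord(R)(R source, ref int[] seen)
{
    return source.tee(Collector(&seen)).map!(n => n * 2).array;
}

void main()
{
    auto numbers = iota(1, 6);
    static assert(isForwardRange!(typeof(numbers)));
    static assert(isOutputRange!(Collector, int));

    // save() hands back an independent copy, so consuming it
    // leaves `numbers` where it was.
    auto firstThree = numbers.save.take(3).array;
    assert(firstThree == [1, 2, 3]);
    assert(numbers.equal([1, 2, 3, 4, 5]));

    // tee passes each element to the output range as the main
    // pipeline pulls it through.
    int[] seen;
    assert(doubleAndRecord(numbers, seen) == [2, 4, 6, 8, 10]);
    assert(seen == [1, 2, 3, 4, 5]);
}
```

Note that only one side of the tee is lazy: the output range gets filled as a side effect of iterating the other one.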

tee can't produce two input ranges, because without caching all 
iterated items in memory, only one range can request items 
on-demand; the other must take them passively.

You could write a thing that takes an InputRange and produces a 
ForwardRange, by caching those items in memory, but at that point 
you might as well use .array and get the whole thing.
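
A quick sketch of the ".array and get the whole thing" route. Countdown is a made-up single-pass range standing in for something like a file; it's a class on purpose, so that copies share state the way a file handle does.

```d
import std.array : array;
import std.range : take;
import std.range.primitives : isForwardRange, isInputRange;

// Hypothetical single-pass source: it has no save(), so it is an
// input range but not a forward range.
final class Countdown
{
    private int current;
    this(int start) { current = start; }
    @property bool empty() const { return current <= 0; }
    @property int front() const { return current; }
    void popFront() { --current; }
}

void main()
{
    static assert(isInputRange!Countdown && !isForwardRange!Countdown);

    // Consume the input range exactly once and cache the items.
    int[] cached = new Countdown(5).array;
    assert(cached == [5, 4, 3, 2, 1]);

    // The array can be walked as often as needed.
    auto firstThree = cached.take(3).array;
    assert(firstThree == [5, 4, 3]);

    int total;
    foreach (n; cached)
        total += n;
    assert(total == 15);
}
```

The cost is the obvious one: everything sits in memory, which may not be what you want for a big file.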

ByChunk is an input range (not a forward range), so there's 
undefined behavior when you use it twice. No bugs there, since it 
wasn't meant to be reused anyway. What it does is cache the last 
seen chunk, first iterate over that, then read more chunks from 
the file. So every time you iterate, you'll get that same last 
chunk.

It's also tricky to use input ranges after mutating their 
underlying data structure. If you seek in the file, for instance, 
then a previously created ByChunk will first produce the chunk it 
has cached, and only then start reading chunks from the new 
position in the file. Or take a range over some sort of list: if 
you delete the current item in the list, should the range produce 
the previous item? The next item? null?

So, as a general rule, never use input ranges twice, and never 
use them after mutating the underlying data structure. Just 
recreate them if you want to do something twice, or use tee as 
mentioned above.
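
For the "just recreate them" case with a file, something like the following should do. It's only a sketch: readFromStart and the temp file name are made up for the example, and it assumes the file is seekable (so not stdin).

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File;

// Hypothetical helper: rewinds first, then builds a *fresh* byChunk
// range, so no stale cached chunk from an earlier range is involved.
ubyte[] readFromStart(ref File file, size_t chunkSize)
{
    file.rewind();
    ubyte[] result;
    foreach (chunk; file.byChunk(chunkSize))
        result ~= chunk; // appending copies; byChunk reuses its buffer
    return result;
}

void main()
{
    auto path = buildPath(tempDir, "bychunk_demo.txt");
    write(path, "abcdefgh");
    scope (exit) remove(path);

    auto file = File(path, "rb");

    // First pass: look at the first chunk only, then drop the range.
    auto chunks = file.byChunk(4);
    auto head = chunks.front.dup;
    assert(head == cast(ubyte[]) "abcd");

    // Second pass: don't touch `chunks` again, start over instead.
    assert(readFromStart(file, 4) == cast(ubyte[]) "abcdefgh");
}
```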

