Tricky semantics of ranges & potentially numerous Phobos bugs

Mon Oct 15 22:09:05 PDT 2012

In an effort to locate the suspected Phobos bug that I ran into in my
new Cartesian product implementation, I looked over the code for
std.algorithm joiner more carefully, and noticed the following:

Given a range of ranges ror, it assigns ror.front to a struct member and
then calls ror.popFront() immediately. Then as the user calls the joined
range's popFront, it successively pops its local copy of ror.front until
it's empty, whereupon it assigns the new ror.front to the local copy
again, and so on.

While this works for array of arrays, it *doesn't* work for ranges that
overwrite ror.front when ror.popFront() is called.  For example, it
wouldn't work for stdin.byLine(), because the underlying buffer it
reused. Here's the proof:

Code:
	// Filename: test2.d
	import std.algorithm;
	import std.stdio;
	void main() {
		auto lines = stdin.byLine();
		writeln(lines.joiner);
	}

Compile & test:

	# Compile
	dmd test2.d

	# Echo three lines, "abc", "def", and "ghi", and feed it to
	# the program.
	(echo abc; echo def; echo ghi) | ./test2

	# This is the output:
	defghighi

As you can see, the output is mangled, because byLine() has reused the
line buffer before joiner.Result got to it.

Long story short: saving a local copy of range.front and then calling
range.popFront() may _invalidate_ the copy. So basically, either you
need a forward range and use .save to keep track of old values of
range.front, or you have to refrain from calling popFront() until you're
well and truly done with the current value of .front. While not
respecting this will work with many common ranges, it will also fail in
subtle ways when given other ranges.

The scary thing is, I see similar code like this all over Phobos. Does
this mean that most of std.algorithm may need to be revised to address
this issue? At the very least, it would seem that a code audit is in
order to weed out this particular issue.

:-/

T

-- 
A linguistics professor was lecturing to his class one day. "In
English," he said, "A double negative forms a positive. In some
languages, though, such as Russian, a double negative is still a
negative. However, there is no language wherein a double positive can
form a negative."
A voice from the back of the room piped up, "Yeah, yeah."