Tricky semantics of ranges & potentially numerous Phobos bugs

Tue Oct 16 10:28:38 PDT 2012

On Tuesday, October 16, 2012 10:13:57 H. S. Teoh wrote:
> I wasn't talking about compiler validation. I was talking about clearly
> defining, in the docs or otherwise, what exactly a range is, and what is
> expected of it. Right now, it's rather nebulous what exactly constitutes
> a range. I thought byLine() was a perfectly valid range, but apparently
> you think otherwise. The two aren't compatible, since they lead to wrong
> code that has buggy behaviour when passed something it doesn't expect.

ByLine is a perfectly valid range insofar as you realize that it's likely to go
completely south if you use it in any way that could involve keeping front
around after popFront has been called, which means that anything which relies
on keeping front around isn't going to work. So, it's a range, but it's
essentially an unsafe one (though I'm not sure that it's an un-@safe one).

So, it's fine that ByLine is a range as long as we're willing to put up with it
not working with a lot of range-based functions because of its abnormal
behavior. But I don't think that it's at all reasonable for range-based
functions in general to not be able to rely on front returning the same type
every time or on its value not disappearing or becoming invalid in some way
after a call to popFront. That's completely untenable IMHO.

Ranges _can_ define semantics which violate that, but they have to make it
clear that they do, so that programmers using them realize that they may not
work right with a lot of range-based functions (which potentially makes it so
that they really shouldn't have been ranges in the first place).
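
To make the hazard concrete, here is a minimal sketch. ReusingLines is a
hypothetical range invented for illustration (it is not part of Phobos); it
only assumes the behavior described above, namely that front hands out a slice
of a single buffer which popFront then overwrites. The lines are all the same
length purely to keep the asserts simple.

```d
// Hypothetical range that, like ByLine, reuses one internal buffer.
struct ReusingLines
{
    private string[] source; // lines still to be produced
    private char[] buffer;   // shared storage, overwritten by popFront
    private size_t len;      // length of the current line

    this(string[] lines)
    {
        source = lines;
        size_t maxLen = 0;
        foreach (line; lines)
            if (line.length > maxLen) maxLen = line.length;
        buffer = new char[](maxLen); // allocated once, never reallocated
        fill();
    }

    @property bool empty() const { return source.length == 0; }
    @property char[] front() { return buffer[0 .. len]; }

    void popFront()
    {
        source = source[1 .. $];
        fill();
    }

    private void fill()
    {
        if (source.length == 0) { len = 0; return; }
        len = source[0].length;
        buffer[0 .. len] = source[0][]; // overwrite the previous line in place
    }
}

void main()
{
    auto r = ReusingLines(["alpha", "beta!", "gamma"]);
    char[] first = r.front;      // a slice of the shared buffer
    string saved = r.front.idup; // an independent copy
    r.popFront();
    assert(saved == "alpha");    // the copy survives
    assert(first == "beta!");    // the kept slice was silently overwritten

    // Any code that collects fronts without copying ends up with aliases.
    char[][] kept;
    foreach (line; ReusingLines(["alpha", "beta!", "gamma"]))
        kept ~= line;
    assert(kept.length == 3);
    assert(kept[0] == "gamma" && kept[1] == "gamma" && kept[2] == "gamma");
}
```

Copying front (here with .idup) before calling popFront is what keeps the
caller safe, but a generic range-based function has no reason to do that.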

> So what is (or should be) the semantic design of ranges? Let's work out
> a precise definition so that we have something to build on.

As far as front (or back) goes, range-based functions should be able to rely 
on

1. front returning the exact same value on every call until popFront has been 
called (though there's no guarantee that front won't have to be recalculated 
on each call, so assigning the result of front to a local variable is 
advisable for efficiency if front must be used multiple times before a call to 
popFront).

2. the result of front continuing to be valid and unchanged after popFront has 
been called if it was assigned to a variable.

Any range is free to violate this, but because range-based functions are free 
to rely on it, such ranges risk not working correctly with many range-based 
functions and must be used with caution.
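
As a rough illustration of the two points above, here is a sketch of the kind
of generic function that leans on both of them. countAdjacentRepeats is a
made-up name used only for this example, not an existing Phobos function.

```d
import std.range : isInputRange, empty, front, popFront;

// Counts how many elements are equal to the element just before them.
size_t countAdjacentRepeats(R)(R r)
if (isInputRange!R)
{
    size_t repeats = 0;
    if (r.empty) return repeats;

    auto prev = r.front; // point 2: prev must stay valid after popFront
    r.popFront();
    for (; !r.empty; r.popFront())
    {
        auto cur = r.front; // point 1: cache front, it may be recomputed per call
        if (cur == prev) ++repeats;
        prev = cur;
    }
    return repeats;
}

void main()
{
    assert(countAdjacentRepeats([1, 1, 2, 3, 3, 3]) == 3);
    assert(countAdjacentRepeats(["a", "b", "b"]) == 1);
    assert(countAdjacentRepeats((int[]).init) == 0);
}
```

With a range whose front is invalidated by popFront, prev and cur would end up
referring to the same overwritten data, so the comparison would be meaningless
even though the function itself is written correctly.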

- Jonathan M Davis