isInputRange is false for non-copyable types

Mon Apr 16 04:15:36 UTC 2018

On 16/04/18 01:28, Jonathan M Davis wrote:
> The fact that foreach copies the range that it's given makes using ref with
> anything other than arrays a bit of an iffy proposition. It will compile
> regardless of whether front returns by ref or not (which arguably is a bug),
> but it will only affect the original elements if copying the range doesn't
> copy the elements

You keep referring to ranges that contain the data, but I am racking my 
mind and failing to find examples.

You can have ranges over arrays,linked lists, trees (though you'd 
probably use opApply there), and any other container, as well as 
external data sources (files, sockets, pipes). In none of those cases 
would your range contain actual data. I need to see the use case for 
ranges *with* embedded data.

Plus, as I've said before, if ranges are expected to contain data, 
copying them seems like an incredibly bad idea. The whole idea behind 
casually copying the range about is that it's cheap to do.

> which in the case of a basic input range rather than a
> forward range

I don't see how that statement is true.

> is generally true, otherwise, it would be a forward range.

A forward range is an input range.

> But as it stands, using ref with foreach in generic code is not going to go
> well. If anything, it's actually a code smell.

You are conflating ref with owning data, which I find unsubstantiated.

>> If non-copyable types were allowed, then functions that accept a range
>> would fail to compile on non-copyable types if they do a copy.
>>
>> But the flip side is that if non-copyable would be allowed, a lot of
>> functions that current copy would not. There are far too many
>> unnecessary copies in the code that serve no purpose, but they are
>> invisible, so they are not fixed.
> 
> And if a templated function fails to compile if it's given a particular
> argument, and that failure is not due to the template constraint, that
> usually means that the template constraint is incomplete.

I do not accept this argument outright. In fact, our experience at Weka 
has drove us to the exact opposite direction.

We deliberately set up functions to include invalid cases and then 
either let them fail because of lacking constraints, or even set up a 
static assert to fail the function generation rather than fail matching.

We found that the error messages produced by the alternative are simply 
too difficult to parse out and figure what went wrong.

With that said, there is some merit to setting up the function 
constraints so that it matches what the function actually need. It 
allows overloading the missing combinations by another module. I don't 
think it is a good argument in this particular case, however.

I think we should set up to document which phobos functions require 
copying. It will flesh out bugs and unnecessary copying in phobos, as 
well as force phobos developers to think about this scenario when they 
write code. It shouldn't even take that much time to do this run: just 
go over all UTs and add instantiation over a non-copyable type. The UTs 
that fail need work.

> it would mean that _every_ range-based function then has to worry about
> non-copyable elements

No, just those in libraries. The same way they have to worry about big O 
complexity, type safety, @nogc and a bunch of other things.

When you write a library, you have to think about your users. This 
doesn't change anything about that.

> So, allowing
> non-copyable types complicates range-based code across the board.

Like I said above, it is more likely to find bugs than to introduce 
needless complications. If you code cannot be easily implemented over 
non-copyable types, exclude them. Don't shut me out of the entire game.

> Not all algorithms can call front just once without
> copying it

Huh? Where did that come from?

Phobos already assumes that calling front needs to be an O(1) operation. 
You can call front as many times as you like. Why would you assume 
otherwise?

> and many ranges (e.g. map) calculate front on each call, meaning
> that repeated calls to front can be more expensive than just calling it once
> and copying the result.

Not by a lot, and definitely not in any way the generic code has any 
right to assume. I reject this argument outright.

> The result of all of this is that in generic code, you have no clue
> whatsoever what the semantics are of copying a range around, because it
> depends entirely on the copying semantics of whatever range that code is
> currently being instantiated with. As such, generic code basically has to
> assume that once you copy a range, you can't mutate the original (since it
> may or may not be affected by mutating the copy and may or may not be
> cleanly affected by mutating the copy),

Jonathan, I'm sorry, but what you are describing is called "a bug".

Input ranges provide no guarantees about copying the range itself. If 
you do it and expect *anything* (which it seems you do), you have a bug. 
If you need to copy the range itself, you absolutely need to require a 
forward range.

A forward range with ref front absolutely promises that both iterations 
reference the same elements.

I find myself sincerely hoping that you are making these examples up in 
order to win this discussion turned into an argument for some reason, 
instead of the way phobos is actually written, because if you're right 
phobos is *broken*.

The good news is that if Phobos is broken, this is a great way to flush 
out the places it's broken. Let's add UTs calling all Phobos functions 
accepting input ranges (as opposed to forward ranges) with non-copyable 
ranges (i.e. - the range itself is non-copyable) and fix the compilation 
errors. This shouldn't take more than a few days to do.

Hell, let's write UTs to pass forward ranges that are non-copyable (but 
return a copy via "save") and flush out those bugs as well.

The point of "development" is to make our code better, not defend the 
status-quo. Especially since we've just identified a fairly simple way 
to find the places that need work and easily work on them.

> But of course, since _most_ ranges act just like save when they're copied,
> save gets frequently forgotten, making it a bit of a thorn in our side.

Then rejoice: I just suggested a way to *easily* find and fix those places.

> require that forward ranges act like value types (at
> least insofar as copying them copies the iteration state and thus acts like
> save).

The brackets are important. Just because I'm returning a reference to 
elements does not mean I don't support the range itself restarting a scan.

> So, some redesign of ranges would be great.

Yes, but please note that I don't think it is *necessary*. Yes, save is 
annoying, but it *is* workable as is, and since we can harness the 
compiler to locate where we're deviating from the righteous way, not 
that expensively if we could just be bothered to do it.

>> And allowing ranges with uncopyable members would turn this potential
>> deadly run-time bugs into easy to find compile time errors. Sound like a
>> great reason to allow it, to me.
> 
> Ranges have to be copyable.

And why would a range iterating over uncopyable objects not be copyable? 
How are the two related?

Shachar