Should we add drop to Phobos?

Tue Aug 16 18:29:21 PDT 2011

On Wednesday, August 17, 2011 00:39:13 Jonathan M Davis wrote:
> On Tuesday, August 16, 2011 16:19 Dmitry Olshansky wrote:
> > On 17.08.2011 2:35, Jonathan M Davis wrote:
> > > On Tuesday, August 16, 2011 15:10 Dmitry Olshansky wrote:
> > >> I think forward ranges that are value types should just have
> > >> ref save { return this; } that in the end should entail 0 copies(?).
> > >> Creating a simple function that does return move(x.save) on
> > >> ForwardRange and move(x) on others might be a good idea to abstract
> > >> this discrepancy away.
> > 
> > While I agree on all of the above, I'm obviously not making myself clear
> > about avoiding copies.
> > 
> > > That would copy. Returning ref allows you to chain functions that
> > > take ref, and it allows you to alter the variable which is returned
> > > as long as you do it in a single expression without assigning it to
> > > anything.
> > > 
> > > Besides, that doesn't help in this case anyway. The problem is that
> > > save _isn't_ being called but a copy is still happening in the case
> > > of value-type ranges and arrays, whereas it isn't happening with
> > > reference-type ranges.
> > 
> > Let's deal with my strange "0 copies" point, in fact, it's about
> > ensuring exactly 1 copy is made:
> > 
> > void doStuff(R)(R range) // here a struct is copied when passed; a class
> >                          // is passed by ref and not copied
> > {
> >     auto r = range.save(); // here a struct should avoid a copy by returning
> >                            // ref, while a class object finally does create a copy
> >     // ... work with r
> > }
> > 
> > replacing range.save with simply range on non-forward ranges.
> 
> save needs to save every time that it's called, or it's not doing its job.
> If it doesn't, then you risk not having a copy when you thought that you
> did. So, even in concept, what you're suggesting here really isn't a good
> idea. But regardless,
> 
> auto r = range.save();
> 
> will copy even if save returned by ref, because you can't declare a variable
> with ref like that. Only function parameters, foreach declarations, and
> return values can be ref. You'd have to be able to declare
> 
> ref R r = range.save();
> 
> for that to work (where R is the range type), and that's not legal. I
> think that what we should probably do is have something like
> 
> static if(isForwardRange!R)
>  range = range.save;
> 
> at the beginning of any function which takes a forward range. If R is a
> class, then it fixes it so that the function's behavior is the same for
> classes as it is for structs. If it's a struct, as long as it doesn't
> declare a postblit constructor, I would fully expect the call to be
> optimized out, since ultimately, it's assigning itself to itself. The one
> case where it couldn't be optimized out would be if the struct had a
> postblit constructor, and odds are that if it has a postblit constructor,
> it's actually a reference type and not a value type anyway, so the save
> would actually be needed - though I suppose that it might also not get
> optimized when a member variable of the struct (or a member variable of a
> member variable, etc.) has a postblit constructor, in which case you'd lose
> some efficiency. But that case would probably be quite abnormal.
> 
> The main problem here IMHO is the annoyance factor. It's easy to screw up
> (but then again, it's also easy to forget to call save, so being able to
> easily screw up some basics of forward ranges is nothing new). However,
> proper unit tests will catch it easily, so ultimately, I suppose that the
> main issue is just that it's more boilerplate code. I don't see any way
> around it though if we want all forward ranges to act the same with
> range-based functions. Otherwise, the reference-type ranges will be
> consumed as if they were only input ranges, while the value-type ranges
> won't be consumed.

Okay. I need to think more before I type. That bit about input ranges and 
postblit makes no sense. Input ranges almost certainly _don't_ have postblit 
constructors, and it wouldn't matter if they did for this situation, since 
they obviously wouldn't have save being called on them.

In any case, the basic logic still holds IMHO. In general, the cost of the 
extra save will be optimized out. The only real question as far as efficiency 
goes, I think, is how common it is for ranges to have postblit constructors. 
It's not often, I think.
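
To make the idea concrete, here is a minimal sketch of the "save at the top of the function" pattern. The class `Counter` and the helper `countAll` are hypothetical names made up for illustration; neither is part of Phobos, and this is only one way the boilerplate might look.

```d
import std.range;

// Hypothetical reference-type forward range, for illustration only.
final class Counter
{
    private int cur, end;

    this(int cur, int end) { this.cur = cur; this.end = end; }

    @property bool empty() const { return cur >= end; }
    @property int front() const { return cur; }
    void popFront() { ++cur; }

    // save hands back an independent copy of the iteration state.
    @property Counter save() { return new Counter(cur, end); }
}

// Hypothetical helper: counts the elements of a range.
size_t countAll(R)(R range)
    if (isInputRange!R)
{
    // For forward ranges, detach from the caller's range first, so that
    // class-based ranges behave the same way as structs and arrays.
    static if (isForwardRange!R)
        range = range.save;

    size_t n;
    for (; !range.empty; range.popFront())
        ++n;
    return n;
}
```

With the `static if` in place, the caller's `Counter` is left untouched by `countAll`. Without it, the class instance would be consumed by the call, while an array passed the same way would not be.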

I think that I'll create a new thread about this issue later this evening so 
that it'll be seen (and hopefully discussed) by more people and so that it's 
more likely that those who really need to see it will.

- Jonathan M Davis