phobos by ref or by value

Sun Dec 16 19:06:52 PST 2012

On Sunday, 16 December 2012 at 23:02:30 UTC, Jonathan M Davis wrote:
>
> You _don't_ take ranges by ref unless you want to alter the original,
> which is almost never the case. Functions like popFrontN are the
> exception. And since you _are_ going to mutate the parameter (since
> ranges iterate via mutation), something like const ref would never make
> sense, even if it had C++'s semantics. I'm not sure if auto ref screams
> at you if you try and mutate the original, but if it doesn't, then you
> get problems when passing it lvalue ranges, because they'd be being
> passed by ref and mutated, which you don't want. So, auto ref makes no
> sense either. You pretty much always pass ranges by value. And a range
> which does a deep copy when it's copied is a fundamentally broken range
> anyway. It has the wrong semantics and won't function correctly with
> many range-based functions. Ranges are supposed to be a view into a
> range of values (possibly in a container), and copying the view
> shouldn't copy the actual elements. Otherwise, you'd be doing the
> equivalent of passing around a container by value, which is almost
> always a horrible idea.
>
> As for types which aren't ranges, they're almost a non-issue in Phobos.
> Most functions in Phobos take either a range or a primitive type. There
> aren't very many user-defined types in Phobos which aren't ranges (e.g.
> the types in std.datetime), but those that aren't ranges are generally
> either small enough that trying to pass by const ref or auto ref doesn't
> buy you much (if anything), or they're classes, in which case, it's a
> non-issue. And almost every generic function in Phobos takes a range.
> So, functions in Phobos almost always take their arguments by value.

I assume you are talking about functions other than lowerBound, 
upperBound, and trisect.

> They'll use ref when it's required for the semantics of what they're
> doing, but auto ref on function parameters is rare.

When would ref be required for semantics? I am asking this to 
learn the D way, so any guidelines are helpful. We have the 
language spec and TDPL. Maybe we need another book or three in 
the vein of Meyers' "Effective C++" (50 specific ways).
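
The one case from the quoted text that I can reproduce is 
popFrontN, which as far as I can tell takes its range by ref so 
that the caller's range is the one that gets advanced. A minimal 
sketch, assuming a plain int[] as the range (the variable names 
are mine):

```d
import std.range : popFrontN;

void main()
{
    int[] original = [1, 2, 3, 4, 5];
    int[] view = original;   // copies the slice (pointer + length), not the elements

    popFrontN(view, 2);      // the range parameter is ref: 'view' itself is advanced

    assert(view == [3, 4, 5]);
    assert(original == [1, 2, 3, 4, 5]); // the other copy of the view is untouched
}
```

Here ref is part of what the function means: if it took the range 
by value, the call would have no visible effect on 'view'.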

Sorry, but I don't understand the focus on ranges. I know ranges 
are involved because lowerBound is a method on SortedRange. But I 
am asking why a member function of a range (i.e. lowerBound) 
takes its argument by value. I don't mind copies of ranges being 
made when needed, as I think they are "light copies" of pointers. 
But taking the value of type V by value in lowerBound performs an 
unnecessary copy of an element of unknown size/complexity. The 
library cannot know the cost of that copy *and* it can be avoided 
(I think).

I thought ranges were a refinement of, or improvement on, a pair 
of iterators. So I have a range of items already existing in 
memory and I want to find all elements in the range less than 
some value of type V. I don't understand the choice of 'V' as 
opposed to 'ref const(V)'. It causes postblits to fire again and 
again on a non-Phobos user-defined struct, and I think they are 
needless. *find* or *lower_bound* in C++, for example, take the 
element to be found as 'const &' so copies are not made. Why is 
that not done here? If it is not an oversight, I have more to 
learn about how things work in D and therefore want a broader set 
of guidelines. I would have expected a guideline like: "In 
generic code, always take generic types that are not known to be 
primitives or very small collections of pointers (like a dynamic 
array or associative array) by reference, since you cannot know 
the cost of copying".
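
To make the concern concrete, here is a minimal sketch of what I 
mean. Heavy, countLessByValue and countLessByRef are hypothetical 
names of mine, not Phobos code, and the search is a plain loop 
rather than whatever lowerBound actually does. The only point is 
whether the postblit fires for the value being searched for.

```d
// Hypothetical element type that counts how often it is copied.
struct Heavy
{
    int key;
    static int copies;          // incremented by every postblit
    this(this) { ++copies; }
}

// Takes the search value by value, like a 'V value' parameter.
size_t countLessByValue(V)(V[] items, V value)
{
    size_t n;
    foreach (ref item; items)   // ref: the elements themselves are not copied
        if (item.key < value.key)
            ++n;
    return n;
}

// Takes the search value as 'ref const(V)': no copy, but lvalues only.
size_t countLessByRef(V)(V[] items, ref const(V) value)
{
    size_t n;
    foreach (ref item; items)
        if (item.key < value.key)
            ++n;
    return n;
}

void main()
{
    auto items = [Heavy(1), Heavy(2), Heavy(3)];
    auto needle = Heavy(3);

    Heavy.copies = 0;
    immutable byValue = countLessByValue(items, needle);
    assert(byValue == 2);
    assert(Heavy.copies > 0);   // the lvalue 'needle' was copied for the call

    Heavy.copies = 0;
    immutable byRef = countLessByRef(items, needle);
    assert(byRef == 2);
    assert(Heavy.copies == 0);  // nothing was copied
}
```

One difference I am aware of is that the ref version does not 
accept an rvalue such as Heavy(3) directly, so it is less 
convenient to call. Whether that is the reason for the current 
signature, I do not know.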

Usually the best place to learn the way of a language is studying 
its standard libraries, so that is what I am after - the why's of 
it.

Thanks
Dan