guidelines for parameter types

Tue Dec 18 04:51:31 PST 2012

On Tuesday, 18 December 2012 at 06:34:55 UTC, Ali Çehreli wrote:
> I don't think this is well known at all. :) I have thought 
> about these myself and came up with some guidelines at 
> http://ddili.org/ders/d.en

Thanks - I will study it. I see that you have also covered in, 
out, inout, lazy, scope, and shared, so that should keep me busy 
for a while.

> I don't know how practical it is but it would be nice if the 
> price of copying an object could be considered by the compiler, 
> not by the programmer.

I agree - it would be nice if the compiler could do it, but if it 
tried, some would just not be happy about the choices, no matter 
what.

>
> According to D's philosophy structs don't have identities. If I 
> pass a struct by-value, the compiler should pick the fastest 
> method.
>

Even if there is a postblit? Maybe that would work, but say your 
object were a reference counting type. If the compiler decided to 
sneakily pass by ref for a performance gain when you think it is 
by value, that might be a problem. Maybe not, though, as long as 
you know how it works. I have seen that struct literals passed to 
a function do not call the postblit - but Jonathan says this was 
a bug in the way the compiler classifies literals.
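
To make the postblit concern concrete, here is a minimal sketch. 
The struct `Counted` and the functions `byValue` and `byRef` are 
hypothetical names invented for illustration. It counts postblit 
calls for an lvalue passed by value, the same lvalue passed by 
ref, and a struct literal passed by value. On current compilers I 
would expect only the first call to run the postblit, since an 
rvalue is moved rather than copied.

```d
import std.stdio : writeln;

// Hypothetical type: counts how many times its postblit has run.
struct Counted
{
    static int copies;
    int payload;
    this(this) { ++copies; }
}

void byValue(Counted c) {}
void byRef(ref Counted c) {}

void main()
{
    auto c = Counted(1);

    byValue(c);           // lvalue by value: a copy, so postblit runs
    writeln("after byValue(lvalue): ", Counted.copies);

    byRef(c);             // by ref: no copy is made
    writeln("after byRef(lvalue):   ", Counted.copies);

    byValue(Counted(2));  // struct literal (rvalue): moved, not copied
    writeln("after byValue(rvalue): ", Counted.copies);
}
```

A reference counting type relies on exactly this count being 
predictable, which is why a silent switch from by-value to by-ref 
would need to be well specified.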

>
> That's sensible. (In practice though, it is rarely done in C++. 
> For example, if V is int and v is not intended to be modified, 
> it is still passed in as 'V v'.)
>

Absolutely. I read somewhere it was pedantic to do such things. 
Then I read some other articles that touted the benefit, even on 
an int, because the reader of (void foo(const int x) {...} ) 
knows x will/should not change, so the intent is clearer to 
future maintainers.
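
A small sketch of that point in D. The function `twice` is a 
made-up example; the `static assert` only checks that an 
assignment to the const parameter is rejected at compile time.

```d
// Hypothetical function: `const` on a by-value int documents intent.
int twice(const int x)
{
    // An assignment such as `x = 1;` does not compile here.
    static assert(!__traits(compiles, x = 1));
    return x * 2;
}

void main()
{
    int n = 21;
    assert(twice(n) == 42);
    assert(n == 21);   // the caller's variable is untouched either way
}
```

The caller sees no difference (the int is copied in both cases); 
the benefit is only for whoever reads or edits the function body.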

>
> That makes a difference whether V is a value type or not. (It 
> is not clear whether you mean V is a value type.) Otherwise, 
> e.g. immutable(char[]) v has a legitimate meaning: The function 
> requires that the caller provides immutable data.

When is 'immutable(char[]) v' preferable to 'const(char[]) v'? If 
you select 'const(char[]) v' instead, your function will not 
mutate v and if it is generally a useful function it will even 
accept 'char[]' that *is* mutable. I agree with the meaning you 
suggest, but under what circumstances is it important to a 
function to know that v is immutable as opposed to simply const?
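
One possible answer, sketched with hypothetical names 
(`countSpaces`, `Label`): a function that only reads can take 
const and accept everything, while a function that keeps the 
slice around can demand immutable so it never has to copy 
defensively. This is an illustration of the distinction, not a 
claim about how any library does it.

```d
// Read-only access: const accepts mutable, const and immutable data.
size_t countSpaces(const(char[]) v)
{
    size_t n;
    foreach (c; v)
        if (c == ' ') ++n;
    return n;
}

// Keeps the slice: immutable guarantees nobody changes it later.
struct Label
{
    immutable(char)[] text;
    void set(immutable(char[]) v) { text = v; }
}

void main()
{
    char[] buf = "a b c".dup;
    assert(countSpaces(buf) == 2);     // mutable argument is fine
    assert(countSpaces("x y") == 1);   // so is an immutable literal

    Label label;
    // A mutable buffer is rejected; the caller must hand over immutable data.
    static assert(!__traits(compiles, label.set(buf)));
    label.set(buf.idup);

    buf[0] = 'z';                      // later mutation of the buffer...
    assert(label.text == "a b c");     // ...cannot affect the stored text
}
```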

>
> | ref immutable(V) v | No need - restrictive with no benefit|
> |                    | over 'ref const(V) v'                |
>
> It still has a different meaning: You must have an immutable V 
> and I need a reference to it. It may be that the identity of 
> the object is important and that the function would store a 
> reference to it.
>

This may be a use-case for it. You want to store a reference to v 
and save it for later - so immutable is preferred over const. I 
may be mistaken but I thought the thread on 'rvalue references' 
talks about taking away the rights to take the address of any ref 
parameter: http://forum.dlang.org/post/4F863629.6000407@erdani.com
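
A sketch of that use-case, with hypothetical names (`Config`, 
`defaults`, `Registry`, `attach`). It takes the address of a 
'ref immutable' parameter and stores it, which is precisely the 
operation the linked thread discusses restricting, so treat it as 
an illustration of the intent rather than something guaranteed to 
stay legal.

```d
struct Config { int level; }

// Module-level immutable data has a stable address for the program's lifetime.
immutable Config defaults = Config(3);

struct Registry
{
    immutable(Config)* current;

    // Demands an immutable lvalue and remembers where it lives.
    void attach(ref immutable(Config) c) { current = &c; }
}

void main()
{
    Registry r;
    r.attach(defaults);
    assert(r.current is &defaults);   // same object, not a copy
    assert(r.current.level == 3);

    Config local;
    // A mutable lvalue does not bind to 'ref immutable'.
    static assert(!__traits(compiles, r.attach(local)));
}
```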

> | V* v      | Use only when mutation of v is required.  |
> |           | Prefer 'ref V v' unless null significant  |
> |           | or unsafe manipulations desired           |
>
> Agreed.
>
> Also, pointers may be needed especially when interfacing with C 
> and C++ libraries, but again, the D function can still take 
> 'ref' and pass the address of that ref to the C function.
>

By 'unsafe manipulations' I meant things such as low level memory 
management, interfacing with C and such. It may be that in the 
future you will not be able to take the address of any 'ref' 
parameter (see previous link). So, if you know you are going to 
interface with C, do pointer work, or write other non-safe code, 
it may be best to just take 'V* v'.
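
For reference, the approach Ali describes looks roughly like 
this. The wrapper name `splitFloat` is made up; `frexp` is the C 
standard library function as declared in core.stdc.math, which 
takes an 'int*' out-parameter. The sketch assumes taking the 
address of a 'ref' parameter is still permitted.

```d
import core.stdc.math : frexp;

// Hypothetical D-facing wrapper: callers see 'ref', C sees a pointer.
double splitFloat(double x, ref int exponent)
{
    return frexp(x, &exponent);
}

void main()
{
    int e;
    immutable m = splitFloat(8.0, e);
    // 8.0 == 0.5 * 2^4
    assert(m == 0.5 && e == 4);
}
```

If that address-of were ever disallowed, the wrapper itself would 
have to take 'int*', which is the case for 'V* v' made above.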

>
> Again, if the function demands immutable(V), which may be null, 
> then it actually has some use.

I agree - I just don't know yet when a function would demand 
'immutable(V)' over 'const(V)'.

>
> | T t     | T is primitive, dynamic array, or  assoc   |
> |         | array (i.e. cheap/shallow copies). For     |
> |         | generic code no knowledge of COW or        |
> |         | cheapness so prefer 'ref T t'              |
>
> I am not sure about that last guideline. I think we should 
> simply type T and the compiler does its magic. I don't know how 
> practical my hope is.
>
> Besides, we don't know whether T is primitive or not. It can be 
> anything. If T is int, 'ref T t' could actually be slower due 
> to the pointer indirection due to ref.

Agreed. In a separate thread 
http://forum.dlang.org/thread/opufykfxwkkjchqcwgrg@forum.dlang.org 
I included some timings of passing a struct as 'in S', 'in ref 
S', and 'const ref S'. The very small sizes, matching up to sizes 
of primitives, showed little if any benefit of by value over ref. 
Maybe the test/benchmark was flawed? But for big sizes, by 
reference clearly won by a large margin. The problem with 
template code is that you have no knowledge of T and the cost of 
'by value' is unbounded, whereas the difference between 'int t' 
and 'ref const(int) t' might be small. For instance, I don't like 
that SortedRange.lowerBound is creating many copies of the input 
object while doing its binary search on the data. Well, I suppose 
you could do:

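import std.traits : isBasicType, isDynamicArray, isAssociativeArray;

// isBasicType (from std.traits) stands in for a hypothetical isPrimitive.
// Cheap, shallow copies: take by value.
void foo(T)(T t) if(isBasicType!T ||
                    isDynamicArray!T ||
                    isAssociativeArray!T) {
...
}

// Everything else: take by const reference.
void foo(T)(ref const(T) t) if(!isBasicType!T &&
                               !isDynamicArray!T &&
                               !isAssociativeArray!T) {
...
}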

You are right that compiler magic could help here.
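
The closest thing to such magic that I know of for templates is 
'auto ref', sketched below with hypothetical names (`Big`, 
`passedByRef`). Note that it chooses based on whether the 
argument is an lvalue or an rvalue, not on the cost of copying, 
so it is only a partial answer: an lvalue int still goes by 
reference.

```d
struct Big { int[64] data; }

// Reports how the compiler chose to pass this particular argument.
bool passedByRef(T)(auto ref const T t)
{
    return __traits(isRef, t);
}

void main()
{
    Big b;
    int i = 1;

    assert(passedByRef(b));        // lvalue: bound by reference, no copy
    assert(!passedByRef(Big()));   // rvalue: passed by value
    assert(passedByRef(i));        // lvalue int also goes by reference
    assert(!passedByRef(42));      // literal: passed by value
}
```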

>
> I still think that the compiler should help me with that. 
> Similar to how it applies automatic move semantics to value 
> types. (Supposedly... I don't know how successful it is.)
>

Agreed.

Thanks,
Dan