guidelines for parameter types

Mon Dec 17 22:34:54 PST 2012

Thank you very much for doing the hard work on this. I find this kind of 
information very important.

On 12/17/2012 12:46 PM, Dan wrote:
 > Assume V is a non-template parameter type and v is a parameter of that
 > type for any function. Also assume T is a template parameter type and t
 > is a parameter of that type for any function. Is the following table and
 > set of guidelines below reasonable? What other guidelines do you use or
 > would make sense to follow? I apologize if this is obvious/well known

I don't think this is well known at all. :) I have thought about these 
myself and came up with some guidelines at http://ddili.org/ders/d.en

To be honest, I already have doubts about some of those guidelines and 
I already know that some should be changed. For example, 'const ref' 
parameters may be better than 'in' parameters in many contexts.

 > and I consider myself new to D, so please address any of my
 > misconceptions. I'm finding that on the surface D sounds much simpler
 > than it is, but if I can get a good set of guidelines it should all work
 > out.
 >
 > Thanks,
 > Dan
 >
 > (COW means Copy on Write)

| convention              | what it means/when to use                 |
|-------------------------+-------------------------------------------|
| V v                     | V is primitive, dynamic arr, assoc array, |
|                         | COW, or known cheap to copy (< 16 bytes)  |

I don't know how practical it is, but it would be nice if the cost of 
copying an object were weighed by the compiler, not by the 
programmer.

According to D's philosophy structs don't have identities. If I pass a 
struct by-value, the compiler should pick the fastest method.

| const(V) v              | same as (V v), pedantic - makes copy and  |
|                         | guarantees no mutation in function        |

That's sensible. (In practice though, it is rarely done in C++. For 
example, if V is int and v is not intended to be modified, it is still 
passed in as 'V v'.)

| immutable(V) v          | No need - for ensuring no local changes   |
|                         | prefer 'const(V) v'                       |

That depends on whether V is a value type. (It is not clear whether 
you mean that it is.) If V has indirections, e.g. 
immutable(char[]) v, then the parameter has a legitimate meaning: The 
function requires that the caller provide immutable data.
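
Here is a minimal sketch of what I mean. NameRegistry and add are names 
I made up for illustration; the point is only that an immutable 
parameter lets the function keep the slice without copying it:

```d
// Hypothetical registry that keeps slices of the caller's data.
struct NameRegistry
{
    string[] names;   // string is an alias for immutable(char)[]

    // Requires immutable data, so storing the slice needs no defensive copy.
    void add(immutable(char[]) name)
    {
        names ~= name;
    }
}

void main()
{
    NameRegistry registry;
    registry.add("Ali");          // string literals are immutable

    char[] buffer = "Dan".dup;    // mutable data
    // registry.add(buffer);      // would not compile: char[] is not immutable
    registry.add(buffer.idup);    // an explicit immutable copy is accepted

    buffer[0] = 'X';              // cannot affect what the registry stored
    assert(registry.names == ["Ali", "Dan"]);
}
```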

| ref V v                 | Use only when mutation of v is required   |

Agreed.

| ref const(V) v          | Indicate v will not be changed, accepts   |
|                         | {V, const(V), immutable(V)}               |

Agreed.
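
A small sketch of that row, with a hypothetical Point type and sum 
function of my own; all three qualifiers bind to the same parameter:

```d
struct Point { int x; int y; }

// Reads through the reference; never modifies the caller's object.
int sum(ref const(Point) p)
{
    return p.x + p.y;
}

void main()
{
    Point m = Point(1, 2);
    const(Point) c = Point(3, 4);
    immutable(Point) i = Point(5, 6);

    assert(sum(m) == 3);    // mutable lvalue
    assert(sum(c) == 7);    // const lvalue
    assert(sum(i) == 11);   // immutable lvalue

    // sum(Point(7, 8));    // an rvalue; plain 'ref' does not accept it by default
}
```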

| ref immutable(V) v      | No need - restrictive with no benefit     |
|                         | over 'ref const(V) v'                     |

It still has a different meaning: You must have an immutable V and I need 
a reference to it. It may be that the identity of the object is 
important and that the function would store a reference to it.
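
Something like the following is what I have in mind. Settings, Watcher 
and watch are hypothetical names, and I am assuming that the referenced 
object outlives the stored pointer (a module-level immutable here):

```d
struct Settings { int level; }

// Module-level immutable data lives for the whole program.
immutable Settings globalSettings = Settings(5);

struct Watcher
{
    immutable(Settings)* watched;

    // Needs the caller's actual object, and needs it to never change.
    void watch(ref immutable(Settings) s)
    {
        watched = &s;   // only sound if s outlives this Watcher
    }
}

void main()
{
    Watcher w;
    w.watch(globalSettings);
    assert(w.watched is &globalSettings);   // the same object, not a copy
    assert(w.watched.level == 5);

    Settings local = Settings(1);
    // w.watch(local);   // would not compile: Settings is not immutable(Settings)
}
```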

| V* v                    | Use only when mutation of v is required.  |
|                         | Prefer 'ref V v' unless null significant  |
|                         | or unsafe manipulations desired           |

Agreed.

Also, pointers may be needed especially when interfacing with C and C++ 
libraries, but again, the D function can still take 'ref' and pass the 
address of that ref to the C function.
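
For example, here is a sketch that wraps the C library's frexp, which 
takes an int* out-parameter. The wrapper name splitFloat is mine:

```d
import core.stdc.math : frexp;   // C: double frexp(double value, int* exp);

// D-facing wrapper: callers use 'ref', the pointer stays an implementation detail.
double splitFloat(double value, ref int exponent)
{
    return frexp(value, &exponent);
}

void main()
{
    int e;
    const mantissa = splitFloat(8.0, e);
    assert(mantissa == 0.5 && e == 4);   // 8.0 == 0.5 * 2^4
}
```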

| const(V)* v             | Indicate v will not be changed,           |
|                         | accepts {V*, const(V)*, immutable(V)*}    |
|                         | still prefer ref unless null significant  |
|                         | or unsafe manipulations desired           |

Agreed.

| immutable(V)* v         | No need - restrictive with no benefit     |
|                         | over 'const(V)* v'                        |

Again, if the function demands immutable(V), which may be null, then it 
actually has some use.
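
A sketch of such a function, with a hypothetical Config type where null 
means "nothing supplied":

```d
struct Config { int verbosity; }

immutable Config defaultConfig = Config(1);

// null means "no configuration supplied"; non-null data can never change.
int verbosityOf(immutable(Config)* cfg)
{
    return cfg is null ? defaultConfig.verbosity : cfg.verbosity;
}

void main()
{
    immutable Config loud = Config(3);
    assert(verbosityOf(&loud) == 3);
    assert(verbosityOf(null) == 1);

    Config mutableCfg = Config(2);
    // verbosityOf(&mutableCfg);   // would not compile: Config* is not immutable(Config)*
}
```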

| T t                     | T is primitive, dynamic array, or assoc   |
|                         | array (i.e. cheap/shallow copies). For    |
|                         | generic code no knowledge of COW or       |
|                         | cheapness so prefer 'ref T t'             |

I am not sure about that last guideline. I think we should simply type T 
and the compiler does its magic. I don't know how practical my hope is.

Besides, we don't know whether T is primitive or not. It can be 
anything. If T is int, 'ref T t' could actually be slower because of the 
pointer indirection that ref introduces.

| const(T) t              | same as (T t), pedantic - makes copy and  |
|                         | guarantees no mutation in function        |

Agreed.

| immutable(T) t          | No need - for ensuring no local changes   |
|                         | prefer 'const(T) t'                       |

Agreed.

| ref T t                 | Use only when mutation of t is required   |
|                         | prefer 'ref const(T) t' if mutation not   |
|                         | required                                  |

Agreed. Again though, about the last comment, maybe 'const(T) t' is 
better than 'ref const(T) t'.

| ref const(T) t          | Indicate t will not be changed, accepts   |
|                         | {T, const(T), immutable(T)} without copy  |

Agreed.

| ref immutable(T) t      | No need - restrictive with no benefit     |
|                         | over 'ref const(T) t'                     |

Again, there may be a use case.

| auto ref T t            | Use only when mutation of t required and  |
|                         | want support of by value for rvalues      |
|                         | (May be obviated in the long run)         |

I have to remind myself about that one again.

| auto ref const(T) t     | Indicate t will not be changed, accepts   |
|                         | [lr]value {T, const(T), immutable(T)}     |
|                         | (May be obviated in the long run)         |

I have to remind myself about that one again. :)

| auto ref immutable(T) t | No need - restrictive with no benefit     |
|                         | over 'auto ref const(T) t'                |

Same. :p

| T* t                    | Use only when mutation of t is required.  |
|                         | Prefer 'ref T t' unless null is           |
|                         | significant or dealing with unsafe code.  |

Agreed.

| const(T)* t             | Prefer 'ref const(T) t' unless            |
|                         | null is significant or dealing with       |
|                         | unsafe code                               |

Agreed.

| immutable(T)* t         | No need - restrictive with no benefit     |
|                         | over 'const(T)* t'                        |

To repeat myself: The function may require immutable data that may be null.

 > *** Parameter Type Guidelines ***
 >
 > - Use pointers when null has specific intended meaning or the function
 > wants unsafe code, otherwise prefer ref
 >
 > - Prefer const(T|V) to immutable(T|V) because const(T|V) is accepting of
 > mutables and it ensures they are not mutated. immutable(T|V), on the
 > other hand, only accepts immutables for types with aliasing and
 > therefore makes the function less applicable. This eliminates 7 rows
 > from consideration.
 >
 > - Always use const(T|V) when passing by ref if referred to instance is
 > not mutated. For the non template case (i.e. V) not using const(V) means
 > const and immutables cannot be used as arguments. This unnecessarily
 > reduces the application of the function. This is a debatable guideline
 > for template types T, since the T being parameterized could be T =
 > const(S), so it does not prevent the function from being called with
 > const(S) or immutable(S). But the real problem with using just T instead
 > of const(T) in the signature of a function that does not mutate t is the
 > developer reading the signature has no way of knowing that T will not be
 > mutated without compiling and seeing if it breaks. It is as if important
 > user information is missing. So 'foo(T)(T t)' or 'foo(T)(ref T t)' may
 > both accept 'const(S) s', but only if the compiled code does not mutate
 > s. But without showing that guarantee to the compiler and developer with
 > signature like 'foo(T)(const(T) t)' or 'foo(T)(ref const(T) t)' you can
 > be setting yourself up for future problems. For example, if you go with
 > 'foo(T)(ref T t)', in the future you might (accidentally) add a mutating
 > call on t. Then all existing code that passed in const(S) would break
 > and if you test with only mutables you might not see the errors. An
 > example from phobos that violates this is formatValue when passing in a
 > struct. Even though the argument is not modified (why would it be) the
 > signature has 'auto ref T val' instead of 'auto ref const(T) val'.
 >
 > - Prefer 'ref' to by value on all template parameters that are not
 > primitives, dynamic arrays or associative arrays - since there is no
 > knowledge of how expensive the copy will be.

I still think that the compiler should help me with that, similar to how 
it applies automatic move semantics to value types. (Supposedly... I 
don't know how successful it is.)
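
Going back to your earlier point about const in template signatures, 
this is how I understand it. S, peek and peekConst are hypothetical 
names; the commented-out line is the accidental future change you 
describe:

```d
struct S
{
    int value;
    void bump() { ++value; }   // mutating member
}

// The signature promises nothing: T may be deduced as S or as const(S).
int peek(T)(ref T t)
{
    // t.bump();   // adding this later would break every const(S) caller
    return t.value;
}

// The signature states the promise; the body is checked against it for every T.
int peekConst(T)(ref const(T) t)
{
    return t.value;
}

void main()
{
    S m = S(1);
    const(S) c = S(2);

    assert(peek(m) == 1);
    assert(peek(c) == 2);        // T is deduced as const(S)
    assert(peekConst(m) == 1);
    assert(peekConst(c) == 2);
}
```

With peek, a test suite that only passes mutable arguments would not 
notice the breakage; with peekConst the mutation is rejected no matter 
which arguments are tested.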

 > (this is a guideline that
 > is violated by SortedRange.(lowerBound, upperBound, trisect)).

 >
 > - When to use 'auto ref' template parameters: 'auto ref T t' as a
 > parameter says - make one or two functions with different signatures and
 > same body based on how it is called. If called with rvalue use 'T t', if
 > called with lvalue use 'ref T t'. From this thread:
 > http://forum.dlang.org/thread/4F84D6DD.5090405@digitalmars.com?page=1

I have to read that thread again, this time more carefully. :)
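
In the meantime, a tiny sketch to convince myself of the "one or two 
functions" description. boundByRef is a made-up name; __traits(isRef) 
reports how the parameter was bound in each instantiation:

```d
// Reports how the argument was bound in this particular instantiation.
bool boundByRef(T)(auto ref T t)
{
    return __traits(isRef, t);
}

void main()
{
    int lvalue = 42;
    assert(boundByRef(lvalue));   // lvalue: instantiated as 'ref T t'
    assert(!boundByRef(42));      // rvalue: instantiated as 'T t'
}
```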

 > it
 > sounds like this will no longer be necessary since in the future a
 > single signature of 'ref T t' will support both lvalue and rvalue.

Sounds great.

 > So,
 > for now it is best to start with 'ref T' instead of 'auto ref T' unless
 > there is really an interim need for support of passing literals/rvalues
 > into the function. One downside to 'auto ref' is it has the power to
 > combinatorially increase the number of instantiations of each function.
 > The upside is it allows rvalues to be passed in until the ultimate
 > solution is implemented. In generic code, if you don't mind requiring
 > lvalues of users, don't bother with auto parameters at all and stick
 > with 'ref'.

Not much experience with that one. I hope others will chime in... :)

Ali