string and char[] in Phobos

Jonathan M Davis via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Mar 18 13:06:27 PDT 2016


On Friday, March 18, 2016 08:24:24 Puming via Digitalmars-d-learn wrote:
> Hi,
>
> I saw from the forum that functions with string like arguments
> better use `in char[]` instead of `string` type, because then it
> can accept both string and char[] types.
>
> But recently when actually using D, I found that many phobos
> functions/constructors use `string`, while many returns `char[]`,
> causing me to do a lot of conv.to!string. And many times I have
> to fight with the excessive template error messages.
>
> Is there a reason to use `string` instead of `in char[]` in
> function arguments? Do you tend to change those phobos functions?

When a function accepts const(char)[], it can accept char[],
const(char)[], const(char[]), immutable(char)[], and immutable(char[]),
whereas if it accepts string, then all it accepts are immutable(char)[] and
immutable(char[]). So, string is more restrictive. But if you need to return
a slice of the array you passed in and your function accepts const rather
than mutable or immutable, then the slice has to be const, and you've lost
the type information, which is why inout exists - e.g. if you have
inout(char)[] and return the array that you're given, then its constness
doesn't change. But inout only works when you return the same type as you
pass in. And if the function needs to store the string somewhere (e.g. if it
were a property function for setting a member variable), then accepting
string makes more sense if what it stores is a string. Otherwise, it would
have to allocate a new string (and storing const(char)[] would risk having
it change after it was passed to the function).
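
As a rough sketch of those three cases (countSpaces, firstWord and Person are
made-up names for illustration, not anything from Phobos): a const(char)[]
parameter for a function that only reads, inout for one that returns a slice
of its argument, and string for one that stores what it's given.

```d
// Hypothetical helpers, named only for this illustration.

// Only reads its argument, so const(char)[] lets it take char[],
// const(char)[] and string alike.
size_t countSpaces(const(char)[] s)
{
    size_t n;
    foreach (char c; s)
        if (c == ' ')
            ++n;
    return n;
}

// Returns a slice of its argument; inout carries the caller's constness
// through to the result.
inout(char)[] firstWord(inout(char)[] s)
{
    foreach (i, char c; s)
        if (c == ' ')
            return s[0 .. i];
    return s;
}

// Stores its argument, so the setter asks for string: immutable data can be
// kept without copying and can't change after it was passed in.
struct Person
{
    private string _name;

    @property void name(string value) { _name = value; }
    @property string name() const { return _name; }
}

unittest
{
    char[] m = "hello world".dup;
    string s = "hello world";

    assert(countSpaces(m) == 1);
    assert(countSpaces(s) == 1);

    // The type of the returned slice follows the type of the argument.
    static assert(is(typeof(firstWord(m)) == char[]));
    static assert(is(typeof(firstWord(s)) == string));
    assert(firstWord(s) == "hello");

    Person p;
    p.name = s;        // already immutable, stored as-is
    p.name = m.idup;   // mutable data has to be copied by the caller
    assert(p.name == "hello world");
}
```

Note that the caller holding a char[] is the one who pays for the idup when
setting the name, which makes the allocation visible at the call site.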

So, the exact constness that should be used depends heavily on what the
function is doing. Ali's dconf 2013 talk discusses some of these issues:
http://dconf.org/2013/talks/cehreli.html

Most functions in Phobos that operate on strings actually are templatized so
that they work with varying constness and character type - either that, or
they're templatized and operate on arbitrary ranges and not arrays
specifically at all. That avoids most of these issues but does mean that the
function needs to be templated.
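
For instance, a templated function along these lines (countChar is a
hypothetical name, and the constraint is just one reasonable way to write it,
not a copy of any particular Phobos signature) accepts arrays of any
constness and character type as well as non-array ranges of characters:

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeChar;

// Hypothetical example: counts occurrences of needle in any input range
// whose elements are characters.
size_t countChar(R, C)(R r, C needle)
if (isInputRange!R && isSomeChar!(ElementType!R) && isSomeChar!C)
{
    size_t n;
    foreach (c; r)
        if (c == needle)
            ++n;
    return n;
}

unittest
{
    import std.algorithm.iteration : filter;

    char[] m = "banana".dup;
    string s = "banana";
    wstring w = "banana"w;

    // Mutable, immutable and wide-character arrays all work.
    assert(countChar(m, 'a') == 3);
    assert(countChar(s, 'a') == 3);
    assert(countChar(w, 'n') == 2);

    // So does a lazy range that isn't an array at all.
    assert(countChar(s.filter!(c => c != 'n'), 'a') == 3);
}
```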

I don't know what you're using in Phobos that takes string and returns
char[]. That implies an allocation, and if the function is pure, char[] may
have been selected, because it could be implicitly converted to string
thanks to the fact that the compiler could prove that the char[] being
returned had to have been allocated in the function and that there could be
no other references to that array. But without knowing exactly which
functions you're talking about, I can't really say. In general though, the
solution that we've gone with is to templatize functions that operate on
strings, and a function that's taking a string explicitly is most likely
storing it, in which case, it needs an explicit type, and using an immutable
value ensures that it doesn't change later.
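
To illustrate the point about pure functions (repeated is a made-up function
for this sketch, not the one you ran into): because the function below is
pure and its only array parameter is immutable, the compiler knows that the
char[] it returns can't be referenced anywhere else, so the result can be
assigned to either char[] or string without a cast or to!string.

```d
// Hypothetical example: builds and returns a freshly allocated char[].
char[] repeated(string s, size_t n) pure
{
    char[] result;
    foreach (_; 0 .. n)
        result ~= s;
    return result;
}

unittest
{
    // The unique result of the pure call converts implicitly to string.
    string a = repeated("ab", 3);
    assert(a == "ababab");

    // The same function also serves callers that want a mutable array.
    char[] b = repeated("ab", 3);
    b[0] = 'x';
    assert(b == "xbabab");
}
```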

If you want better insight into what the functions you're referring to do
and why, then you'll need to be specific about which ones you're talking
about.

In any case, the general approach that Phobos takes is to operate on ranges
of characters and only occasionally use arrays specifically - except in
cases where the value needs to be stored, in which case, string is typically
what's used. It used to be that explicit strings were used more, but we've
been moving to using ranges as much as possible, so actually seeing string
in Phobos should be fairly rare and getting rarer.

On a side note, I'd strongly argue against using "in" on function arguments
that aren't delegates. "in" is equivalent to const scope, and scope
currently does nothing for any types other than delegates - but it might
later, in which case, you could be forced to change your code, depending on
the exact semantics of scope for non-delegates. But it does _nothing_ now
with non-delegate types regardless, so it's a meaningless attribute that
might change meaning later, which makes using it a very bad idea IMHO. Just
use const if you want const and leave scope for delegates. I'd actually love
to see "in" deprecated, because it adds no value to the language (since it's
equivalent to const scope, which you can use explicitly), and it hides the
fact that scope is used.

- Jonathan M Davis


