[Design] return char[] or string?

Sun Jul 29 13:54:48 PDT 2007

While I haven't got into using D 2.x, I've already begun thinking about 
making libraries compatible with it.  On this basis, a design decision to 
consider is whether functions that return a string should return it as a 
char[] or a const(char)[].  (I use "string" with its general meaning, and 
"const(char)[]" to refer to that specific type.  For 1.x compatibility, 
I'd have to write the "string" alias wherever I want const(char)[].)

Obviously, a function that takes a string as a parameter has to take in a 
const(char)[], to be able to accept a string literal or otherwise a constant 
string.  But what about the return type?
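To make the parameter side concrete, here is a minimal D 2.x sketch.  The 
function countSpaces is a made-up example, not something from Phobos or from 
my library; it only shows that a const(char)[] parameter accepts both a 
string literal and a mutable char[].

```d
// Hypothetical helper (not a Phobos function): counts spaces
// without ever modifying its argument.
size_t countSpaces(const(char)[] s)
{
    size_t n = 0;
    foreach (c; s)
    {
        if (c == ' ')
            ++n;
    }
    return n;
}

void main()
{
    char[] buf = "a b c".dup;          // a mutable copy of a literal

    // Both calls compile: a literal and a mutable char[] each
    // convert implicitly to const(char)[].
    assert(countSpaces("x y") == 1);
    assert(countSpaces(buf) == 2);
}
```

Had the parameter been declared as char[], the call with the literal would 
be rejected under 2.x.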

In the 2.x version of std.string, the string-returning functions all return 
const(char)[] rather than char[].  This is necessary in most cases because of 
the copy-on-write policy: a function may hand back its argument unchanged 
when there is nothing to modify.

But otherwise, it seems that both have their pros and cons.

There seem to be two cases to consider: libraries targeted specifically at D 
2.x, and libraries that (attempt to) support both 1.x and 2.x.  At the 
moment, it's the latter that really matters.

Let's see.  The string-returning functions in my library more or less fall 
into these categories:
(a) functions that build a string in a local variable, which is then 
returned
(b) functions that return a copy of a member variable
(c) property setters and the like that simply pass the argument through
(d) functions that call a function in Phobos and return the result

In the case of (a), there is no obvious benefit to returning a const(char)[] 
rather than a char[].

Many of the cases of (b) are property getters.  If such getters return a 
const(char)[], they no longer need to copy the member variable, though 
versioning would be needed to implement this without causing havoc under 
1.x.  The alternative, leaving them returning char[], leads to inconsistency 
with (c), which would have to return const(char)[].
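As a rough illustration of (b) and (c) together, here is a D 2.x-only 
sketch.  The class Widget and its name property are invented for the 
example, and the 1.x versioning mentioned above is deliberately left out.

```d
// Hypothetical class, for illustration only (D 2.x syntax).
class Widget
{
    private char[] name_;

    // (b) Getter: returns a read-only view of the member, so no
    // defensive copy is needed.
    const(char)[] name()
    {
        return name_;
    }

    // (c) Setter: stores its own mutable copy, then passes the
    // argument through unchanged.  The argument is const(char)[],
    // so the return type has to be const(char)[] as well.
    const(char)[] name(const(char)[] value)
    {
        name_ = value.dup;
        return value;
    }
}

void main()
{
    auto w = new Widget;
    assert(w.name("gadget") == "gadget");

    const(char)[] view = w.name();
    assert(view == "gadget");
    // view[0] = 'G';   // would not compile: the view is read-only
}
```

With a char[] return type instead, the getter would need to return 
name_.dup to stop callers from altering the member behind the object's back.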

That leaves (d), to which the obvious answer is to return whatever type the 
Phobos function returns.

On one hand, if the string is generated on the fly, and so altering it would 
not cause a problem, it seems wasteful to return a const(char)[] only for 
the caller to have to .dup it if it wants to modify it.
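The cost in question can be sketched as follows, again assuming D 2.x.  
Both functions are hypothetical category (a) examples that build a fresh 
string; they differ only in the declared return type.

```d
// Hypothetical (a)-style functions: each builds a new string locally.
char[] repeatMutable(char c, size_t n)
{
    auto r = new char[n];
    r[] = c;                 // fill every element with c
    return r;
}

const(char)[] repeatConst(char c, size_t n)
{
    auto r = new char[n];
    r[] = c;
    return r;                // char[] converts implicitly to const(char)[]
}

void main()
{
    // Mutable result: the caller can edit it in place.
    char[] a = repeatMutable('-', 3);
    a[0] = '+';

    // Const result: the caller must .dup first, which allocates a
    // second array even though nobody else holds the first one.
    char[] b = repeatConst('-', 3).dup;
    b[0] = '+';

    assert(a == "+--");
    assert(b == "+--");
}
```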

On the other hand, from the library user's point of view, it can be seen as 
a confusing inconsistency if some functions return char[] and others 
const(char)[], when no difference in the semantics of what's returned 
accounts for this.  It also borders on breaking the encapsulation principle, 
whereby internal implementation details should not be exposed in my 
library's API.

What do you people think?

Stewart.