[Design] return char[] or string?

Sun Jul 29 14:15:52 PDT 2007

Stewart Gordon wrote:
> While I haven't got into using D 2.x, I've already begun thinking about 
> making libraries compatible with it.  On this basis, a design decision 
> to consider is whether functions that return a string should return it 
> as a char[] or a const(char)[].  (I use "string" with its general 
> meaning, and "const(char)[]" to refer to that specific type.  Obviously 
> for 1.0 compatibility, I'd have to use the "string" alias wherever I 
> want const(char)[].)
> 
> Obviously, a function that takes a string as a parameter has to take in 
> a const(char)[], to be able to accept a string literal or otherwise a 
> constant string.  But what about the return type?
> 
> Looking through the 2.x version of std.string, they all return 
> const(char)[] rather than char[].  (Except for those that return 
> something else such as a number.)  This is necessary in most cases 
> because of the copy-on-write policy.
> 
> But otherwise, it seems that both have their pros and cons.
> 
> There seem to be two cases to consider: libraries targeted specifically 
> at D 2.x, and libraries that (attempt to) support both 1.x and 2.x.  At 
> the moment, it's the latter that really matters.
> 
> Let's see.  The string-returning functions in my library more or less 
> fall into these categories:
> (a) functions that build a string in a local variable, which is then 
> returned
> (b) functions that return a copy of a member variable
> (c) property setters and the like that simply pass the argument through
> (d) functions that call a function in Phobos and return the result
> 
> In the case of (a), there is no obvious benefit to returning a 
> const(char)[] rather than a char[].
> 
> Many of the cases of (b) are property getters.  If we have such things 
> returning a const(char)[], then the getter no longer needs to copy the 
> member variable.  Though versioning would be needed to implement this 
> behaviour without causing havoc under 1.x.  The alternative, leaving 
> them returning char[], leads to inconsistency with (c), which would have 
> to return const(char)[].
> 
> That leaves (d), to which the obvious answer is to return whatever type 
> the Phobos function returns.
> 
> On one hand, if the string is generated on the fly, and so altering it 
> would not cause a problem, it seems wasteful to return a const(char)[] 
> only for the caller to have to .dup it if it wants to modify it.
> 
> On the other hand, from the library user's point of view, it can be seen 
> as a confusing inconsistency if some functions return char[] and others 
> const(char)[], when no difference in the semantics of what's returned 
> accounts for this.  It also borders on breaking the encapsulation 
> principle, whereby internal implementation details should not be exposed 
> in my library's API.
> 
> What do you people think?
> 
> Stewart.

It's a question of ownership. If the function is returning a new string, 
and giving ownership of that string to the caller, then it should return 
a char[]. If the function is returning a string which the caller is 
merely borrowing, it should return a const(char)[]. In most cases, 
thinking of things this way causes the return type to be obvious.

And, of course, a char[] converts implicitly to a const(char)[], so a 
caller that only needs read access can still accept an owned string.
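
To make the ownership rule concrete, here is a minimal sketch, assuming 
a reasonably current D 2 compiler. The function names makeGreeting and 
firstWord are hypothetical, made up for illustration; they are not from 
Stewart's library or from Phobos.

```d
import std.stdio;

// Hypothetical: builds a fresh string and hands ownership to the caller,
// so the return type is mutable char[].
char[] makeGreeting(const(char)[] name)
{
    char[] result = "Hello, ".dup;   // .dup yields a mutable copy of the literal
    result ~= name;
    return result;
}

// Hypothetical: returns a slice of its argument, which the caller is
// merely borrowing, so the return type is const(char)[].
const(char)[] firstWord(const(char)[] text)
{
    foreach (i, c; text)
        if (c == ' ')
            return text[0 .. i];
    return text;
}

void main()
{
    char[] owned = makeGreeting("world");
    owned[0] = 'J';                        // fine: the caller owns this buffer

    const(char)[] view = owned;            // implicit char[] -> const(char)[]
    const(char)[] word = firstWord(view);
    // word[0] = 'X';                      // would not compile: read-only view

    writeln(owned, " / ", word);
}
```

The conversion only goes one way without a copy: to get from a 
const(char)[] back to a char[], the caller has to .dup it.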

In (a), the function is returning a new string to the caller; it should 
return char[].

(b) should usually return const(char)[], unless of course you want the 
caller to mutate the string. If you're going to the trouble of 
wrapping a member with a getter/setter, then that probably means you 
don't want the user messing with it directly.
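
A rough sketch of what (b) might look like, again assuming a current D 2 
compiler. The class Person and its members name_, name and nameCopy are 
hypothetical and only illustrate the two choices.

```d
// Hypothetical class, for illustration only.
class Person
{
    private char[] name_;

    this(const(char)[] initial)
    {
        name_ = initial.dup;   // private copy, so the object owns its data
    }

    // Borrowed view: no copy is made, and the caller cannot write through it.
    const(char)[] name()
    {
        return name_;
    }

    // Owning copy: the caller may modify the result freely.
    char[] nameCopy()
    {
        return name_.dup;
    }
}

void main()
{
    auto p = new Person("Stewart");

    const(char)[] borrowed = p.name;
    // borrowed[0] = 's';      // would not compile: the member stays protected

    char[] mine = p.nameCopy();
    mine[0] = 's';             // changes only the caller's copy

    assert(p.name == "Stewart");
    assert(mine == "stewart");
}
```

The const getter avoids the allocation on every call, and a caller who 
really does want a mutable string pays for the copy explicitly.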

The other cases are less clear, and will vary from function to function.

-- 
Kirk McDonald
http://kirkmcdonald.blogspot.com
Pyd: Connecting D and Python
http://pyd.dsource.org