D2 toStringz Return Type
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Fri Nov 7 13:26:14 PST 2008
Steven Schveighoffer wrote:
> "Andrei Alexandrescu" wrote
>> Steven Schveighoffer wrote:
>>> "Andrei Alexandrescu" wrote
>>>> Steven Schveighoffer wrote:
>>>>> "Andrei Alexandrescu" wrote
>>>>>> Mike Parker wrote:
>>>>>>> I'm curious as to why toStringz in D2 returns const(char)* instead of
>>>>>>> just a plain char*. Considering that the primary use case for the
>>>>>>> function is interfacing with C code, it seems rather pointless to
>>>>>>> return a const(char)*.
>>>>>> We want to leave the opportunity open to not duplicate the actual
>>>>>> memory underneath the string object. (Right now that opportunity is
>>>>>> not effected.)
>>>>> My recommendation -- have 2 functions. One which always copies (and
>>>>> returns char *), and one which does not.
>>>>>
>>>>> This at least leaves a safe alternative for people who have headers
>>>>> that aren't properly constified, and don't want to go through the
>>>>> hassle of looking it up themselves. Also good for those C functions
>>>>> which actually require a mutable char *, since D2 strings are mostly
>>>>> invariant.
>>>> You can't quite do that because dynamic conditions establish whether
>>>> it's safe to avoid copying or not.
>>> I can see how you interpreted it this way.
>>>
>>> What I meant was one is the toStringz as it is today, which might copy
>>> and might leave it in-place. This can be used to call C functions that
>>> take a const char *. The other function will *always* copy, and will
>>> return a mutable char *. This is for when you don't care to look at the
>>> function yourself (assuming the author got it correct), or the case where
>>> the C function actually does mutate the argument.
>>>
>>> If the C function does actually require a mutable argument, you are
>>> forced to do an extra dup for no reason with today's toStringz.
>>>
>>> -Steve
>> I see. So:
>>
>> const(char)* toStringzMayOrMayNotCopy(in char[]);
>> char* toStringzWillAlwaysCopy(in char[]);
>>
>> Providing writable zero-terminated strings is a sure recipe for disaster
>> (see the debates around sprintf, strcpy etc.). I think the need for
>> such things is rare and at best avoided entirely by the standard
>> library. If you so wish, you can always use malloc by hand.
>
> Using zero terminated strings, even const ones, is a recipe for disaster.
> Yet, there it is.
Well, writable ones are even more of a disaster. Reading random
characters can cause the program to fail but does not corrupt its state
arbitrarily. So it's good to limit the damage. The C and C++ communities
have much more beef with writable stringz's than read-only ones.
> And it's making me do 2 duplications.
Not at all.
string s = ...;
auto sz = cast(char*) malloc(s.length + 1);
sz[0 .. s.length] = s[];
sz[s.length] = 0;
If you use it often in an application, put it in a function. I'm not
putting it in the standard library.
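A minimal sketch of what such an application-level function might look like, assuming a current D compiler where malloc lives in core.stdc.stdlib. The name toMutableStringz is hypothetical and not part of Phobos; the null check and the free call are additions for illustration that the snippet above does not include.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : strlen;

// Hypothetical helper (not a Phobos function): always copies the input into
// a malloc'd, zero-terminated, writable buffer. The caller owns the result
// and must release it with free().
char* toMutableStringz(const(char)[] s)
{
    auto sz = cast(char*) malloc(s.length + 1);
    if (sz is null)
        return null;              // allocation failed; the caller must check
    sz[0 .. s.length] = s[];      // copy the characters
    sz[s.length] = '\0';          // append the terminator
    return sz;
}

void main()
{
    string s = "hello";
    char* sz = toMutableStringz(s);
    assert(sz !is null);
    scope (exit) free(sz);        // the GC does not manage this memory

    sz[0] = 'j';                  // writing is fine: the buffer is a private copy
    assert(strlen(sz) == s.length);
    assert(s == "hello");         // the original string is untouched
}
```

Because the buffer comes from malloc, its lifetime is entirely the caller's responsibility, which is part of why a function like this is easier to justify inside one application than in a general-purpose library.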
> The reality is that as soon as you cross the boundary from D to C, you have
> lost all the safety benefits that D provides, even if the signature is
> const.
I disagree. You lost automatic checking from the D side when interfacing
with C, but if a C function is reliably not mutating its arguments its D
signature is better tagged as const. It's a net win.
> The reality is, people are still going to call these functions,
> either with an extra dup (which buys you nothing in safety), or by editing
> the bindings to be const (which makes it even more unsafe). The reality is,
> most of these calls are pretty innocuous. People aren't using sprintf or
> strcpy, they are using C libraries that do things that D doesn't already do.
> Most of these are just using char * as a way to pass const strings, it isn't
> too much to ask for a function that complies.
Maybe I got lucky, but I haven't run across any C libraries that don't
use const in signatures. Anyhow the point is superfluous as you, not
them, get to write the D interfacing signatures. Const conveys a world
of information. True, that is not 100% enforceable in D and in C alike,
as a cast could always ruin things. But it's good if the signature
reflects a guarantee that is reasonable and also reasonably easy to observe.
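As an illustration of a hand-written interfacing signature, here is a small sketch that declares C's strlen with a const parameter and passes it the result of toStringz. The declaration is written out by hand only for demonstration (druntime already ships one in core.stdc.string), and the variable names are made up.

```d
import std.string : toStringz;

// Hand-written binding for illustration. The const in the signature records
// that the C function only reads through the pointer.
extern (C) size_t strlen(const(char)* s);

void main()
{
    string name = "Digitalmars-d";

    // The result of toStringz converts implicitly to const(char)*.
    const(char)* p = toStringz(name);
    assert(strlen(p) == name.length);

    // p[0] = 'x';   // rejected by the compiler: cannot write through const
}
```

If the binding were declared with a plain char* parameter instead, the call would need a cast or a writable copy, even though strlen never modifies its argument.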
> But you probably won't add it. That's ok, I don't use Phobos anyways. I'll
> be sure to add an appropriate function to Tango while porting it to D2.
You may want to rethink before putting dangerous functions in
widely-used libraries. Returning a writable zero-terminated char* is as
dangerous as it gets, and fosters bad coding style too.
Andrei