toStringz and toUTFz potentially unsafe
Jonathan M Davis
jmdavisProg at gmx.com
Sun Jul 24 19:09:17 PDT 2011
On Sunday 24 July 2011 21:57:34 Johann MacDonagh wrote:
> On 7/24/2011 9:06 PM, Jonathan M Davis wrote:
> > On Sunday 24 July 2011 17:56:04 Jonathan M Davis wrote:
> > The real question is what to do with to!(char*)(str). The plan is to
> > make it call toUTFz, but at that point, the warning about toUTFz is not
> > as obvious (though it can re-iterate the warning or point you to the
> > toUTFz documentation to read it). Also, since you already have toUTFz,
> > calling to!(char*) is kind of pointless. So, I think that there's a
> > good argument for forcing to!(char*) to append '\0' instead of checking
> > one past the end. Then when you want a guarantee that the '\0' isn't
> > going to change, you can use to!(char*), and if you want the
> > efficiency, you can call toUTFz. But it is debatable whether we should
> > do that or just have to!(char*) call toUTFz in all cases. I'm leaning
> > towards making it always copy though.
> >
> > - Jonathan M Davis
>
> In that case, maybe we should implement @schveiguy's suggestion.
>
> immutable(char)* toStringz(string s, bool unsafe = true) pure nothrow
>
> That way the user can decide whether to take the optimization risk (or
> if they know the string is on the stack, etc...). In addition, always
> copying is wasteful. We're usually able to append a NULL to a dynamic
> array without relocation / copying.

If you always append a '\0', then there's no point to toStringz at all. So,
sure, we _could_ add the unsafe parameter like that, but I seriously question
the value of it. The primary value of toStringz is the optimization: it looks
one past the end of the string and, if a '\0' is already there, avoids
reallocating at all.
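
To make the trade-off concrete, here is a rough sketch of what a function with
that signature might look like. The names toStringzSketch and cLength are made
up for illustration, the attributes are left off, and this is not what Phobos
actually does (the real function has to be more careful about when it is
allowed to look past the end):

```d
import core.stdc.string : strlen;

// Hypothetical sketch of the proposed signature; not the Phobos implementation.
immutable(char)* toStringzSketch(string s, bool unsafe = true)
{
    if (unsafe && s.length != 0)
    {
        // Peek one past the end of the slice. This read is outside the
        // slice's bounds, and whoever owns that byte may overwrite it later,
        // which is exactly the risk being discussed.
        immutable(char)* end = s.ptr + s.length;
        if (*end == '\0')
            return s.ptr; // already terminated: no copy, no allocation
    }

    // Safe path: build a new array that is guaranteed to end in '\0'.
    return (s ~ '\0').ptr;
}

// Example use: both paths yield a C string of the same length.
size_t cLength(string s, bool unsafe)
{
    return strlen(toStringzSketch(s, unsafe));
}
```

With a string literal, which D always follows with a '\0', the unsafe path
returns the original pointer and the safe path returns a fresh copy.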

And yes, you'd use ~=, which _might_ copy rather than forcing a copy every
time, but the cases where you could have just checked one past the end of the
array and done nothing because it was '\0' are generally going to be the cases
where ~= has to reallocate. So, in reality, you're pretty much going to copy
every time that you could have avoided the copy if you had gone with true for
unsafe rather than false.
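
A small sketch of that case, assuming a slice that is followed by a '\0'
inside a larger array (appendMovedData is a made-up helper, not a library
function):

```d
// Hypothetical helper: appends '\0' with ~= and reports whether the
// slice's data had to be moved to a new block to do so.
bool appendMovedData(ref char[] buf)
{
    auto before = buf.ptr;
    buf ~= '\0';
    return buf.ptr !is before;
}

// Example: the byte one past the end of part is already '\0', so peeking
// would have avoided any copy. Appending cannot grow part in place, because
// that would stomp on the rest of big, so ~= reallocates.
bool demo()
{
    char[] big = "abc\0def".dup;
    char[] part = big[0 .. 3];
    return appendMovedData(part) && big[3] == '\0' && part.length == 4;
}
```

Here demo returns true: the append moved the data even though the terminator
was already sitting one past the end of the slice.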
- Jonathan M Davis