Rename std.string.toStringz?

Mon Jun 20 08:51:40 PDT 2011

On 2011-06-20 06:23, Andrei Alexandrescu wrote:
> On 6/20/11 7:23 AM, Steven Schveighoffer wrote:
> > On Sun, 19 Jun 2011 09:20:17 -0400, Andrei Alexandrescu
> > <SeeWebsiteForEmail at erdani.org> wrote:
> >> On 6/18/11 5:42 PM, Jonathan M Davis wrote:
> >>> On 2011-06-18 06:35, Andrei Alexandrescu wrote:
> >>>> On 6/18/11 4:59 AM, Jonathan M Davis wrote:
> >>>>> I'll look at renaming toUTF16z to toWStringz to match toStringz
> >>>>> (as was suggested by a couple of people in this thread)
> >>>> 
> >>>> That should be a template toUTFz that takes either char*, wchar*, or
> >>>> dchar*.
> >>> 
> >>> A good point. Are you arguing that toStringz should be replaced by
> >>> such a construct? Or that it should simply exist in addition to
> >>> toStringz? Also, we _could_ make it so that such a template would
> >>> take the mutability of the pointer as well (e.g. toUTF!(char*)(str),
> >>> toUTF!(const(char)*), etc.), which would allow it to be used in cases
> >>> where you actually want a mutable string (which toStringz doesn't do).
> >>> 
> >>> - Jonathan M Davis
> >> 
> >> I think that's a good idea, which would address that StackOverflow
> >> problem too.
> >> 
> >> The way I'd probably suggest we go about it is as a universal
> >> transcoder. Define std.conv.to with strings of any width and
> >> qualification as input and with pointers to characters of any width as
> >> output. It is implied that the conversion entails adding a terminating
> >> zero.
> >> 
> >> string a = "hello";
> >> auto p = to!(wchar*)(a); // change width and qualifier
> > 
> > I don't like relying on an implication that a zero character is added. A
> > char * pointer may or may not be zero terminated (that is one of the
> > issues with C), so you can't really designate a type to mean "zero
> > terminated".
> 
> Technically you're right. Yet I think it's pretty widespread that a sole
> char* means a zero-terminated string.

I don't know. I can see it being argued either way. I don't know why anyone 
would use a char*, wchar*, dchar*, etc. except for passing to C
functions. But the lack of explicitness could be a problem. And there would be 
no guarantee that all character pointers are zero-terminated strings, which 
could cause problems if it's assumed that they are. So, I don't know.

I suppose that we could just go the route of doing both. std.conv.to could
call toUTFz in the case of conversions to char*, wchar*, dchar*, etc. It's not
exactly ideal, but then you can be either explicit or implicit. But we
generally try to avoid doing that sort of thing...
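
To make the explicit route concrete, here is a minimal sketch. It assumes a
toUTFz template of the shape discussed above, taking the target pointer type
(including its mutability) as the template argument; current Phobos exposes
one as std.utf.toUTFz, but treat the exact spelling as an assumption in the
context of this thread. The slice is there only to show a string whose data
is not followed by a zero in memory.

```d
import core.stdc.string : strlen;
import std.string : toStringz;
import std.utf : toUTFz;

void main()
{
    string s = "hello";
    string slice = s[0 .. 3]; // "hel": the next byte in memory is 'l', not '\0'

    // Explicit: the name says a zero terminator is appended.
    immutable(char)* z = toStringz(slice);
    assert(strlen(z) == 3);

    // Assumed toUTFz shape: change the character width and still get a terminator.
    const(wchar)* w = toUTFz!(const(wchar)*)(slice);
    assert(w[0] == 'h' && w[3] == 0);

    // A mutable pointer, which toStringz does not provide.
    char* m = toUTFz!(char*)(slice);
    m[0] = 'H';
    assert(strlen(m) == 3);
}
```

Passing slice.ptr straight to a C function would not be safe here, which is
the case where an implicit "char* means zero-terminated" rule could mislead.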

- Jonathan M Davis