Rename std.string.toStringz?

Jonathan M Davis jmdavisProg at gmx.com
Sat Jun 25 15:41:43 PDT 2011


On 2011-06-25 15:15, kenji hara wrote:
> > 1. Keep toStringz as it is (as well as toUTF16z) and either consider
> > stringz to be some sort of word unique to the D community or just admit
> > that we're not going to camelcase it because it would break too much
> > code to do so.
> 
> ++vote, but not all.
> 
> Currently, the return type of toStringz is "zero-terminated UTF-8",
> not "C-string".
> 
> The 'C-string' word has multiple meanings=encodings. ASCII, Latin-1,
> EUC, Shift-JIS (in Japan), UTF-8 (Linux?), UTF-16 (in Windows) ...
> It depends on context.
> 
> But in most cases, 'C-string' probably means "zero-terminated UTF-8" or
> "zero-terminated UTF-16".
> Other encodings should be supported by another module (std.encoding?
> Is it living?).
> 
> My proposal:
> 1. Add three aliased types.
>     alias immutable(char)* stringz;       // useful in Linux
>     alias immutable(wchar)* wstringz;  // useful in Windows
>     alias immutable(dchar)* dstringz;   //
> 2. Rename current toStringz to toUTF8z, and add deprecated aliasing
> 'toStringz' to keep compatibility.
>     (Adding toUTF32z in std.string module will increase consistency.
> A templated toUTFXXz family would be even better.)
> 3. std.conv.to support conversion from 'any string type' to
> (|wd)stringz type (by using toUTFXXz family).
> 
> The main point is we should make the aliased type names as 'De facto'
> type names, like string, wstring, dstring. (Remember the three string
> types are aliased type in fact.)
> 
> We can treat the type name uint as 'unsigned int'. Because it is just
> built-in type name!
> 
> User-defined type names should usually be camel cased in D.
> Then, let's make them built-in! Therefore we can remove camel cased
> names from our choices.
> 
> I think this proposal is useful, keeps compatibility, and is consistent.

From this and related discussions, it seems that the current plan is to create 
a toUTFz function which is templated on the pointer type that you want 
returned (char*, const(char)*, immutable(char)*, wchar*, etc.) and which takes 
any string type. Then you can get a zero-terminated string with whatever level 
of constness you want from any string. std.conv.to would then be updated such 
that converting from any string to any character pointer would call toUTFz. We 
may or may not have toStringz, toWstringz, and toDstringz which use toUTFz.
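
As a rough sketch of what that looks like from the caller's side, the 
following uses std.utf.toUTFz as it appears in later Phobos releases. The exact 
signature was not settled at the time of this discussion, so treat it as an 
illustration of the idea rather than the agreed-upon design.

```d
import std.utf : toUTFz;
import core.stdc.string : strlen;

void main()
{
    string s = "hello";

    // The template argument picks the pointer type (and so the constness
    // and the encoding) of the zero-terminated result.
    const(char)* cp = toUTFz!(const(char)*)(s);
    char* mp = toUTFz!(char*)(s);
    immutable(wchar)* wp = toUTFz!(immutable(wchar)*)(s);

    // Each result is zero-terminated.
    assert(strlen(cp) == 5);
    assert(cp[5] == '\0');
    assert(wp[5] == 0);

    // The mutable result is a copy, so writing to it leaves s untouched.
    mp[0] = 'J';
    assert(s == "hello");
}
```

The point of templating on the pointer type is that one function covers every 
combination of character width and constness instead of needing a separate 
name for each.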

Regardless, I don't see much point in creating the types stringz, wstringz, 
and dstringz. There's nothing which guarantees that they're going to be zero-
terminated, so they could be complete misnomers, depending on how they're 
used, and they're specifically immutable whereas you often need mutable zero-
terminated strings. So, ultimately, I don't think that they'd add much. We 
_do_ need better conversion functions though.
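
To illustrate the objection, here is a small sketch using the stringz alias 
from the proposal. Nothing here comes from Phobos; the alias is declared 
locally purely for the example.

```d
// The alias as proposed; it is just another name for immutable(char)*.
alias immutable(char)* stringz;

// The alias adds no type of its own, and it cannot convert to a mutable
// pointer, which many C APIs require.
static assert(is(stringz == immutable(char)*));
static assert(!is(stringz : char*));

void main()
{
    string s = "hello world";
    string slice = s[0 .. 5];   // "hello", with no terminator after it

    // This compiles without complaint even though the data is not
    // zero-terminated at the end of the slice.
    stringz p = slice.ptr;
    assert(p[5] == ' ');        // the next char is a space, not '\0'
}
```

So the name promises zero-termination, but the type system does nothing to 
enforce it.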

- Jonathan M Davis

