Rename std.string.toStringz?

kenji hara k.hara.pg at gmail.com
Sat Jun 25 16:07:02 PDT 2011


2011/6/26 Jonathan M Davis <jmdavisProg at gmx.com>:
> On 2011-06-25 15:15, kenji hara wrote:
>> > 1. Keep toStringz as it is (as well as toUTF16z) and either consider
>> > stringz to be some sort of word unique to the D community or just admit
>> > that we're not going to camelcase it because it would break too much
>> > code to do so.
>>
>> ++vote, but not for all of it.
>>
>> Currently, the return type of toStringz is "zero-terminated UTF-8",
>> not "C-string".
>>
>> The word 'C-string' has multiple meanings (encodings): ASCII, Latin-1,
>> EUC, Shift-JIS (in Japan), UTF-8 (Linux?), UTF-16 (in Windows) ...
>> It depends on context.
>>
>> But, maybe, most uses of 'C-string' mean "zero-terminated UTF-8" or
>> "zero-terminated UTF-16".
>> Other encodings should be supported by another module (std.encoding?
>> Is it still alive?).
>>
>> My proposal:
>> 1. Add three aliased types.
>>     alias immutable(char)* stringz;     // useful in Linux
>>     alias immutable(wchar)* wstringz;   // useful in Windows
>>     alias immutable(dchar)* dstringz;
>> 2. Rename the current toStringz to toUTF8z, and add a deprecated alias
>> 'toStringz' to keep compatibility.
>>     (Adding toUTF32z to the std.string module would increase consistency.
>> A templated toUTFXXz family would be even better.)
>> 3. Make std.conv.to support conversion from any string type to the
>> stringz, wstringz, or dstringz type (by using the toUTFXXz family).
>>
>> The main point is that we should make the aliased type names de facto
>> type names, like string, wstring, and dstring. (Remember that those
>> three string types are in fact aliases.)
>>
>> We can treat the type name uint as 'unsigned int' because it is just
>> a built-in type name!
>>
>> User-defined type names should usually be camel-cased in D.
>> So let's treat these as built-in! Then we can remove the camel-cased
>> names from our choices.
>>
>> I think this proposal is useful, keeps compatibility, and is consistent.
>
> From this and related discussions, it seems that the current plan is to create
> a toUTFz function which is templated on the pointer type that you want
> returned (char*, const(char)*, immutable(char)*, wchar*, etc.) and which takes
> any string type. Then you can get a zero-terminated string with whatever level
> of constness you want from any string. std.conv.to would then be updated such
> that converting from any string to any character pointer would call toUTFz. We
> may or may not have toStringz, toWstringz, and toDstringz which use toUTFz.
>
> Regardless, I don't see much point in creating the types stringz, wstringz,
> and dstringz. There's nothing which guarantees that they're going to be zero-
> terminated, so they could be complete misnomers, depending on how they're
> used, and they're specifically immutable whereas you often need mutable zero-
> terminated strings. So, ultimately, I don't think that they'd add much. We
> _do_ need better conversion functions though.
>
> - Jonathan M Davis
>

> There's nothing which guarantees that they're going to be zero-
> terminated, so they could be complete misnomers, depending on how they're
> used,
Ah, you are right. I didn't think about that. I agree with you.
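
A minimal sketch of that problem, assuming the proposed stringz alias
(it is not part of Phobos) and a hypothetical helper named firstFive.
The type name promises a terminator, but the compiler checks nothing:
----
// The proposed alias; hypothetical, not in Phobos.
alias immutable(char)* stringz;

// Hypothetical helper: returns a pointer to the first five characters of s.
// Assumes s has at least six characters. The result is typed stringz even
// though no '\0' follows those five characters.
stringz firstFive(string s)
{
    return s[0 .. 5].ptr;
}

void main()
{
    stringz p = firstFive("hello world");
    // The character right after the slice is a space, not a terminator.
    assert(p[5] == ' ');
}
----
Passing such a pointer to a C function would read past the intended end,
which is why the alias name alone adds little safety.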

> to create
> a toUTFz function which is templated on the pointer type that you want
> returned (char*, const(char)*, immutable(char)*, wchar*, etc.)
I think the templated function toUTFz needs a default type inference
feature, like the following:
----
string s = "...";
auto sz = toUTFz(s);
static assert(is(typeof(sz) == immutable(char)*));
----
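
One way such a default could be layered on top of a pointer-templated
toUTFz is sketched below. It assumes toUTFz is templated on the pointer
type, as std.utf.toUTFz is in current Phobos; toUTFzDefault is a
hypothetical wrapper name, not an existing function:
----
import std.utf : toUTFz;

// Hypothetical wrapper (not in Phobos): picks immutable(C)* from the
// character type C of the argument, so the caller can omit the pointer type.
auto toUTFzDefault(S)(S s)
{
    import std.range.primitives : ElementEncodingType;
    import std.traits : Unqual;

    alias C = Unqual!(ElementEncodingType!S);
    return toUTFz!(immutable(C)*)(s);
}

void main()
{
    string s = "abc";
    auto sz = toUTFzDefault(s);
    static assert(is(typeof(sz) == immutable(char)*));

    wstring w = "abc"w;
    auto wz = toUTFzDefault(w);
    static assert(is(typeof(wz) == immutable(wchar)*));

    // The explicit form stays available when another constness is needed.
    auto cz = toUTFz!(const(char)*)(s);
    static assert(is(typeof(cz) == const(char)*));
}
----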

Thanks for your explanation.

Kenji

