toString vs. toUtf8

Regan Heath regan at netmail.co.nz
Tue Nov 20 02:04:34 PST 2007


Sean Kelly wrote:
> Gregor Richards wrote:
>> Sean Kelly wrote:
>>> Walter Bright wrote:
>>>> Phobos (and D) has undergone some evolution in the thinking about 
>>>> unicode strings, and it certainly has a few anachronisms in its 
>>>> names. But I think we've evolved to the point where going forward, 
>>>> we know what to do:
>>>>
>>>> char[] => string
>>>> wchar[] => wstring
>>>> dchar[] => dstring
>>>>
>>>> These are all unicode strings. Putting non-unicode encodings in 
>>>> them, even temporarily, should be discouraged. Non-unicode encodings 
>>>> should use ubyte[], ushort[], etc.
>>>
>>> This seems fair.  It would reinforce the idea that strings really do 
>>> use a common encoding format, and that foreign encodings are 
>>> relegated to a different form of transport.  Now if only toWString 
>>> didn't look so horrible :-)
>>
>> Worse looking than toUtf16? 
> 
> Yes.  I find the 'W' or 'D' in the middle of the name difficult to read. 
>   It literally hurts my eyes to look at that particular word. Something 
> about the single capital letter in the middle of the word as the 
> distinguishing characteristic, and the fact that the 'W' and 'D' do not 
> correlate to anything meaningful in English.  Didn't someone post 
> recently that the mind is trained to recognize words by their first and 
> last letter?  I tihnk its smoehtnig lkie taht.  With toUtf8, etc, I 
> basically just see the trailing '8' and I know what it is.  Trying to 
> pick out a 'W' or 'D' in the middle of a word is much more difficult, 
> particularly since it is next to another capital letter.

I agree. I think I'd prefer:

toString
toStringW
toStringD

or

toString
toString16
toString32

maybe with toStringA and/or toString8 as aliases for toString.
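
To make the second scheme concrete, here is a rough sketch. It assumes the std.utf conversion functions toUTF8, toUTF16 and toUTF32, and a compiler that accepts the `alias name = target;` syntax. The names toString8, toString16 and toString32 are hypothetical and only illustrate the suffix naming; they are not part of Phobos.

```d
import std.utf : toUTF8, toUTF16, toUTF32;

// Hypothetical suffix-style names, for illustration only.
// Each one simply forwards to the existing std.utf function.
alias toString8  = toUTF8;   // any UTF string -> string  (UTF-8)
alias toString16 = toUTF16;  // any UTF string -> wstring (UTF-16)
alias toString32 = toUTF32;  // any UTF string -> dstring (UTF-32)

void main()
{
    string  s = "héllo";         // 'é' takes two UTF-8 code units
    wstring w = toString16(s);
    dstring d = toString32(s);

    assert(s.length == 6);       // UTF-8 code units
    assert(w.length == 5);       // UTF-16 code units
    assert(d.length == 5);       // code points
    assert(toString8(d) == s);   // round trip back to UTF-8
}
```

With the width at the very end of the name, the three variants differ only in their last characters, which is the property Sean was asking for.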


There is some precedent: the Unicode (wide-character) versions of Windows 
functions have a trailing W and the ANSI versions a trailing A, e.g. 
CreateFileA and CreateFileW.

Regan


