toString vs. toUtf8

Mon Nov 19 17:04:00 PST 2007

Gregor Richards wrote:
> Sean Kelly wrote:
>> Walter Bright wrote:
>>> Phobos (and D) has undergone some evolution in the thinking about 
>>> unicode strings, and it certainly has a few anachronisms in its 
>>> names. But I think we've evolved to the point where going forward, we 
>>> know what to do:
>>>
>>> char[] => string
>>> wchar[] => wstring
>>> dchar[] => dstring
>>>
>>> These are all unicode strings. Putting non-unicode encodings in them, 
>>> even temporarily, should be discouraged. Non-unicode encodings should 
>>> use ubyte[], ushort[], etc.
>>
>> This seems fair.  It would reinforce the idea that strings really do 
>> use a common encoding format, and that foreign encodings are relegated 
>> to a different form of transport.  Now if only toWString didn't look 
>> so horrible :-)
> 
> Worse looking than toUtf16? 

Yes.  I find the 'W' or 'D' in the middle of the name difficult to read.
It literally hurts my eyes to look at that particular word.
Something about the single capital letter in the middle of the word as 
the distinguishing characteristic, and the fact that the 'W' and 'D' do 
not correlate to anything meaningful in English.  Didn't someone post 
recently that the mind is trained to recognize words by their first and 
last letters?  I tihnk its smoehtnig lkie taht.  With toUtf8, etc., I
basically just see the trailing '8' and I know what it is.  Trying to 
pick out a 'W' or 'D' in the middle of a word is much more difficult, 
particularly since it is next to another capital letter.

> Would you prefer if int => int32, long => 
> int64, short => int16, byte => int8, real => float80 (portability be 
> damned), double => float64, float => float32? They'd certainly be more 
> obvious, but I can tell you I'd go crazy.

No, but I feel that this is an invalid comparison.  We are talking about 
function names concerning type transformations, not type names.

Sean