toString vs. toUtf8

David B. Held dheld at codelogicconsulting.com
Tue Nov 20 01:00:01 PST 2007


Sean Kelly wrote:
> [...]
> I don't suppose there is anyone who does a lot of internationalization 
> programming who can comment on the utility of one convention vs. the 
> other?  I would love to hear some more practical concerns regarding the 
> naming convention for these functions.

I certainly don't qualify as someone who does a "lot" of i18n 
programming, but I do some.  Regardless, I would have to say that when I 
see a function called toUtfXX(), I think "Oh, that must convert a string 
from Latin-1 or something", rather than "Oh, that must give me the 
UTF-XX representation of an object".

Perl is a bad example because it didn't get righteous UTF-8 support 
until 5.8, but whenever you see "utf8" or similar in a Perl program, it 
almost invariably involves an encoding/decoding operation.  Perhaps it 
is worth noting that whenever you see "UTF-8" in Java, it most likely 
has to do with encoding/decoding.  And the same is true of C#, etc.
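
As an illustration of the Java case, here is a minimal sketch.  The 
class and method names are hypothetical, and StandardCharsets assumes 
Java 7 or later.  Note that "UTF_8" only shows up where bytes are being 
encoded or decoded, never as "give me the string form of this object":

```java
import java.nio.charset.StandardCharsets;

public class Utf8Demo {
    // Encode: a Java String (UTF-16 internally) becomes UTF-8 bytes.
    public static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode: UTF-8 bytes become a Java String again.
    public static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Transcode: interpret the bytes as Latin-1, then re-encode as UTF-8.
    public static byte[] latin1ToUtf8(byte[] latin1) {
        String text = new String(latin1, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 };      // e-acute in Latin-1
        byte[] utf8 = latin1ToUtf8(latin1);   // 0xC3 0xA9 in UTF-8
        System.out.println(utf8.length);      // prints 2
        System.out.println(decode(utf8).equals("\u00e9")); // prints true
    }
}
```

By contrast, the "string representation of an object" role is played by 
toString(), which says nothing about encoding in its name.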

So the precedent in most other languages appears to be that when 
"UTF-8" is spelled out explicitly, it is usually in a transcoding 
context.  I don't think toWString() is an ideal name, but it seems to 
have the right connotations to the naive programmer.

Dave


