toString vs. toUtf8
David B. Held
dheld at codelogicconsulting.com
Tue Nov 20 01:00:01 PST 2007
Sean Kelly wrote:
> [...]
> I don't suppose there is anyone who does a lot of internationalization
> programming who can comment on the utility of one convention vs. the
> other? I would love to hear some more practical concerns regarding the
> naming convention for these functions.
I certainly don't qualify as someone who does a "lot" of i18n
programming, but I do some. Regardless, I would have to say that when I
see a function called toUtfXX(), I think "Oh, that must convert a string
from Latin-1 or something", rather than "Oh, that must give me the
UTF-XX representation of an object".
Perl is a bad example because it didn't get righteous UTF-8 support
until 5.8, but whenever you see "utf8" or similar in a Perl program, it
almost invariably involves an encoding/decoding operation. Perhaps it
is worth noting that whenever you see "UTF-8" in Java, it most likely
has to do with encoding/decoding. And the same is true of C#, etc.
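
To make the Java case concrete, here is a minimal sketch of where "UTF-8"
typically shows up: at the boundary between a String and raw bytes. The
class and helper names (Utf8Transcoding, encode, decode) are made up for
illustration, and StandardCharsets assumes Java 7 or later; older code
would pass the charset name "UTF-8" as a string instead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of any library.
public class Utf8Transcoding {
    // Encoding: String (UTF-16 internally) to UTF-8 bytes.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decoding: UTF-8 bytes back to a String.
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "na\u00efve";           // 5 characters, one of them non-ASCII
        byte[] utf8 = encode(text);
        System.out.println(utf8.length);      // 6: U+00EF takes two bytes in UTF-8
        System.out.println(decode(utf8).equals(text)); // true: the round trip is lossless
    }
}
```

In both directions "UTF-8" names a conversion between representations,
which is the connotation a name like toUtf8() would carry over.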
So the precedent in most other languages appears to be that when
"UTF-8" is spelled out explicitly, it is usually in a transcoding
context. I don't think toWString() is an ideal name, but it seems to
have the right connotations to the naive programmer.
Dave