Questions about Unicode, particularly Japanese

Tue Jun 8 13:18:54 PDT 2010

Sorry if this shows up as a top post again in your mail clients. I'll try to figure out what's going on later today.

> 
> 1. Am I correct in all of that?

Yes. That's why I was saying that UTF-16 is *NOT* a lousy encoding; it really depends on the situation. The advantage is not only space but also processing speed, even for letters that take 2 bytes in UTF-8 (Greek, Cyrillic, etc.): in UTF-16 each such letter is a single 16-bit code unit that can be read in one memory access, whereas UTF-8 has to be decoded byte by byte. Also consider that it's easier (and cheaper) to convert from ANSI (a single-byte code page) to UTF-16, since a direct 256-entry lookup table can be built. For UTF-8, you have to do some shifts and masks to build a multi-byte sequence for every non-ASCII letter (even accented Latin ones).
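To make the conversion point concrete, here is a rough Python sketch of both paths. It is only an illustration under a few assumptions: Windows-1251 stands in for "ANSI", the table and function names (CP1251_TO_UTF16, single_byte_to_utf16, encode_utf8_bmp, single_byte_to_utf8) are made up for this example, and the UTF-8 encoder only handles code points below U+10000, which covers every character in that code page.

```python
# Hypothetical lookup table: byte value -> UTF-16 code unit for Windows-1251.
# Built once with Python's own codec; unassigned bytes map to U+FFFD.
CP1251_TO_UTF16 = [
    ord(bytes([b]).decode("cp1251", errors="replace")) for b in range(256)
]


def single_byte_to_utf16(data: bytes) -> list:
    """ANSI -> UTF-16: one table lookup per byte, one 16-bit unit per letter."""
    return [CP1251_TO_UTF16[b] for b in data]


def encode_utf8_bmp(cp: int) -> bytes:
    """Encode one code point below U+10000 as UTF-8 using shifts and masks."""
    if cp < 0x80:
        # ASCII: a single byte, no work needed.
        return bytes([cp])
    if cp < 0x800:
        # Two-byte sequence: 110xxxxx 10xxxxxx (Greek, Cyrillic, accented Latin).
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    # Three-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx.
    return bytes([
        0xE0 | (cp >> 12),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def single_byte_to_utf8(data: bytes) -> bytes:
    """ANSI -> UTF-8: the same lookup, plus the shift/mask step for each letter."""
    return b"".join(encode_utf8_bmp(CP1251_TO_UTF16[b]) for b in data)
```

The UTF-16 path stops at the table lookup, while the UTF-8 path needs the extra branch and bit operations for anything outside ASCII. A real converter could of course precompute the UTF-8 bytes per input byte as well, so the difference is in table size and output handling rather than in anything fundamental.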

Which encoding is better depends on your taste, language, applications, etc. I was simply pointing out that it's quite nice to have a universal 'tchar' type. My argument was never about which encoding is better; that's hard to say in general. Besides, many people still use ANSI and not UTF-8.