Why UTF-8/16 character encodings?

Walter Bright newshound2 at digitalmars.com
Sat May 25 01:42:20 PDT 2013


On 5/25/2013 12:33 AM, Joakim wrote:
> At what cost?  Most programmers completely punt on unicode, because they just
> don't want to deal with the complexity. Perhaps you can deal with it and don't
> mind the performance loss, but I suspect you're in the minority.

I think you stand alone in your desire to return to code pages. I have years of 
experience with code pages and the unfixable misery they produce. This has 
disappeared with Unicode. I find your arguments unpersuasive when stacked 
against my experience. And yes, I have made a living writing high-performance 
code that deals with characters, and you are quite off base with claims that 
UTF-8 has inevitably bad performance - though there is inefficient code in 
Phobos for it, to be sure.

My grandfather wrote a book that consists of mixed German, French, and Latin 
words, using special characters unique to those languages. Another failing of 
code pages is that they fail miserably at any such mixed-language text. Unicode 
handles it with aplomb.
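
To make the mixed-language point concrete, here is a minimal sketch in Python 
(not from the original post). The sample string and the helper name can_encode 
are made up for illustration, and the code pages tried are just a few common 
single-byte ones. It checks which encodings can represent the whole string.

```python
# Hypothetical sample: a German, a French, and a Latin word in one string.
SAMPLE = "Schütze, cœur, amāre"


def can_encode(text, codec):
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Try a few single-byte code pages, then UTF-8.
RESULTS = {
    codec: can_encode(SAMPLE, codec)
    for codec in ("latin-1", "cp1252", "cp1251", "utf-8")
}

for codec, ok in RESULTS.items():
    print(codec, "ok" if ok else "cannot represent the text")
```

Under these assumptions each code page covers some of the characters but none 
covers all of them, while UTF-8 encodes the whole string and decodes it back 
unchanged.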

I can't even write an email to Rainer Schütze in English under your scheme.

Code pages are simply no longer practical or acceptable for a global community. 
D is never going to convert to a code page system, and even if it did, there's 
no way D will ever convince the world to abandon Unicode, and so D would be as 
useless as EBCDIC.

I'm afraid your quest is quixotic.

