Why UTF-8/16 character encodings?

Dmitry Olshansky dmitry.olsh at gmail.com
Sat May 25 10:13:39 PDT 2013


25-May-2013 13:05, Joakim wrote:
> On Saturday, 25 May 2013 at 08:42:46 UTC, Walter Bright wrote:
>> I think you stand alone in your desire to return to code pages.
> Nobody is talking about going back to code pages.  I'm talking about
> going to single-byte encodings, which do not imply the problems that you
> had with code pages way back when.

The problem is that what you outline is isomorphic to code pages. Hence
the accumulated grief of experience with them.
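
A minimal sketch of that isomorphism, assuming Python 3 and its standard
latin-1 and cp1251 codecs (picked here purely for illustration; neither is
named in this thread). Any single-byte scheme needs a table chosen out of
band, so the same byte reads as a different character depending on which
table the reader assumes:

```python
# One byte, two single-byte tables, two different characters.
raw = bytes([0xE4])

as_latin1 = raw.decode("latin-1")  # Western European table
as_cp1251 = raw.decode("cp1251")   # Cyrillic table

# The byte alone does not say which reading is the intended one.
print(repr(as_latin1), repr(as_cp1251))

# UTF-8 carries no such external label: each character round-trips
# through its own distinct byte sequence.
for ch in (as_latin1, as_cp1251):
    assert ch.encode("utf-8").decode("utf-8") == ch
```

Whether a header or a per-string tag selects the table, the reader still
has to get that selection right, which is the failure mode code pages had.
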
>> Code pages simply are no longer practical nor acceptable for a global
>> community. D is never going to convert to a code page system, and even
>> if it did, there's no way D will ever convince the world to abandon
>> Unicode, and so D would be as useless as EBCDIC.
> I'm afraid you and others here seem to mentally translate "single-byte
> encodings" to "code pages" in your head, then recoil in horror as you
> remember all your problems with broken implementations of code pages,
> even though those problems are not intrinsic to single-byte encodings.
>
> I'm not asking you to consider this for D.  I just wanted to discuss why
> UTF-8 is used at all.  I had hoped for some technical evaluations of its
> merits, but I seem to simply be dredging up a bunch of repressed
> memories about code pages instead. ;)

Well, if somebody were given the task of redefining UTF-8, they *might*
come up with something that is a bit faster to decode but shares the same
properties. Hardly a life saver anyway.
>
> The world may not "abandon Unicode," but it will abandon UTF-8, because
> it's a dumb idea.  Unfortunately, such dumb ideas- XML anyone?- often
> proliferate until someone comes up with something better to show how
> dumb they are.

Even children know XML is awful, redundant shit as an interchange format.
The hierarchical document is a nice idea anyway.

> Perhaps it won't be the D programming language that does
> that, but it would be easy to implement my idea in D, so maybe it will
> be a D-based library someday. :)

Implement the Unicode compression scheme - at least that is standardized.



-- 
Dmitry Olshansky

