Fix Phobos dependencies on autodecoding

H. S. Teoh hsteoh at quickfur.ath.cx
Tue Aug 13 18:44:09 UTC 2019


On Tue, Aug 13, 2019 at 06:24:23PM +0000, jmh530 via Digitalmars-d wrote:
> On Tuesday, 13 August 2019 at 16:58:38 UTC, Jonathan M Davis wrote:
> > [snip]
> > 
> > It's not on the e in both of them. It's on the e on the second line
> > of the "expected" output, but it's on the T in the second line of
> > the "actual" output.
> > 
> > - Jonathan M Davis
> 
> On my machine & browser, it looks like it is on the e on both.

That's probably the browser munging the Unicode, as Jonathan said.
Unicode is notoriously hard to process correctly, and I wouldn't be
surprised if the majority of applications out there actually don't
handle it correctly in all cases.

The whole auto-decoding deal is a prime example of this: even an expert
programmer like Andrei fell into the wrong assumption that code point ==
grapheme. I have no confidence that less capable programmers, who form
the majority of today's programmers and write the bulk of the industry's
code, are any more likely to get it right.  (For years I myself didn't
even know there was such a thing as "graphemes".)  In fact, almost every
day I see "enterprise" code that commits atrocities against Unicode --
because QA hasn't thought to pass a *real* Unicode string as test input
yet. The day the idea occurs to them, a LOT of code (and I mean a LOT)
will need to be rewritten, probably from scratch.
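
To make the code point vs. grapheme distinction concrete, here is a
minimal sketch, in Python for brevity. It takes one user-perceived
character, "e" followed by U+0301 COMBINING ACUTE ACCENT, and counts it
three ways. The helper name approx_grapheme_count is hypothetical, and
it only approximates grapheme segmentation by skipping combining marks;
real segmentation follows the rules in Unicode Standard Annex #29 and
handles many more cases (emoji sequences, Hangul jamo, and so on).

```python
import unicodedata


def approx_grapheme_count(s: str) -> int:
    """Rough grapheme count: count code points that are not combining marks.

    This is an approximation for illustration only; it is NOT a full
    implementation of UAX #29 grapheme cluster segmentation.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


# One user-perceived character: 'e' + U+0301 COMBINING ACUTE ACCENT.
s = "e\u0301"

utf8_code_units = len(s.encode("utf-8"))  # bytes in the UTF-8 encoding
code_points = len(s)                      # Python str length counts code points
graphemes = approx_grapheme_count(s)      # what a reader sees as one character

# Reversing by code point detaches the accent from its base letter:
# the combining mark now comes first and the 'e' comes last.
naively_reversed = s[::-1]

print(utf8_code_units, code_points, graphemes)
```

The three counts differ for the same string, so code that treats "one
code point" as "one character" will slice, reverse, or truncate text
like this in the wrong place.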


T

--
"Real programmers can write assembly code in any language. :-)" -- Larry
Wall

