The Case Against Autodecode

Sat Jun 4 01:12:47 PDT 2016

On 6/3/2016 11:17 PM, H. S. Teoh via Digitalmars-d wrote:
> On Fri, Jun 03, 2016 at 08:03:16PM -0700, Walter Bright via Digitalmars-d wrote:
>> It works for books.
> Because books don't allow their readers to change the font.

Unicode is not the font.

> This madness already exists *without* Unicode. If you have a page with a
> single glyph 'm' printed on it and show it to an English speaker, he
> will say it's lowercase M. Show it to a Russian speaker, and he will say
> it's lowercase Т.  So which letter is it, M or Т?

It's not a problem that Unicode can solve. As you said, the meaning is in the 
context. Unicode has no context, and tries to solve something it cannot.

('m' doesn't always mean m in English, either. It depends on the context.)

Ya know, if Unicode actually solved these problems, you'd have a case. But it 
doesn't, and so you don't :-)

> If you're going to represent both languages, you cannot get away from
> needing to represent letters abstractly, rather than visually.

Books do it visually just fine!

> So should O and 0 share the same glyph or not? They're visually the same
> thing,

No, they're not. Not even on old typewriters where every key was expensive. Even 
without the slash, the O tends to be fatter than the 0.

> The very fact that we distinguish between O and 0, independently of what
> Unicode did/does, is already proof enough that going by visual
> representation is inadequate.

Except that right now you are using a font where they are different enough that 
you have no trouble at all distinguishing them without bothering to look it up. 
And so am I.

> In other words, toUpper and toLower do not belong in the standard
> library. Great.

Unicode and the standard library are two different things.