Fix Phobos dependencies on autodecoding

Walter Bright newshound2 at digitalmars.com
Fri Aug 16 21:05:44 UTC 2019


On 8/16/2019 9:32 AM, xenon325 wrote:
> On Thursday, 15 August 2019 at 22:23:13 UTC, Walter Bright wrote:
>> And yet somehow people manage to read printed material without all these 
>> problems.
> 
> If same glyphs had same codes, what will you do with these:
> 
> 1) Sort string.
> 
> In my phone's contact lists there are entries in russian, in english and mixed.
> Now they are sorted as:
> A (latin), B (latin), C, А (ru), Б, В (ru).
> Which is pretty easy to search/navigate.

Except that there's no guarantee that whoever entered the data used the right 
code point.

The pragmatic solution, again, is to use context. I.e. if a glyph is surrounded 
by Russian characters, it's likely a Russian glyph. If it is surrounded by 
characters that form a common Russian word, it's likely a Russian glyph.

Of course it isn't perfect, but I bet using context will work better than 
expecting the code points to have been entered correctly.
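
A rough sketch of what "using context" could look like, in Python purely for
illustration. Everything here is an assumption made for the example: the
look-alike table is hypothetical and deliberately tiny (a few capitals only),
the helper names are made up, and the script of a character is guessed from
its Unicode character name. It is one simple way to let the unambiguous
letters of a word vote on an ambiguous one, not a complete or authoritative
method.

```python
import unicodedata

# Hypothetical, deliberately tiny table of look-alike capitals (Latin -> Cyrillic).
LATIN_TO_CYRILLIC = {
    "A": "\u0410", "B": "\u0412", "C": "\u0421", "E": "\u0415",
    "H": "\u041D", "K": "\u041A", "M": "\u041C", "O": "\u041E",
    "P": "\u0420", "T": "\u0422", "X": "\u0425",
}
CYRILLIC_TO_LATIN = {cyr: lat for lat, cyr in LATIN_TO_CYRILLIC.items()}


def script_of(ch):
    """Return 'latin', 'cyrillic', or None, based on the Unicode character name."""
    name = unicodedata.name(ch, "")
    if name.startswith("CYRILLIC"):
        return "cyrillic"
    if name.startswith("LATIN"):
        return "latin"
    return None


def is_ambiguous(ch):
    """True if the character appears in the look-alike table, in either script."""
    return ch in LATIN_TO_CYRILLIC or ch in CYRILLIC_TO_LATIN


def guess_script(word, i):
    """Guess the script of word[i] from the unambiguous letters around it."""
    votes = {"latin": 0, "cyrillic": 0}
    for j, ch in enumerate(word):
        if j == i or is_ambiguous(ch):
            continue  # ambiguous neighbours carry no evidence
        script = script_of(ch)
        if script is not None:
            votes[script] += 1
    if votes["latin"] == votes["cyrillic"]:
        return script_of(word[i])  # no usable context: keep what was entered
    return max(votes, key=votes.get)


def normalize_word(word):
    """Swap each ambiguous letter to the code point its context suggests."""
    out = []
    for i, ch in enumerate(word):
        if is_ambiguous(ch):
            target = guess_script(word, i)
            if target == "cyrillic" and ch in LATIN_TO_CYRILLIC:
                ch = LATIN_TO_CYRILLIC[ch]
            elif target == "latin" and ch in CYRILLIC_TO_LATIN:
                ch = CYRILLIC_TO_LATIN[ch]
        out.append(ch)
    return "".join(out)
```

With this sketch, a name typed with a Latin B followed by Cyrillic letters is
rewritten to start with the Cyrillic В, and a Cyrillic В followed by Latin
letters is rewritten to start with the Latin B. A letter with no unambiguous
neighbours is left as entered, which is where the heuristic stops helping.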

I note that you had to tag В with (ru), because otherwise no human reader or OCR 
would know what it was. This is exactly the problem I'm talking about.

Writing software that relies on invisible semantic information is never going to 
work.

