Fix Phobos dependencies on autodecoding
Patrick Schluter
Patrick.Schluter at bbox.fr
Fri Aug 16 10:32:06 UTC 2019
On Friday, 16 August 2019 at 09:34:21 UTC, Walter Bright wrote:
> On 8/16/2019 2:20 AM, Patrick Schluter wrote:
>> Sorry, no, it didn't work in reality before Unicode.
>> Multi-language systems were a mess.
>
> I have several older books that move facilely between multiple
> languages. It's not a mess.
>
> Since the reader can figure all this out without invisible
> semantic information in the glyphs, that invisible information
> is not necessary.
Unicode's purpose is not limited to the output at the end of the
processing chain. It's the whole processing chain that is the
point.
>
> Once you print/display the Unicode string, all that semantic
> information is gone. It is not needed.
As said, printing is only a minor part of language processing. To
give an example from the EU again, just to illustrate: we have
exactly three laser printers (one is a photocopier) on each floor
of our offices. You may say: oh, you're the IT guys, you don't
need to print that much. To which I respond: half of the floor is
occupied by the English translation unit, and while they indeed
use the printers more than we do, printing is not a significant
part of their workflow either.
>
>
>> Unicode works much, much better than anything that existed
>> before. The issue is that not a lot of people work in a
>> multi-language environment and don't have a clue of the unholy
>> mess it was before.
>
> Actually, I do. Zortech C++ supported multiple code pages,
> multiple multibyte encodings, and had error messages in 4
> languages.
Each string was in its own language. We have to deal with texts
that mix languages: sentences in Bulgarian with an office address
in Greece, embedded in an XML file. Codepages don't work in that
case, or you have to introduce an escaping scheme far more
brittle and annoying than UTF-8 or UTF-16 encoding. The European
Parliament's session logs are what are called panaché documents,
i.e. the transcripts are in the native language of each
intervening MEP. So, completely mixed documents.
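To illustrate the point about codepages (a minimal sketch, not EU tooling; the sample sentence is invented): a single UTF-8 string can hold Cyrillic and Greek side by side, while the legacy single-script codepages CP1251 (Cyrillic) and CP1253 (Greek) each reject the other script.

```python
# A mixed Bulgarian/Greek string, like the "panaché" content described
# above. (Illustrative sample text, not real EU data.)
text = "Офис в Гърция: Οδός Ερμού 15, Αθήνα"

# UTF-8 round-trips the whole string with no escaping scheme.
assert text.encode("utf-8").decode("utf-8") == text

# The legacy codepages cannot: CP1251 lacks the Greek letters,
# CP1253 lacks the Cyrillic ones.
for codepage in ("cp1251", "cp1253"):
    try:
        text.encode(codepage)
        print(codepage, "succeeded")  # never reached for this text
    except UnicodeEncodeError as err:
        print(codepage, "fails at character", text[err.start])
```

With codepages, handling such a document means tagging every run of text with its encoding and escaping the switches, which is exactly the brittleness described above.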
>
> Unicode, in its original vision, solved those problems.
Unicode is not perfect, and the emoji situation is indeed a mess,
but Unicode is better than what was used before.
And to insist again, Unicode is mostly about "DATA PROCESSING".
Sometimes it yields a human-readable result, but that is only one
part of its purpose.
More information about the Digitalmars-d
mailing list