Updating D beyond Unicode 2.0
Abdulhaq
alynch4047 at gmail.com
Sun Sep 23 19:06:26 UTC 2018
On Saturday, 22 September 2018 at 08:52:32 UTC, Jonathan M Davis
wrote:
>
> Honestly, I was horrified to find out that emojis were even in
> Unicode. It makes no sense whatsoever. Emojis are supposed to be
> sequences of characters that can be interpreted as images.
> Treating them like Unicode symbols is like treating entire
> words like Unicode symbols. It's just plain stupid and a clear
> sign that Unicode has gone completely off the rails (if it was
> ever on them). Unfortunately, it's the best tool that we have
> for the job.
According to the Unicode website,
http://unicode.org/standard/WhatIsUnicode.html,
"""
Support of Unicode forms the foundation for the representation of
languages and symbols in all major operating systems, search
engines, browsers, laptops, and smart phones—plus the Internet
and World Wide Web (URLs, HTML, XML, CSS, JSON, etc.)"""
Note that Unicode supports symbols, not just characters.
The smiley face symbol predates its ':-)' usage in ASCII text; see
https://www.smithsonianmag.com/arts-culture/who-really-invented-the-smiley-face-2058483/
It's fundamentally a symbol, not a sequence of characters. Therefore
it is not unreasonable for it to be encoded with its own Unicode code
point. I do agree though, of course, that it would seem bizarre to
use an emoji as a D identifier.
The early history of computer science is completely dominated by
cultures that use Latin-script-based characters, and hence, quite
reasonably, text encoding and its automated visual representation
by computer-based devices is dominated by the requirements of
Latin-script languages. However, the world keeps turning and,
despite DT's best efforts, China et al. look to become dominant.
Even if not China, the chances are that eventually a language based
on a non-Latin script will become very important. Parochial views
like "all open source code should be in ASCII" will look silly.
However, until that time D developers have to spend their time
where it can be most useful. Hence the decision whether to
apply Neia's patch / ideas mainly depends on how much
downstream effort it will require (debuggers etc., as Walter
pointed out) and how much the gain is. As Unicode 2.0 is already
supported, I would guess that the vast majority of people
with access to a computer can already enter identifiers in D that
are rich enough for them. As Adam said though, it would be a good
idea to at least ask!