Updating D beyond Unicode 2.0

Thu Sep 27 07:35:28 UTC 2018

On Wednesday, 26 September 2018 at 20:43:47 UTC, Walter Bright 
wrote:
> On 9/26/2018 5:46 AM, Steven Schveighoffer wrote:
>> This is a non-starter. We can't break people's code, 
>> especially for trivial reasons like 'you shouldn't code that 
>> way because others don't like it'. I'm pretty sure Walter 
>> would be against removing Unicode support for identifiers.
>
> We're not going to remove it, because there's not much to gain 
> from it.
>
> But expanding it seems of vanishingly little value. Note that 
> each thing that gets added to D adds weight to it, and it needs 
> to pull its weight. Nothing is free.
>
> I don't see a scenario where someone would be learning D and 
> not know English. Non-English D instructional material is 
> nearly non-existent. dlang.org is all in English. Don't most 
> languages have a Romanji-like representation?

It's not that they don't know English. It's that non-English 
speakers can process words and sentences in non-English much more 
efficiently than in English. Knowing a language is not binary.

Here's an example from this year's spring semester at NTNU (a 
Norwegian university): 
http://folk.ntnu.no/frh/grprog/eksempel/eks_20.cpp

... That's the basic programming course. Whether the professor 
would use that depends, I guess, on the ratio of English to 
non-English speakers. But it's there nonetheless.

Of course Norway is a bad example because the English level here 
is, arguably, higher than in many English-speaking countries :p 
But it's also a great example, because even if you're great at 
English, people are still sometimes more 
comfortable/confident/efficient in their own native language.

Some tech meetups from different countries try and do things in 
English and mostly it works. But it's been seen consistently with 
non-English audiences that presentations given in English result 
in silence whereas if it's in their native language you have 
actual engagement.

I fail to understand why support for a version of Unicode from 
(not sure when it was released) 3 billion decades ago should just 
be left as is, given that it also cannot be removed, when there's 
someone who's willing to update it.

>
> C/C++ have made efforts in the past to support non-ASCII coding 
> - digraphs, trigraphs, and alternate keywords. They've all 
> failed miserably. The only people who seem to know those 
> features even exist are language lawyers.

This is not relevant. Trigraphs and digraphs did indeed fail 
miserably, but they do not represent any non-ASCII characters. 
Those abominations exist for different reasons.

Anyway, on a related note: D itself (not identifiers, but std) 
also supports Unicode 6 or something. That's from 2010, close to 
a decade ago. We're at Unicode 11 now. And I've already had 
someone tell me (while I was trying to get them to use D) - "hold 
on, it supports Unicode from a decade ago? Nah, I'm not touching 
it". Not that it's the same as supporting identifiers in code, 
but the reaction is still relevant.

Cheers,
- Ali