Replacing tango.text.Ascii.isearch
Ali Çehreli
acehreli at yahoo.com
Wed Oct 26 06:05:14 UTC 2022
On 10/25/22 22:49, Siarhei Siamashka wrote:
> Unicode is significantly simpler than a set of various
> incompatible 8-bit encodings
Strongly agreed.
> I'm surely
> able to ignore the peculiarities of modern Turkish Unicode
The problem with Unicode is its main aim of allowing characters of
multiple writing systems in the same text. When multiple writing systems
are in play, conflicts and ambiguities will appear.
> and wait for
> the other people to come up with a solution for D language if they
> really care.
I solved my problem in the past by writing an Alphabet hierarchy. I
don't like that code, but it still works:
https://bitbucket.org/acehreli/ddili/src/4c0552fe8352dfe905c9734a57d84d36ce4ed476/src/alphabet.d#lines-50
It handles capitalization, ordering, etc. I use it when preparing the
Index section of the Turkish edition of "Programming in D":
http://ddili.org/ders/d/ix.html
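
To give a rough idea of what an alphabet-aware helper has to do, here
is a minimal sketch in Python. It is not the D code linked above, and
every name in it (TURKISH_LOWER, tr_upper, tr_lower, tr_sort_key) is
made up for illustration. It assumes the 29-letter Turkish alphabet and
sorts any other character after the Turkish letters.

```python
# Hypothetical sketch; NOT the code from alphabet.d.
# The 29 letters of the Turkish alphabet, in dictionary order.
TURKISH_LOWER = "abcçdefgğhıijklmnoöprsştuüvyz"
TURKISH_UPPER = "ABCÇDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ"

_TO_UPPER = str.maketrans(TURKISH_LOWER, TURKISH_UPPER)
_TO_LOWER = str.maketrans(TURKISH_UPPER, TURKISH_LOWER)
_RANK = {letter: i for i, letter in enumerate(TURKISH_LOWER)}


def tr_upper(text):
    """Upper-case with Turkish rules: i -> İ and ı -> I."""
    # Turkish letters are mapped first; str.upper() then handles
    # the remaining characters (q, w, x, ...).
    return text.translate(_TO_UPPER).upper()


def tr_lower(text):
    """Lower-case with Turkish rules: I -> ı and İ -> i."""
    return text.translate(_TO_LOWER).lower()


def tr_sort_key(text):
    """Sort key that follows Turkish alphabet order."""
    # Characters outside the alphabet sort after all Turkish letters,
    # ordered by code point (an arbitrary choice for this sketch).
    return [
        (0, _RANK[c]) if c in _RANK else (1, ord(c))
        for c in tr_lower(text)
    ]


words = ["izmir", "ılık", "hız", "jale"]
print(sorted(words, key=tr_sort_key))
print(tr_upper("ide"), tr_lower("IDE"))
```

A plain code-point sort would place ı after z, which is why the sort
key ranks letters by their position in the alphabet instead.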
One of the ambiguities is what came up on this thread: Should a word
that starts with I (capital i) be listed under I (because it's Turkish)
or under İ (because it's English)? So far, I am lucky: the only word
that starts with I happens to be the English "IDE", so it goes under i
(which appears as İ in the Turkish edition). That would make sense to a
Turkish reader, who might (really?) accept it as the capital of ide.
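
The ambiguity can be sketched in a few lines of Python (again, not code
from the book's tooling; turkish_index_heading and its word_language
parameter are hypothetical). The point is that the heading cannot be
computed from the letters alone: the caller has to say which language
the word belongs to.

```python
# Hypothetical sketch: pick the heading of a word in a Turkish index.
def turkish_index_heading(word, word_language):
    first = word[0]
    if word_language == "tr":
        # Turkish pairs: I is the capital of ı, İ is the capital of i.
        lower = {"I": "ı", "İ": "i"}.get(first, first.lower())
    else:
        # English pair: I is the capital of i (ASCII input assumed).
        lower = first.lower()
    # The headings themselves follow Turkish capitalization.
    return {"i": "İ", "ı": "I"}.get(lower, lower.upper())


print(turkish_index_heading("IDE", "en"))  # treated as English
print(turkish_index_heading("IDE", "tr"))  # treated as Turkish
```

The same three letters land under İ when treated as English and under
I when treated as Turkish.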
It's confusing but it seems to work. :) It doesn't matter. Life is
imperfect and things will somehow work in the end.
Ali