Submission: updated std.uni module

Thomas Kuehne thomas-dloop at kuehne.cn
Fri Feb 24 06:22:40 PST 2006



Thomas Kühne wrote on 2006-02-24:
> Attached is an updated std.uni module.
>
> publicly visible changes:
> 1) updated casing and isUniAlpha data to Unicode 5.0.0
>
> 2) added "dchar[] toUniLower(dchar[])" and "dchar[] toUniUpper(dchar[])"
> in order to handle cases like toUniUpper("\u00DF") -> "SS"
>
>
> internal changes:
> 3) use AAs instead of hardcoded IFs for upper and lower casing
> (I might expand the extractor to hardcode IFs if anybody experiences
> serious performance degradation.)
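
To illustrate points 2) and 3), here is a rough sketch of a table-driven upper-casing routine. It is written in Python rather than D, and it is not the code from the attached module: the table names and to_uni_upper are made up for this example, and the tables hold only a tiny hand-picked subset of the Unicode data. The point it shows is why the result has to be a string and not a single character: one input character can map to several output characters.

```python
# Hypothetical tables, standing in for the associative arrays mentioned above.
# Only a small illustrative subset of the real Unicode data is included.

# One-to-one mappings: ASCII a-z -> A-Z.
SIMPLE_UPPER = {chr(c): chr(c - 32) for c in range(ord("a"), ord("z") + 1)}

# One-to-many mappings (the kind listed in Unicode's SpecialCasing data).
SPECIAL_UPPER = {
    "\u00DF": "SS",  # LATIN SMALL LETTER SHARP S
    "\uFB01": "FI",  # LATIN SMALL LIGATURE FI
}


def to_uni_upper(text):
    """Upper-case text using table lookups only.

    Characters missing from both tables are passed through unchanged.
    """
    out = []
    for ch in text:
        if ch in SPECIAL_UPPER:
            out.append(SPECIAL_UPPER[ch])   # may expand to several characters
        else:
            out.append(SIMPLE_UPPER.get(ch, ch))
    return "".join(out)
```

With these tables, to_uni_upper("stra\u00DFe") gives "STRASSE", so the output is one character longer than the input.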

Unicode sometimes seems to be a collection of special cases ;)


Forgot to add:

The following characters aren't mapped correctly yet, because their
case mapping depends on the surrounding context or on the locale.
Format: character (condition)


GREEK CAPITAL LETTER SIGMA (Final_Sigma)
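
A short illustration of the Final_Sigma case, again in Python and not taken from the module. Python's built-in str.lower() applies the Final_Sigma condition when it sees the whole string, while lower-casing one character at a time has no context to look at. The helper name lower_per_char is made up for this example.

```python
def lower_per_char(text):
    """Context-free lowering: each character is mapped on its own,
    so a capital sigma can never become the final form."""
    return "".join(ch.lower() for ch in text)


WORD = "\u039F\u0394\u039F\u03A3"           # GREEK CAPITAL: OMICRON DELTA OMICRON SIGMA

context_aware = WORD.lower()                 # whole string, Final_Sigma applied
context_free = lower_per_char(WORD)          # per character, no context

# The two results differ only in the last character:
# U+03C2 GREEK SMALL LETTER FINAL SIGMA versus U+03C3 GREEK SMALL LETTER SIGMA.
```

A mapping that works character by character, such as a plain lookup table, produces the second result, which is the incorrect mapping reported above.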

==Lithuanian locale==
COMBINING DOT ABOVE (After_Soft_Dotted)
LATIN CAPITAL LETTER I (More_Above)
LATIN CAPITAL LETTER J (More_Above)
LATIN CAPITAL LETTER I WITH OGONEK (More_Above)
LATIN CAPITAL LETTER I WITH GRAVE
LATIN CAPITAL LETTER I WITH ACUTE
LATIN CAPITAL LETTER I WITH TILDE
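
For the Lithuanian lower-casing rules, a possible sketch in Python follows. It is an assumption-laden illustration, not the module's code: lt_lower, more_above and the two tables are names invented here, and the After_Soft_Dotted rule for COMBINING DOT ABOVE (which applies when upper-casing) is left out. More_Above is taken to mean that the character is followed by a mark of combining class 230, with no character of class 0 or 230 in between.

```python
import unicodedata

COMBINING_DOT_ABOVE = "\u0307"

# Lowered with an extra dot only when an accent above follows (More_Above).
LT_LOWER_MORE_ABOVE = {
    "I": "i",
    "J": "j",
    "\u012E": "\u012F",  # I WITH OGONEK
}

# Precomposed letters that always decompose when lowered in Lithuanian.
LT_LOWER_ALWAYS = {
    "\u00CC": "i\u0307\u0300",  # I WITH GRAVE
    "\u00CD": "i\u0307\u0301",  # I WITH ACUTE
    "\u0128": "i\u0307\u0303",  # I WITH TILDE
}


def more_above(text, i):
    """True if text[i] is followed by a combining class 230 mark,
    with no intervening character of class 0 or 230."""
    for ch in text[i + 1:]:
        ccc = unicodedata.combining(ch)
        if ccc == 230:
            return True
        if ccc == 0:
            return False
    return False


def lt_lower(text):
    """Lower-case text with the Lithuanian rules sketched above."""
    out = []
    for i, ch in enumerate(text):
        if ch in LT_LOWER_ALWAYS:
            out.append(LT_LOWER_ALWAYS[ch])
        elif ch in LT_LOWER_MORE_ABOVE and more_above(text, i):
            # Keep the dot visible underneath the following accent.
            out.append(LT_LOWER_MORE_ABOVE[ch] + COMBINING_DOT_ABOVE)
        else:
            out.append(ch.lower())
    return "".join(out)
```

For example, lt_lower("I\u0301") keeps the dot by returning "i\u0307\u0301", while a plain "I" still lowers to "i".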

==Turkish and Azeri locale==
LATIN CAPITAL LETTER I WITH DOT ABOVE
COMBINING DOT ABOVE (After_I)
LATIN CAPITAL LETTER I (Not_Before_Dot)
LATIN SMALL LETTER I
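
The Turkish and Azeri rules could look roughly like this in Python. Again this is only a sketch under assumptions: tr_upper and tr_lower are invented names, and the Not_Before_Dot / After_I conditions are simplified to "the very next character is COMBINING DOT ABOVE", whereas the full conditions also allow certain other combining marks in between.

```python
DOTTED_CAPITAL_I = "\u0130"    # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"     # LATIN SMALL LETTER DOTLESS I
COMBINING_DOT_ABOVE = "\u0307"


def tr_upper(text):
    """Upper-case with the Turkish/Azeri rule: i -> I WITH DOT ABOVE."""
    return "".join(DOTTED_CAPITAL_I if ch == "i" else ch.upper() for ch in text)


def tr_lower(text):
    """Lower-case with the Turkish/Azeri rules (simplified conditions)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == DOTTED_CAPITAL_I:
            out.append("i")
        elif ch == "I":
            if text[i + 1:i + 2] == COMBINING_DOT_ABOVE:
                # I followed by a dot above: plain i, and the dot is dropped.
                out.append("i")
                i += 1
            else:
                # I not before a dot: dotless i.
                out.append(DOTLESS_SMALL_I)
        else:
            out.append(ch.lower())
        i += 1
    return "".join(out)
```

A locale-independent mapping turns "istanbul" into "ISTANBUL", whereas tr_upper gives "\u0130STANBUL", which is why these characters need locale information that a single lookup table doesn't have.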


Thomas




