[Issue 12455] [uni][reg] Bad lowercase mapping for 'LATIN CAPITAL LETTER I WITH DOT ABOVE'
via Digitalmars-d-bugs
digitalmars-d-bugs at puremagic.com
Sat Apr 19 06:52:32 PDT 2014
https://issues.dlang.org/show_bug.cgi?id=12455
--- Comment #2 from monarchdodra at gmail.com ---
I toyed around. The issue (apparently) is that it *can* be decomposed as:
LATIN CAPITAL LETTER I (U+0049)
COMBINING DOT ABOVE (U+0307)
As such, when converted to lower case, it becomes:
LATIN SMALL LETTER I (U+0069)
COMBINING DOT ABOVE (U+0307)
EG:
//----
import std.conv, std.uni;
void main()
{
    auto c = 'İ'; // '\u0130' LATIN CAPITAL LETTER I WITH DOT ABOVE
    auto s = "İ"; // "\u0130" LATIN CAPITAL LETTER I WITH DOT ABOVE
    assert(std.uni.isUpper(c)); // Passes
    // The string overload lowercases to 'i' followed by COMBINING DOT ABOVE.
    auto sl = std.uni.toLower(s).to!dstring;
    assert(sl == "\u0069\u0307"); // Passes
}
//----
Because std.uni "thinks" the lowercase doesn't fit in a single dchar, the
dchar overload of toLower simply does nothing (as documented).
However, it's still wrong, as the standard (from what I read) is pretty clear
that the lowercase is simply 'i'.
Furthermore, "LATIN SMALL LETTER I + COMBINING DOT ABOVE" is pretty
redundant...
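
For reference, the decomposition mentioned above can be checked with
std.uni.normalize. This is only a minimal sketch: the helper name "decompose"
is made up for illustration, and it only shows the canonical (NFD)
decomposition of U+0130, not what toLower does internally.

//----
import std.stdio : writefln;
import std.uni : normalize, NFD;

// Hypothetical helper (not part of Phobos): canonical decomposition (NFD)
// of a UTF-32 string.
dstring decompose(dstring s)
{
    return normalize!NFD(s);
}

void main()
{
    dstring upper = "\u0130"d; // LATIN CAPITAL LETTER I WITH DOT ABOVE
    dstring parts = decompose(upper);

    // U+0130 canonically decomposes to U+0049 followed by U+0307.
    assert(parts == "\u0049\u0307"d);

    // Print each code point of the decomposed form.
    foreach (dchar ch; parts)
        writefln("U+%04X", cast(uint) ch);
}
//----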