[Issue 5221] entity.c: Merge Walter's list with Thomas'
d-bugmail at puremagic.com
Mon Jan 31 01:11:17 PST 2011
http://d.puremagic.com/issues/show_bug.cgi?id=5221
--- Comment #15 from Don <clugdbug at yahoo.com.au> 2011-01-31 01:08:57 PST ---
The DMD test suite chokes on:
&lang; == 9001 (== U+2329); in the new list it is U+27E8.
This really scared me, because I found a few web references that list &lang;
as U+2329, for example:
http://www.fileformat.info/info/unicode/char/2329/index.htm
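For comparison, the same discrepancy can be reproduced outside DMD. The sketch below uses Python's standard `html.entities` module, which happens to ship both an HTML 4 era table and an HTML5 table. It is only an illustration: these are Python's tables, not the lists in entity.c, and treating them as stand-ins for the old and new lists is an assumption.

```python
from html.entities import name2codepoint, html5

# HTML 4 era table: entity name (no trailing ';') -> code point.
old_lang = name2codepoint["lang"]

# HTML5 table: entity name (with trailing ';') -> replacement string.
new_lang = ord(html5["lang;"])

print(f"old list: lang = U+{old_lang:04X} (decimal {old_lang})")
print(f"new list: lang = U+{new_lang:04X} (decimal {new_lang})")
```

A test that hard-codes 9001 for this entity will pass against the first table and fail against the second, which matches the failure described above.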
Turns out that U+2329 and U+27E8 are visually almost identical.
I found this helpful note in
http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#endnote_lang
lang: 'mathematical left angle bracket' is NOT the same character as U+003C
'less than', or U+2039 'single left-pointing angle quotation mark', or U+2329
'left-pointing angle bracket', or U+3008 'left angle bracket'.
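Since these characters are hard to tell apart by eye, it can help to print their official names. A minimal sketch using Python's `unicodedata` module (the helper name `describe` is made up for this example):

```python
import unicodedata

# The five code points named in the Wikipedia note, in the same order.
LOOKALIKES = [0x003C, 0x2039, 0x2329, 0x3008, 0x27E8]

def describe(cp):
    """Return 'U+XXXX NAME' for a code point, using the Unicode database."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"

for cp in LOOKALIKES:
    print(describe(cp))
```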
I finally found what happened: U+27E8 was added in Unicode 3.2.0.
The book "Unicode Explained", p. 423, says that U+27E8 is poorly supported
(because it was a recent addition to Unicode) and that U+2329 is a more
practical choice. But U+2329 is canonically equivalent to U+3008, which is
intended for use with Chinese, Japanese, and Korean ideographs, so it can look
wrong if it goes through a normalization process.
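The normalization hazard is easy to check. The sketch below assumes Python's `unicodedata.normalize` as the normalizer (DMD itself is not involved), and `survives_normalization` is a hypothetical helper name:

```python
import unicodedata

def survives_normalization(cp, form="NFC"):
    """True if the code point is unchanged by the given normalization form."""
    ch = chr(cp)
    return unicodedata.normalize(form, ch) == ch

for cp in (0x2329, 0x27E8):
    normalized = unicodedata.normalize("NFC", chr(cp))
    print(f"U+{cp:04X} -> U+{ord(normalized):04X}")
```

U+2329 has a singleton canonical decomposition to U+3008, so even NFC replaces it, while U+27E8 passes through untouched.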
That book was published in 2006. Can we assume that Unicode support is
widespread enough now that we should change to the more correct value, U+27E8?