[Issue 5221] entity.c: Merge Walter's list with Thomas'

d-bugmail at puremagic.com
Mon Jan 31 01:11:17 PST 2011


http://d.puremagic.com/issues/show_bug.cgi?id=5221



--- Comment #15 from Don <clugdbug at yahoo.com.au> 2011-01-31 01:08:57 PST ---
The DMD test suite chokes on:
&lang; == 9001 (== U+2329); in the new list it is U+27E8.
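
The mismatch between the two lists can be reproduced outside DMD. The sketch below is only an illustration and does not touch entity.c: it assumes Python's standard library tables, where html.entities.name2codepoint follows the HTML 4 entity list and html.entities.html5 follows the HTML5 list. The variable names are made up for this example.

```python
from html.entities import name2codepoint, html5

# HTML 4 table: entity name (no ampersand or semicolon) -> code point as int
old_lang = name2codepoint["lang"]

# HTML5 table: entity name (with trailing semicolon) -> replacement string
new_lang = ord(html5["lang;"])

print(f"HTML 4 &lang; = {old_lang} = U+{old_lang:04X}")
print(f"HTML5  &lang; = {new_lang} = U+{new_lang:04X}")
```

The first value is the one the test suite expects (9001), the second is the one in the new list.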

This really scared me, because I found a few web references that list &lang;
as U+2329.
http://www.fileformat.info/info/unicode/char/2329/index.htm

Turns out that U+2329 and U+27E8 are visually almost identical.

I found this helpful note in
http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#endnote_lang

lang: 'mathematical left angle bracket' is NOT the same character as U+003C
'less than', or U+2039 'single left-pointing angle quotation mark', or U+2329
'left-pointing angle bracket', or U+3008 'left angle bracket'.

I finally found out what happened: U+27E8 was added in Unicode 3.2.0.
In the book "Unicode Explained", p. 423, it says that U+27E8 is poorly supported
(because it was a recent addition to Unicode) and that U+2329 is a more
practical choice. But U+2329 is canonically equivalent to U+3008, which is
intended for use with Chinese-Japanese-Korean ideographs, so it can look wrong
if it goes through a normalization process.
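
The normalization problem is easy to check. This is a minimal sketch using Python's unicodedata module (not anything in DMD); the helper name normalized is hypothetical. It shows what NFC normalization does to each of the two candidate code points.

```python
import unicodedata

OLD = "\u2329"  # value in the old entity list
NEW = "\u27e8"  # value in the new entity list

def normalized(ch, form="NFC"):
    """Return the code points of ch after normalization, as U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize(form, ch)]

for ch in (OLD, NEW):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> NFC {normalized(ch)}")
```

U+2329 is replaced by U+3008 under normalization, while U+27E8 comes through unchanged, which is the argument for the newer value.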

That book was published in 2006. Can we assume that Unicode support is
widespread enough now that we should change to the more correct value?
