[Issue 18241] New: Missing characters from std.uni.unicode.Default_Ignorable_Code_Point
d-bugmail at puremagic.com
Mon Jan 15 23:50:32 UTC 2018
https://issues.dlang.org/show_bug.cgi?id=18241
Issue ID: 18241
Summary: Missing characters from std.uni.unicode.Default_Ignorable_Code_Point
Product: D
Version: D2
Hardware: x86_64
OS: Linux
Status: NEW
Severity: normal
Priority: P1
Component: phobos
Assignee: nobody at puremagic.com
Reporter: hsteoh at quickfur.ath.cx
The set returned by unicode.Default_Ignorable_Code_Point is missing some
characters listed in:
http://www.unicode.org/L2/L2002/02368-default-ignorable.pdf
where Default_Ignorable_Code_Point is defined as:
Other_Default_Ignorable_Code_Point + (Cf + Cc + Cs - White_Space)
While characters in Other_Default_Ignorable_Code_Point seem to be included
correctly, two characters in Cf appear to be missing from the set:
- U+06DD
- U+070F
Furthermore, characters in (Cc - White_Space) are also missing:
- U+0000 to U+0008
- U+000E to U+001F
(See also: PR #5, referencing the Unicode Standard section 5.22.)
I am not sure whether this is because these missing characters were added in a
later version of the Unicode Standard than the one originally implemented in
std.uni.
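
As an independent cross-check of the character lists above, here is a minimal
sketch in Python rather than D, using only the standard unicodedata module. It
does not touch std.uni; it only recomputes the (Cc - White_Space) part of the
formula and looks up the general category of the two Cf characters. The helper
names and the hardcoded CC_WHITE_SPACE set are assumptions made for this
illustration (unicodedata does not expose the White_Space property), and the
results depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Assumption for this sketch: the only White_Space code points that are also
# in general category Cc are U+0009..U+000D and U+0085. They are hardcoded
# because unicodedata does not expose the White_Space property.
CC_WHITE_SPACE = set(range(0x0009, 0x000E)) | {0x0085}


def cc_minus_white_space():
    """Return the sorted code points in (Cc - White_Space)."""
    return [
        cp
        for cp in range(0x110000)
        if unicodedata.category(chr(cp)) == "Cc" and cp not in CC_WHITE_SPACE
    ]


def to_ranges(cps):
    """Collapse a sorted list of code points into inclusive (first, last) ranges."""
    ranges = []
    for cp in cps:
        if ranges and cp == ranges[-1][1] + 1:
            # Extend the current run.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Start a new run.
            ranges.append((cp, cp))
    return ranges


if __name__ == "__main__":
    # The two characters reported as missing from the Cf part of the set.
    for cp in (0x06DD, 0x070F):
        print(f"U+{cp:04X} general category: {unicodedata.category(chr(cp))}")

    # The (Cc - White_Space) part of the formula, printed as ranges.
    for first, last in to_ranges(cc_minus_white_space()):
        print(f"U+{first:04X} to U+{last:04X}")
```

The first two ranges printed match the ones listed above. Under the same
formula, the C1 controls (U+007F to U+0084 and U+0086 to U+009F) also fall in
(Cc - White_Space), so they may be worth checking against std.uni as well.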