[Issue 9173] New: std.string.wrap should conform to Unicode line-breaking algorithm
d-bugmail at puremagic.com
Mon Dec 17 13:24:09 PST 2012
http://d.puremagic.com/issues/show_bug.cgi?id=9173
Summary: std.string.wrap should conform to Unicode
line-breaking algorithm
Product: D
Version: D2
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Phobos
AssignedTo: nobody at puremagic.com
ReportedBy: hsteoh at quickfur.ath.cx
--- Comment #0 from hsteoh at quickfur.ath.cx 2012-12-17 13:24:08 PST ---
Currently, there are some issues with std.string.wrap:
1) It uses std.uni.isWhite as the criterion for line-breaking opportunities, but
isWhite includes characters such as the non-breaking space, at which a line
should *not* be broken. It also includes things like vowel mark separators,
which shouldn't be break points, either.
2) It does not take zero-width characters and combining diacritics into account
when counting columns, which means that it will sometimes wrap the line at the
wrong place.
3) It does not wrap CJK text or Thai text correctly.
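To make points (1) and (2) concrete, here is a small illustration. It is written in Python rather than D, purely because Python's standard unicodedata module makes the character properties easy to inspect; it does not call std.string.wrap or std.uni, and the variable names are made up for the example.

```python
import unicodedata

NBSP = "\u00a0"   # NO-BREAK SPACE
ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

# Point (1): a generic "is this whitespace?" test accepts the no-break
# space, even though a line must not be broken there.
nbsp_is_whitespace = NBSP.isspace()
nbsp_category = unicodedata.category(NBSP)  # general category of U+00A0

# Point (2): "e" followed by a combining accent is two code points but
# occupies a single column, so counting code points overestimates the
# width of the line.
decomposed = "e" + ACUTE
code_points = len(decomposed)
acute_category = unicodedata.category(ACUTE)  # nonspacing mark

print(nbsp_is_whitespace, nbsp_category, code_points, acute_category)
```

A wrapper that treats every whitespace character as a break point, or that counts one column per code point, will misplace breaks on input like this.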
For reference, here's the Unicode technical reference that describes proper
line-breaking of Unicode text:
http://www.unicode.org/reports/tr14/
(After having read through TR14, I was in awe at how insanely complicated
line-wrapping in Unicode is. So I'd propose that, if nothing else, we should
fix items (1) and (2) above, which should be within the reach of a relatively
simple-to-implement European-centric line wrapping algorithm. People who want
CJK wrapping or other complicated stuff probably want to be writing their own
algo anyway.)
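As a rough sketch of what fixing only items (1) and (2) could look like, here is a greedy word wrapper, again in Python for illustration rather than as proposed Phobos code. Everything in it is an assumption made for the example: the helper names, the set of non-breaking spaces, and the simplification that every character in general category Mn, Me or Cf takes zero columns. It makes no attempt to implement TR14 and ignores CJK and Thai entirely.

```python
import unicodedata

# Assumption: whitespace characters that must not act as break points.
NON_BREAKING = {"\u00a0", "\u2007", "\u202f"}

# Assumption: nonspacing marks, enclosing marks and format characters
# occupy no column of their own.
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def is_break_space(ch):
    """Whitespace at which the line may be broken."""
    return ch.isspace() and ch not in NON_BREAKING


def display_width(text):
    """Approximate column count, skipping zero-width characters."""
    return sum(
        1 for ch in text
        if unicodedata.category(ch) not in ZERO_WIDTH_CATEGORIES
    )


def split_words(text):
    """Split on breakable whitespace only, keeping no-break spaces."""
    words, current = [], []
    for ch in text:
        if is_break_space(ch):
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


def wrap(text, columns=80):
    """Greedy wrap: start a new line when the next word would not fit."""
    lines, line, width = [], [], 0
    for word in split_words(text):
        w = display_width(word)
        if line and width + 1 + w > columns:
            lines.append(" ".join(line))
            line, width = [], 0
        width += w + (1 if line else 0)
        line.append(word)
    if line:
        lines.append(" ".join(line))
    return "\n".join(lines)


print(wrap("10\u00a0km away", 6))
print(wrap("cafe\u0301 au lait", 7))
```

In the first call the no-break space keeps "10 km" together on one line; in the second, the combining accent is not counted, so "café au" still fits in seven columns.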