Unicode's proper level of abstraction? [was: Re: VLERange:...]
Michel Fortin
michel.fortin at michelf.com
Thu Jan 13 05:47:58 PST 2011
On 2011-01-13 06:48:46 -0500, spir <denis.spir at gmail.com> said:
> Note that D's stdlib currently provides no means to do this, not even
> on the fly. You'd have to interface with eg ICU (a C/C++/Java Unicode
> library) (good luck ;-). But even ICU, as well as supposed
> unicode-aware types or libraries for any language, would not give you an
> abstraction producing correct results for Michel's example. For
> instance, Python3 code fails as miserably as any other. AFAIK, D is the
> first and only language having such a tool (Text.d at
> https://bitbucket.org/denispir/denispir-d/src/a005424f60f3).
D is not the first language dealing correctly with Unicode strings in
this manner. Objective-C's NSString class search and compare methods
deal with characters with combining marks correctly. If you want to
compare code points, you can do so explicitly using the NSLiteralSearch
option, but the default is to compare the canonical version (at the
grapheme level).
<http://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/Strings/Articles/SearchingStrings.html%23//apple_ref/doc/uid/20000149-CJBBGBAI>

In Cocoa, string sorting and case-insensitive comparison also depend on
the user's locale settings, although you can specify your own locale
if the user's locale is not what you want.
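
To make the difference between a literal (code point) comparison and a
canonical one concrete, here is a rough sketch in Python. It is not how
NSString is implemented; it only assumes that normalizing both strings to
NFC with the standard unicodedata module is an acceptable stand-in for
"compare the canonical version". The function names literal_equal and
canonical_equal are made up for this illustration.

```python
import unicodedata


def literal_equal(a: str, b: str) -> bool:
    # Compares the two strings code point by code point,
    # roughly the kind of comparison NSLiteralSearch asks for.
    return a == b


def canonical_equal(a: str, b: str) -> bool:
    # Assumption: normalizing both sides to NFC before comparing
    # approximates a canonical-equivalence comparison.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00e9"   # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

print(literal_equal(precomposed, decomposed))    # False
print(canonical_equal(precomposed, decomposed))  # True
```

Both strings display as the same character, yet only the normalized
comparison treats them as equal.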
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/