Unicode handling comparison
Wyatt
wyatt.epp at gmail.com
Wed Nov 27 08:15:52 PST 2013
On Wednesday, 27 November 2013 at 14:45:32 UTC, David Nadlinger
wrote:
>
> If you need to perform this kind of operation on Unicode
> strings in D, you can call normalize (std.uni) on the string
> first to make sure it is in one of the Normalization Forms. For
> example, just appending .normalize to your strings (which
> defaults to NFC) would make the code produce the "expected"
> results.
>
Seems like a pretty big "gotcha" from a usability standpoint;
it's not exactly intuitive. I understand WHY this decision was
made, but it feels like a source of code smell and weird string
comparison errors.
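
To make the gotcha concrete, here is a minimal sketch, assuming a Phobos
recent enough to ship std.uni's normalize. The two literals are only
illustrative: both render as "é", but one is a single precomposed code point
and the other is "e" followed by a combining acute accent.

```d
import std.uni : normalize, NFD;

void main()
{
    string composed   = "\u00E9";   // U+00E9, precomposed "é"
    string decomposed = "e\u0301";  // "e" + U+0301 combining acute accent

    // Plain == compares code units, so the two spellings differ.
    assert(composed != decomposed);

    // After normalizing both sides (NFC is the default form) they match.
    assert(composed.normalize == decomposed.normalize);

    // Any single form works, as long as both sides use the same one.
    assert(normalize!NFD(composed) == decomposed);
}
```

Nothing in the type system reminds you to normalize first, which is the part
that seems easy to get wrong.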
> As far as I'm aware, this behavior is the result of a
> deliberate decision, as normalizing strings on the fly isn't
> really cheap.
>
I don't remember if it was brought up before, but this makes me
wonder if something like an i18nString should exist for cases
where it IS important. Making i18n stuff as simple as it looks
like it "should" be has merit, IMO. (Maybe there's even room for
a std.string.i18n submodule?)
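
A rough sketch of what such a type might look like, purely as an
illustration: I18nString is a hypothetical name, nothing like it exists in
Phobos as far as I know, and a real design would need far more than
equality. The idea is just to pay the normalization cost once, at
construction, so that later comparisons are ordinary string comparisons.

```d
import std.uni : normalize;

// Hypothetical wrapper, not part of Phobos: normalizes once on construction.
struct I18nString
{
    private string data; // invariant: always stored in NFC

    this(string s)
    {
        data = s.normalize; // normalize defaults to NFC
    }

    // Both operands are already in NFC, so a plain comparison is enough.
    bool opEquals(const I18nString other) const
    {
        return data == other.data;
    }

    string toString() const
    {
        return data;
    }
}

void main()
{
    auto a = I18nString("\u00E9");   // precomposed "é"
    auto b = I18nString("e\u0301");  // "e" + combining acute accent
    assert(a == b);
}
```

Code that doesn't care about normalization keeps using plain string and pays
nothing extra.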
-Wyatt