.sort and .reverse break utf8 encoding
Walter Bright
newshound at digitalmars.com
Wed Oct 4 19:54:11 PDT 2006
Sean Kelly wrote:
> Changing the behavior of .reverse kind of makes sense, but I don't
> understand the reason for changing .sort aside from consistency.
> Personally, I've never had a reason to sort a char array in the first
> place unless the chars were intended to represent something other than
> their lexical meaning. And that aside, sorting chars in a string
> without a comparison predicate will do so using the char's binary value,
> which has no lexical significance beyond the 26 letters of the English
> alphabet (as represented in ASCII). I'm starting to feel like people
> are harping on Unicode issues just for the sake of doing so rather than
> because these are actual problems. Can someone please explain what I'm
> missing?
Collecting character usage frequency statistics is one such use. Read a
text file into a buffer, sort the buffer, and dump the result!
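A minimal sketch of that sort-and-count idea, written in Python rather than D purely for illustration. The function names `char_frequencies` and `sort_utf8_bytes` are hypothetical and are not part of any D or Phobos API. The second function only approximates what a byte-wise sort of a UTF-8 char array does, to show why the result is no longer valid UTF-8 once the text leaves ASCII.

```python
from itertools import groupby


def char_frequencies(text):
    """Count characters by sorting them and measuring each run of equal values."""
    # Sorting by code point puts equal characters next to each other.
    ordered = sorted(text)
    return [(ch, len(list(run))) for ch, run in groupby(ordered)]


def sort_utf8_bytes(text):
    """Sort the raw UTF-8 code units, ignoring character boundaries.

    Assumption: this stands in for sorting a char array element by element.
    """
    return bytes(sorted(text.encode("utf-8")))


if __name__ == "__main__":
    sample = "héllo"

    # Sorting whole characters keeps every character intact.
    print(char_frequencies(sample))

    # Sorting code units separates the two bytes that encode 'é'.
    scrambled = sort_utf8_bytes(sample)
    try:
        scrambled.decode("utf-8")
    except UnicodeDecodeError:
        print("byte-wise sort produced invalid UTF-8")
```

For pure ASCII input the two approaches agree, since every character is a single byte. They diverge only when a multi-byte sequence is present.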
I don't mind the harping on it. Getting the details right is important,
even if the details themselves aren't. Besides, it's an easy fix.