Major performance problem with std.array.front()
Vladimir Panteleev
vladimir at thecybershadow.net
Sun Mar 9 08:35:53 PDT 2014
On Sunday, 9 March 2014 at 13:00:46 UTC, monarch_dodra wrote:
> As for "the belief that iterating by code point has utility." I
> have to strongly disagree. Unicode is composed of codepoints,
> and that is what we handle. The fact that it can be encoded
> and stored as UTF is an implementation detail.
But you don't deal with Unicode. You deal with *text*. Unless you
are implementing Unicode algorithms, code points solve nothing in
the general case.
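To make the distinction concrete, here is a minimal sketch. It is in Python only because a Python str is a sequence of code points, so it behaves much like iterating a string by code point; the variable names are illustrative and not taken from any library under discussion. The same piece of text, "é", is either one or two code points depending on normalization:

```python
import unicodedata

# "é" written as 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"

# NFC normalization composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# Both strings display as the same text, yet their code point counts differ.
count_decomposed = len(decomposed)
count_composed = len(composed)

# A code-point-by-code-point comparison treats them as different strings.
equal_by_code_point = decomposed == composed

# Comparing after normalizing both sides recovers the equality a reader expects.
equal_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
```

Counting, comparing, or slicing by code point gives answers about the encoding of the text rather than about the text a reader sees.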
> Seriously, Bearophile suggested "ABCD".sort(), and it took
> about 6 pages (!) for someone to point out this would be wrong.
Sorting a string has quite limited use in the general case, so I
think this is another artificial example.
> Even Walter pointed out that such code should work. *Maybe* it
> is still wrong in regards to graphemes and normalization, but
> at *least*, the result is not a corrupted UTF-8 stream.
I think this is no worse than having all the combining marks
clustered at the end of the string, and thus attached to the last
non-combining letter.
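A minimal sketch of both outcomes, again in Python as a stand-in for by-code-point and by-code-unit iteration (the input string and variable names are my own, chosen for illustration). Sorting by code point keeps the result decodable but moves the accent onto another letter; sorting the UTF-8 code units produces a stream that no longer decodes:

```python
import unicodedata

# "áb" spelled as 'a', U+0301 COMBINING ACUTE ACCENT, 'b'.
text = "a\u0301b"

# Sort by code point: U+0301 is greater than any ASCII letter,
# so the combining mark ends up last, after 'b'.
by_code_point = "".join(sorted(text))
last_is_combining = unicodedata.combining(by_code_point[-1]) != 0

# The code-point-sorted string still round-trips through UTF-8.
round_trips = by_code_point.encode("utf-8").decode("utf-8") == by_code_point

# Sort the UTF-8 code units instead: the two bytes of U+0301 get
# separated and reordered, so decoding fails.
by_code_unit = bytes(sorted(text.encode("utf-8")))
try:
    by_code_unit.decode("utf-8")
    code_unit_sort_valid = True
except UnicodeDecodeError:
    code_unit_sort_valid = False
```

In the first case the accent that belonged to 'a' now sits on 'b'. The stream is valid, but the text is still wrong.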