Major performance problem with std.array.front()

Vladimir Panteleev vladimir at thecybershadow.net
Sun Mar 9 08:35:53 PDT 2014


On Sunday, 9 March 2014 at 13:00:46 UTC, monarch_dodra wrote:
> As for "the belief that iterating by code point has utility." I 
> have to strongly disagree. Unicode is composed of codepoints, 
> and that is what we handle. The fact that it can be encoded 
> and stored as UTF is an implementation detail.

But you don't deal with Unicode. You deal with *text*. Unless you 
are implementing Unicode algorithms, code points solve nothing in 
the general case.
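
A minimal sketch of that point, written in Python rather than D purely 
for illustration. The helper names (code_points, same_text) and the 
sample word are assumptions made up for this example, not anything from 
Phobos. It shows two strings that a reader would call the same text but 
that differ when compared code point by code point:

```python
import unicodedata

def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

def same_text(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The same word, once with a precomposed e-acute (U+00E9) and once
# with a plain 'e' followed by a combining acute accent (U+0301).
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"

print(code_points(precomposed))
print(code_points(decomposed))
print(len(precomposed), len(decomposed))   # code point counts differ
print(precomposed == decomposed)           # code point comparison
print(same_text(precomposed, decomposed))  # comparison after normalization
```

Iterating by code point gives neither the "length" nor the "equality" 
a user would expect here. Getting those requires a Unicode algorithm 
(normalization in this sketch), which is the case where code points 
are actually the right tool.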

> Seriously, Bearophile suggested "ABCD".sort(), and it took 
> about 6 pages (!) for someone to point out this would be wrong.

Sorting a string has quite limited use in the general case, so I 
think this is another artificial example.

> Even Walter pointed out that such code should work. *Maybe* it 
> is still wrong in regards to graphemes and normalization, but 
> at *least*, the result is not a corrupted UTF-8 stream.

I think this is no worse than having all the combining marks 
clustered at the end of the string, and thus attached to the last 
non-combining letter.
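
To make the clustering concrete, here is a small sketch, again in 
Python for illustration only. The function names (sort_by_code_point, 
trailing_marks) and the sample string are hypothetical. It assumes 
"sorting a string" means ordering its code points by numeric value, 
which is what a code-point-based sort would do:

```python
import unicodedata

def sort_by_code_point(text):
    """Sort a string's code points by numeric value."""
    return "".join(sorted(text))

def trailing_marks(text):
    """Count the combining marks clustered at the end of a string."""
    count = 0
    for ch in reversed(text):
        if unicodedata.combining(ch) == 0:
            break
        count += 1
    return count

# 'a' + combining acute, 'e' + combining grave, then a plain 'z'.
original = "a\u0301e\u0300z"
result = sort_by_code_point(original)

print([f"U+{ord(ch):04X}" for ch in result])
print(trailing_marks(original), trailing_marks(result))

# The output is still valid UTF-8, but both accents now follow 'z'.
assert result.encode("utf-8").decode("utf-8") == result
```

The combining marks have higher code point values than the ASCII 
letters, so they all sort to the end and attach to the "z". The 
encoding is intact, but the text is not the one you started with.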

