std.algorithm.remove and principle of least astonishment

Michel Fortin michel.fortin at michelf.com
Mon Nov 22 04:34:15 PST 2010


On 2010-11-22 06:57:36 -0500, spir <denis.spir at gmail.com> said:

> (*) Actually, once one has a string of <graphemes/codes/code-units>,
> routines are the same whatever the kind of element. There could be a
> generic version in std.string.

Just to add to the complexity: graphemes aren't always equivalent to 
user-perceived characters either. Ligatures can contain more than one 
user-perceived character. If you're looking for the substring 
"flourish" in a string, should it fail to match when it encounters 
"ﬂourish" just because of the "ﬂ" (fl) ligature? On most Mac 
applications it matches both thanks to sensible defaults in NSString's 
search and comparison algorithms.
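
To make the ligature case concrete, here is a minimal sketch in Python (not D, and not NSString's actual algorithm, which I haven't inspected). It assumes that Unicode compatibility normalization (NFKC) is an acceptable stand-in for "user-perceived" matching. The helper name find_compat is hypothetical, and the index it returns refers to the normalized text, not the original string.

```python
import unicodedata

def find_compat(haystack: str, needle: str) -> int:
    """Search after NFKC-normalizing both strings.

    Returns the index of the match in the *normalized* haystack,
    or -1 if there is no match. Hypothetical helper for illustration.
    """
    norm_haystack = unicodedata.normalize("NFKC", haystack)
    norm_needle = unicodedata.normalize("NFKC", needle)
    return norm_haystack.find(norm_needle)

# U+FB02 is the single code point LATIN SMALL LIGATURE FL.
text = "a \ufb02ourish of trumpets"

naive = text.find("flourish")            # plain code point search
compat = find_compat(text, "flourish")   # compatibility-aware search

print(naive)   # -1: the ligature is one code point, so "fl" never appears
print(compat)  # 2: NFKC expands the ligature to "f" + "l" before searching
```

Mapping the match position back to the original string is left out here; a real implementation would need it, since normalization changes the length of the text.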

So perhaps we need yet another layer over graphemes to represent 
user-perceived characters.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/


