Phobos strings versus C++ Boost

Michel Fortin michel.fortin at michelf.ca
Mon Jan 13 09:34:49 PST 2014


On 2014-01-13 17:15:21 +0000, "Dominikus Dittes Scherkl" 
<Dominikus.Scherkl at continental-corporation.com> said:

> On Sunday, 12 January 2014 at 12:48:05 UTC, Tobias Pankrath wrote:
>> On Saturday, 11 January 2014 at 21:42:46 UTC, Dmitry Olshansky wrote:
>>> 12-Jan-2014 01:22, monarch_dodra wrote:
>>> And it's indeed quite high, the amount of "bad sheep" that gets 
>>> longer/shorter across the whole Unicode is around 5-10 codepoints IIRC.
>> 
>> More important than the absolute amount of "bad sheep" is the frequency 
>> of them in your input :-)
> 
> In German the frequency of "ß" is 0.31% and the mess with getting a longer
> result ("SS") is only for toUpper().
> I think Greek has a similar problem but don't know the frequency there...

The funny thing about "ß" is that in UTF-8 it's two bytes (0xC3 0x9F) 
and you replace it with "SS" which is two bytes too (0x53 0x53). So 
with some cleverness it can be done in place for char[], but not for 
wchar[] or dchar[], where "ß" is a single code unit and "SS" takes 
two. :-)
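
To make the byte-for-byte trick concrete, here is a rough C++ sketch of what 
such an in-place pass could look like. The function name upperInPlaceUtf8 is 
made up for illustration and is not part of Phobos or Boost. It assumes the 
input is valid UTF-8, and it only handles ASCII letters and "ß"; every other 
multi-byte sequence is left alone, so it is nowhere near a full Unicode 
toUpper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Phobos or Boost API): uppercases ASCII letters
// and rewrites "ß" (UTF-8: 0xC3 0x9F) as "SS" (0x53 0x53) in the same buffer.
// Assumes valid UTF-8. All other multi-byte sequences are left untouched.
void upperInPlaceUtf8(std::string& s)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c >= 'a' && c <= 'z') {
            // Plain ASCII letter: one byte in, one byte out.
            s[i] = static_cast<char>(c - 'a' + 'A');
        } else if (c == 0xC3 && i + 1 < s.size()
                   && static_cast<unsigned char>(s[i + 1]) == 0x9F) {
            // "ß": two bytes in, two bytes out, so the length is unchanged.
            s[i] = 'S';
            s[i + 1] = 'S';
            ++i; // skip the second byte we just overwrote
        }
    }
}
```

Calling it on the bytes of "straße" should leave "STRASSE" in a buffer of the 
same size. In valid UTF-8 the byte 0xC3 can only be a lead byte, so the pair 
0xC3 0x9F cannot be mistaken for the tail of some other sequence. The same 
approach does not carry over to UTF-16 or UTF-32, where the buffer would have 
to grow by one code unit per "ß".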

-- 
Michel Fortin
michel.fortin at michelf.ca
http://michelf.ca


