Phobos strings versus C++ Boost
Michel Fortin
michel.fortin at michelf.ca
Mon Jan 13 09:34:49 PST 2014
On 2014-01-13 17:15:21 +0000, "Dominikus Dittes Scherkl"
<Dominikus.Scherkl at continental-corporation.com> said:
> On Sunday, 12 January 2014 at 12:48:05 UTC, Tobias Pankrath wrote:
>> On Saturday, 11 January 2014 at 21:42:46 UTC, Dmitry Olshansky wrote:
>>> 12-Jan-2014 01:22, monarch_dodra wrote:
>>> And it's indeed quite high, the amount of "bad sheep" that gets
>>> longer/shorter across the whole of Unicode is around 5-10 code points, IIRC.
>>
>> More important than the absolute amount of "bad sheep" is the frequency
>> of them in your input :-)
>
> In German the frequency of "ß" is 0.31%, and the mess with getting a longer
> result ("SS") only arises for toUpper().
> I think Greek has a similar problem, but I don't know the frequency there...
The funny thing about "ß" is that in UTF-8 it's two bytes (0xC3 0x9F)
and you replace it with "SS", which is two bytes too (0x53 0x53). So
with some cleverness it can be done in place for char[], but not for
wchar[] or dchar[], where "ß" is a single code unit and "SS" needs two. :-)
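
To make the byte-count argument concrete, here is a minimal sketch in C++ of the in-place trick for UTF-8 only. The function name replaceSharpSInPlace is hypothetical (it is not part of Phobos or Boost), the input is assumed to be valid UTF-8, and it handles only "ß", not the rest of an upper-casing pass:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only: overwrites every UTF-8
// encoded U+00DF "ß" (bytes 0xC3 0x9F) with "SS" (bytes 0x53 0x53).
// The string length never changes, so no reallocation or shifting is needed.
// Assumes `s` holds valid UTF-8, where the byte pair 0xC3 0x9F can only
// ever be the encoding of "ß".
// Returns the number of replacements made.
std::size_t replaceSharpSInPlace(std::string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i + 1 < s.size(); ++i) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        const unsigned char cont = static_cast<unsigned char>(s[i + 1]);
        if (lead == 0xC3 && cont == 0x9F) {
            s[i] = 'S';
            s[i + 1] = 'S';
            ++i;        // skip the byte we just overwrote
            ++count;
        }
    }
    return count;
}
```

Called on the UTF-8 bytes of "Straße", this should leave "StraSSe" in the same seven bytes. The same approach cannot work on UTF-16 or UTF-32 buffers, because there the single code unit 0x00DF would have to become two.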
--
Michel Fortin
michel.fortin at michelf.ca
http://michelf.ca