Compiler benchmarks for an alternative to std.uni.asLowerCase.

Dmitry Olshansky via Digitalmars-d digitalmars-d at puremagic.com
Thu May 12 10:58:45 PDT 2016


On 12-May-2016 17:23, Marco Leise wrote:
> On Wed, 11 May 2016 11:37:03 +0000,
> Marc Schütz <schuetzm at gmx.net> wrote:
>
>> On Monday, 9 May 2016 at 08:44:53 UTC, Dmitry Olshansky wrote:
>>> On 09-May-2016 02:38, Jon D wrote:
>>>> [...]
>>>
>>> The only problem is that it should consider multi-codepoint
>>> replacements aka full-case folding in Unicode.
>>> Otherwise - go ahead and issue a pull request to add a
>>> special case for < 0x80.
>
> In case someone wonders, an example of a multi-codepoint
> replacement is ß becoming SS in upper case.
>
>> What about locale dependent case mappings (e.g. Turkish Iı İi)?
>> Or is that currently not supported by std.uni?
>
> I second those thoughts. Important players* agree that you
> cannot do case conversion or string sorting without knowing
> the locale and sorting scheme, and Phobos, which set out to be
> "Unicode first", can't even express locales to begin with.

Proper handling of that is called tailoring in Unicode. Personally, I
don't think it would be terribly hard to go and do:

Locale locale = fetchSomeTailoredLocale(...);
locale.toLower(...); // etc. all functions as members

By the way, what we do today is what is called default handling (tables),
which is a sensible default IMO.
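
To make the idea a bit more concrete, here is a minimal sketch of how a
tailored locale could sit on top of the default tables. Everything about
Locale here is an assumption for illustration: Phobos has no such type, the
"language" field and the "tr"/"az" tags are made up for this example, and
only the simplest part of the Turkic tailoring is handled (the rule for I
followed by a combining dot above is ignored).

```d
static import std.uni;        // fully qualified so Locale.toLower can call it
import std.array : replace;

// Hypothetical type for illustration only; not part of Phobos.
struct Locale
{
    string language; // assumed language tag, e.g. "tr"; empty means untailored

    string toLower(string s) const
    {
        if (language == "tr" || language == "az")
        {
            // Turkic tailoring: dotted and dotless i are distinct letters.
            s = s.replace("\u0130", "i")    // İ -> i
                 .replace("I", "\u0131");   // I -> ı
        }
        // Everything not covered by the tailoring uses the default tables.
        return std.uni.toLower(s);
    }
}

void main()
{
    assert(Locale("tr").toLower("ISTANBUL") == "\u0131stanbul");
    assert(Locale("").toLower("ISTANBUL") == "istanbul");
}
```

The point of the design is that the tailoring is only a thin layer of
exceptions; the default tables still do most of the work.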

-- 
Dmitry Olshansky