Compiler benchmarks for an alternative to std.uni.asLowerCase.

Marco Leise via Digitalmars-d digitalmars-d at puremagic.com
Thu May 12 07:23:49 PDT 2016


On Wed, 11 May 2016 11:37:03 +0000,
Marc Schütz <schuetzm at gmx.net> wrote:

> On Monday, 9 May 2016 at 08:44:53 UTC, Dmitry Olshansky wrote:
> > On 09-May-2016 02:38, Jon D wrote:  
> >> [...]  
> >
> > The only problem is that it should consider multi-codepoint 
> > replacements aka full-case folding in Unicode.
> > Otherwise - go ahead and issue a pull request to add
> > special case for < 0x80.

In case someone wonders, an example of a multi-codepoint
replacement is ß becoming SS in upper case.
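
To make that concrete, here is a small sketch in Python (not D,
and not what std.uni does), assuming only that Python's str
methods apply the Unicode full case mappings. The helper name
changes_length is made up for illustration. It shows that a
case conversion can change the number of code points, which is
why a one-in, one-out fast path is only safe for < 0x80.

```python
# Sketch: full case mappings may turn one code point into several.
# Uses only built-in str methods; changes_length is a hypothetical helper.

def changes_length(ch: str, mapping) -> bool:
    """True if applying `mapping` to the single code point `ch`
    yields a different number of code points."""
    return len(mapping(ch)) != len(ch)

sharp_s = "\u00df"        # ß, LATIN SMALL LETTER SHARP S
dotted_cap_i = "\u0130"   # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE

upper_sharp_s = sharp_s.upper()          # two code points: "SS"
lower_dotted_i = dotted_cap_i.lower()    # "i" followed by U+0307

# Every ASCII code point maps one-to-one in both directions,
# so a special case for < 0x80 cannot run into this problem.
ascii_is_safe = not any(
    changes_length(chr(c), str.lower) or changes_length(chr(c), str.upper)
    for c in range(0x80)
)

print(upper_sharp_s, len(lower_dotted_i), ascii_is_safe)
```

Note that the second case goes in the lower-case direction, so
this is not only an upper-casing issue.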

> What about locale dependent case mappings (e.g. Turkish Iı İi)? 
> Or is that currently not supported by std.uni?

I second those thoughts. Important players* agree that you
cannot do case conversion or string sorting without knowing
the locale and sorting scheme, yet Phobos, which set out to be
"Unicode first", can't even express locales to begin with.

*
 C++:
 http://en.cppreference.com/w/cpp/locale/tolower
 POSIX:
 http://pubs.opengroup.org/onlinepubs/9699919799/functions/tolower.html
 Java:
 https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#toLowerCase%28java.util.Locale%29
 Go:
 Recently added special handling of Turkish, but the lack of
 full case folding is still noted in a "BUG" comment in the code.
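
As a rough illustration of the Turkish case mentioned above,
here is a Python sketch of what a locale parameter changes.
The function to_lower and its locale argument are hypothetical,
not an existing API in Python or Phobos, and the override only
covers the two plain letters I and İ. The full Unicode rules
for Turkish and Azeri also have context-dependent entries for
combining dots, which this ignores.

```python
# Sketch of locale-dependent lowercasing. to_lower and its `locale`
# parameter are hypothetical; this is a simplified illustration only.

def to_lower(text: str, locale: str = "") -> str:
    """Lowercase `text`, applying the Turkish/Azeri I mappings
    when `locale` starts with "tr" or "az" (e.g. "tr_TR")."""
    language = locale.replace("-", "_").split("_")[0].lower()
    if language in ("tr", "az"):
        # İ (U+0130) lowercases to plain i, and I to dotless ı (U+0131).
        text = text.replace("\u0130", "i").replace("I", "\u0131")
    # Everything else uses the locale-independent default mapping.
    return text.lower()

default_result = to_lower("DI\u015e")           # default mapping: I becomes i
turkish_result = to_lower("DI\u015e", "tr_TR")  # Turkish mapping: I becomes ı
print(default_result, turkish_result)
```

The same input gives two different, equally "correct" answers,
so the caller has to be able to say which locale is meant.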

The actual discussion on std.locale is here:
http://forum.dlang.org/thread/gof4ft$2odv$1@digitalmars.com

-- 
Marco


