About std.ascii.toLower

Don Clugston dac at nospam.com
Thu Sep 27 01:10:52 PDT 2012


On 20/09/12 18:57, Jonathan M Davis wrote:
> On Thursday, September 20, 2012 18:35:21 bearophile wrote:
>> monarch_dodra:
>>> It's not, it only *operates* on ASCII, but non ascii is still a
>>> legal arg:
>>
>> Then maybe std.ascii.toLower needs a pre-condition that
>> constrains it to just ASCII inputs, so it's free to return a
>> char.
>
> Goodness no.
>
> 1. Operating on a char is almost always the wrong thing to do. If you really
> want to do that, then cast. It should _not_ be encouraged.
>
> 2. It would be disastrous if std.ascii's functions didn't work on unicode.
> Right now, you can use them with ranges on strings which are unicode, which
> can be very useful. I grant you that that's more obvious with something like
> isDigit than toLower, but regardless, std.ascii is designed such that its
> functions will all operate on unicode strings. It just doesn't alter unicode
> characters and returns false for them with any of the query functions.

Are there any use cases of toLower() on non-ASCII strings?
Seriously? I think it's _always_ a bug.

At the very least that function should have a name like 
toLowerIgnoringNonAscii() to indicate that it is performing a really, 
really foul operation.

The fact that toLower("Ü") doesn't generate an error, but doesn't return 
"ü" is a wrong-code bug IMHO. It isn't any better than if it returned a 
random garbage character (eg, it's OK in my opinion for ASCII toLower to 
consider only the lower 7 bits).
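
To make the behaviour concrete, here is a minimal sketch contrasting the two functions on a single code point. It assumes a reasonably recent Phobos in which std.ascii.toLower and std.uni.toLower both accept a dchar; the static imports are only there to keep the two same-named functions apart.

```d
static import std.ascii;
static import std.uni;

void main()
{
    dchar c = 'Ü';

    // std.ascii.toLower passes non-ASCII code points through unchanged:
    // no error, and no lowercase 'ü' either.
    assert(std.ascii.toLower(c) == 'Ü');

    // std.uni.toLower applies the Unicode case mapping.
    assert(std.uni.toLower(c) == 'ü');

    // For plain ASCII letters the two functions agree.
    assert(std.ascii.toLower('A') == 'a');
    assert(std.uni.toLower('A') == 'a');
}
```

So the silent pass-through only shows up once a non-ASCII character reaches the ASCII function.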

OTOH I can see some value in a cased ASCII vs unicode comparison.
ie, given an ASCII string and a unicode string, do a case-insensitive 
comparison, eg look for
"<HTML>" inside "öähaøſ€đ@<html>ſŋħŋ€ł¶"
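
A rough sketch of what such a comparison could look like, assuming the needle is pure ASCII. The helper name containsAsciiIgnoreCase is made up for illustration and is not part of Phobos; it only uses std.algorithm.searching.canFind with a custom predicate.

```d
import std.algorithm.searching : canFind;
static import std.ascii;

/// Hypothetical helper: true if `needle` (assumed to be pure ASCII) occurs
/// in `haystack`, ignoring case for ASCII letters only. Non-ASCII code
/// points in the haystack are compared exactly, since std.ascii.toLower
/// leaves them unchanged.
bool containsAsciiIgnoreCase(string haystack, string needle)
{
    return haystack.canFind!(
        (a, b) => std.ascii.toLower(a) == std.ascii.toLower(b))(needle);
}

void main()
{
    assert(containsAsciiIgnoreCase("öähaøſ€đ@<html>ſŋħŋ€ł¶", "<HTML>"));
    assert(!containsAsciiIgnoreCase("öähaøſ€đ@<htm>", "<HTML>"));
}
```

Here the ASCII-only lowering is deliberate: the needle is known to be ASCII, so leaving the other code points alone is exactly what is wanted.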



More information about the Digitalmars-d-learn mailing list