Replacing tango.text.Ascii.isearch

Tue Oct 25 04:17:46 UTC 2022

On Thursday, 13 October 2022 at 08:27:17 UTC, bauss wrote:
>> ```d
>> bool isearch(S1, S2)(S1 haystack, S2 needle)
>> {
>>     import std.uni;
>>     import std.algorithm;
>>     return haystack.asLowerCase.canFind(needle.asLowerCase);
>> }
>> ```
>>
>> untested.
>>
>> -Steve
>
> This doesn't actually work properly in all languages. It will 
> probably work in most, but it's not entirely correct.
>
> Ex. Turkish will not work with it properly.
>
> Very interesting article: 
> http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html

Wow, I didn't expect anything like this. I thought that the 
nightmares of handling 8-bit codepages for non-English languages 
had ceased to exist by now. Too bad. What are the best practices 
for dealing with Turkish text in D?

For example, the [Ukrainian letters 'і' and 
'І'](https://en.wikipedia.org/wiki/Dotted_I_(Cyrillic)) don't 
share code points with Latin 'i' and 'I', and this works fine, 
except for a possible [phishing 
opportunity](https://www.theguardian.com/technology/2017/apr/19/phishing-url-trick-hackers). 
Why haven't the standards committees done the same for the 
Turkish 'I' yet?
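
To illustrate what I mean by separate codes, here is a minimal sketch. The `codePoints` helper is just something I made up for this post, and the Cyrillic letters are written as `\u` escapes so they can't be confused with their Latin look-alikes. It only shows that the per-character case mapping for these letters needs no language information, because each script has its own pair of code points.

```d
import std.array : join;
import std.format : format;
import std.stdio : writeln;
import std.uni : toUpper;

// Hypothetical helper: lists the code points of a string as "U+XXXX".
string codePoints(string s)
{
    string[] parts;
    foreach (dchar c; s) // decodes UTF-8 into code points
        parts ~= format("U+%04X", cast(uint) c);
    return parts.join(" ");
}

void main()
{
    dchar latinLower = 'i';
    dchar latinUpper = 'I';
    dchar cyrillicLower = '\u0456'; // Ukrainian 'і'
    dchar cyrillicUpper = '\u0406'; // Ukrainian 'І'

    writeln(codePoints("i\u0456")); // prints "U+0069 U+0456"
    writeln(codePoints("I\u0406")); // prints "U+0049 U+0406"

    // Each lowercase letter maps to the uppercase letter of its own script.
    assert(toUpper(latinLower) == latinUpper);
    assert(toUpper(cyrillicLower) == cyrillicUpper);
}
```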

As for the [German letter 
'ß'](https://en.wikipedia.org/wiki/%C3%9F), Wikipedia says that 
the uppercase variant 'ẞ' has existed since 2008 (ISO 10646). Do 
German people use it now?

```D
import std;
void main() {
   "ß".asUpperCase.writeln;             // prints "SS"
   "ẞ".asLowerCase.writeln;             // prints "ß"
   "ẞ".asLowerCase.asUpperCase.writeln; // prints "SS"
}
```