Why is std.regex slow, well here is one reason!

FeepingCreature feepingcreature at gmail.com
Mon Feb 27 09:01:52 UTC 2023


On Saturday, 25 February 2023 at 13:19:55 UTC, Patrick Schluter 
wrote:
> Languages are complex and often contradictory. The moment you 
> want to handle, for example, letter case, you're in for the 
> complexity. 
> Uppercase i is different in Turkish than in any other language. 
> ß does not have uppercase (uppercase is SS) but has a titlecase 
> (titlecase is not the same thing as uppercase) ß. Changing 
> cases is not reversible in general (Greek has two lower case 
> sigma but only one uppercase, German again with ß, which 
> becomes SS in uppercase, but not all SS can be ß when 
> lowercased). These were just some simple examples in Latin 
> scripts.
> Unicode is complex because language is complex. Is it perfect? 
> No. Is it bad? Far from it.

Note: ß has an official uppercase version in German, ẞ, that can 
be used in parallel to SS since 2017, and is preferred since 2020.
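
For illustration, here is a minimal sketch of the non-reversible case 
mappings mentioned above. It uses Python rather than D purely for 
brevity, relies only on the built-in, locale-independent str.upper() 
and str.lower(), and the helper name upper_then_lower is made up for 
this example.

```python
def upper_then_lower(s: str) -> str:
    """Uppercase a string and lowercase it again (hypothetical helper)."""
    return s.upper().lower()


# Lowercase ß expands to two letters when uppercased, so the
# original spelling cannot be recovered afterwards.
sharp_s_upper = "ß".upper()                      # "SS"
strasse_round_trip = upper_then_lower("straße")  # "strasse"

# Final sigma and medial sigma share a single uppercase form.
sigma_round_trip = upper_then_lower("ς")         # "σ"

# The capital ẞ (U+1E9E) lowercases to ß, but ß does not
# uppercase back to it under the default mapping.
capital_sharp_s_lower = "ẞ".lower()              # "ß"

print(sharp_s_upper, strasse_round_trip, sigma_round_trip,
      capital_sharp_s_lower)
```

Note that the default mapping still turns ß into SS. Getting ẞ instead 
needs tailoring on top of the default tables.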

