Why is std.regex slow, well here is one reason!

Dmitry Olshansky dmitry.olsh at gmail.com
Thu Mar 2 07:49:56 UTC 2023


On Thursday, 2 March 2023 at 07:35:06 UTC, Dmitry Olshansky wrote:
> On Friday, 24 February 2023 at 20:44:17 UTC, Walter Bright 
> wrote:
>> On 2/24/2023 12:05 PM, Max Samukha wrote:
>>> Is Latin 'A' the same character as Cyrillic 'A'? Should they 
>>> have the same code?
>>
>> It's the same glyph, and so should have the same code. The 
>> definitive test is, when printed out or displayed, can you see 
>> a difference? If the answer is "no" then they should be the 
>> same code.
>
> You’d be surprised but there are typesets where Cyrillic A is 
> visually different from ASCII A.

Also, your idea of “what it looks like on paper” is basically NFKC 
or NFKD, i.e. compatibility normalization, which folds compatibility 
lookalikes (fullwidth forms, ligatures and the like) into a common 
code point sequence.
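
To make this concrete, here is a minimal sketch in Python (not D, 
purely because its standard unicodedata module makes the check short). 
The sample characters and the helper name nfkc are my own choices for 
illustration. It suggests that NFKC folds a fullwidth 'A' and the "fi" 
ligature, while Latin 'A' and Cyrillic 'А' remain distinct code points 
even after normalization.

```python
import unicodedata

LATIN_A = "\u0041"      # LATIN CAPITAL LETTER A
CYRILLIC_A = "\u0410"   # CYRILLIC CAPITAL LETTER A
FULLWIDTH_A = "\uFF21"  # FULLWIDTH LATIN CAPITAL LETTER A
FI_LIGATURE = "\uFB01"  # LATIN SMALL LIGATURE FI


def nfkc(s: str) -> str:
    """Hypothetical helper: apply NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", s)


# Compatibility variants fold to their plain counterparts.
fullwidth_folds = nfkc(FULLWIDTH_A) == LATIN_A
ligature_expanded = nfkc(FI_LIGATURE)

# Cross-script lookalikes are left alone by normalization.
cyrillic_folds = nfkc(CYRILLIC_A) == LATIN_A

print(fullwidth_folds, repr(ligature_expanded), cyrillic_folds)
```

So even with compatibility normalization applied, a program comparing 
code points still sees Latin and Cyrillic 'A' as different characters.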

I would insist that there are times when “looks the same” is not 
a good criterion. Programs typically do not have the context that 
we as humans use to disambiguate.

>
>> Dmitry Olshansky



