Poll of the week: How should std.regex handle unknown escape

Dmitry Olshansky dmitry.olsh at gmail.com
Sat Dec 3 09:26:33 PST 2011


On 03.12.2011 21:00, Vladimir Panteleev wrote:
> On Sat, 03 Dec 2011 17:51:13 +0200, Dmitry Olshansky
> <dmitry.olsh at gmail.com> wrote:
>
>> treat every \<something> as plain <something> (ignoring \) inside
>> character classes [] if it's not a known escape sequence like \w, \d,
>> \uXXXX, \W, \cA -\cZ and so on.
>
> I think the common intuitive rules regarding escapes in regexes are as
> follows:
>
> 1) Unescaped punctuation usually has special meaning (so people often
> escape all punctuation literals)
> 2) Unescaped letters are literal
> 3) Escaped punctuation is literal
> 4) Escaped letters have special meaning
>
Looks quite sane.

> Therefore, I think that std.regex should throw on unrecognized *letter*
> escapes. It's very likely that the user might be trying to use a
> character class or feature from another regex engine, but unsupported by
> std.regex.
>
Yes, that is another argument against a fully blind approach.
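
To make the policy under discussion concrete, here is a minimal sketch of the four rules plus the "throw on unrecognized letter escapes" proposal. It is written in Python purely as an illustration and is not std.regex code. The names `KNOWN_LETTER_ESCAPES`, `UnknownEscapeError` and `classify_escapes` are hypothetical, the list of known letters is an assumed placeholder rather than the set std.regex actually supports, and treating escaped digits as special (backreference-like) is likewise an assumption.

```python
import string

# Assumed placeholder set of recognised letter escapes; a real engine's list differs.
KNOWN_LETTER_ESCAPES = set("wWdDsSbBnrtfvuxc")


class UnknownEscapeError(ValueError):
    """Raised for a backslash followed by a letter the engine does not know."""


def classify_escapes(pattern):
    """Return a list of (escape, kind) pairs for every backslash escape in pattern.

    kind is "special" for known letter escapes and digits, "literal" for
    escaped punctuation. Unknown letter escapes raise UnknownEscapeError.
    """
    result = []
    i = 0
    while i < len(pattern):
        if pattern[i] != "\\":
            i += 1  # unescaped character, not examined in this sketch
            continue
        if i + 1 >= len(pattern):
            raise UnknownEscapeError("dangling backslash at end of pattern")
        ch = pattern[i + 1]
        if ch in string.ascii_letters:
            # Rule 4: escaped letters carry special meaning, so an unknown one
            # is probably a feature borrowed from another engine: reject it.
            if ch not in KNOWN_LETTER_ESCAPES:
                raise UnknownEscapeError(
                    "unrecognized escape \\%s at index %d" % (ch, i))
            kind = "special"
        elif ch.isdigit():
            # Assumption: digits are treated as special (backreference-like).
            kind = "special"
        else:
            # Rule 3: escaped punctuation is always a plain literal.
            kind = "literal"
        result.append(("\\" + ch, kind))
        i += 2  # skip the backslash and the escaped character
    return result
```

With this sketch, `classify_escapes(r"\d+\.\w")` accepts all three escapes, while a pattern such as `r"\p{L}"` is rejected because `p` is not in the assumed list, instead of being silently matched as a literal `p`.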

-- 
Dmitry Olshansky

