regex - match/matchAll and bmatch - different output

anonymous via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Jan 1 04:29:01 PST 2016


On 30.12.2015 12:06, Ivan Kazmenko wrote:
> import std.regex, std.stdio;
> void main ()
> {
>      writeln (bmatch   ("abab",  r"(..).*\1"));  // [["abab", "ab"]]
>      writeln (match    ("abab",  r"(..).*\1"));  // [["abab", "ab"]]
>      writeln (matchAll ("abab",  r"(..).*\1"));  // [["abab", "ab"]]
>      writeln (bmatch   ("xabab", r"(..).*\1"));  // [["abab", "ab"]]
>      writeln (match    ("xabab", r"(..).*\1"));  // []
>      writeln (matchAll ("xabab", r"(..).*\1"));  // []
> }
>
> As you can see, bmatch (usage discouraged in the docs) gives me the
> result I want, but match (also discouraged) and matchAll (way to go) don't.
>
> Am I misusing matchAll, or is this a bug?

The `\1` there is a backreference. Strictly speaking, backreferences go 
beyond regular expressions in the formal sense: they let a pattern 
describe languages that are not regular. [1]
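
To make concrete what the pattern from the question asks for, here is a 
rough sketch that checks the same condition by hand, without std.regex: 
some two-character slice has to occur again later in the string. The 
helper name hasRepeatedPair is made up for illustration, and the sketch 
assumes ASCII input without newlines.

```d
import std.algorithm.searching : canFind;
import std.stdio : writeln;

// Hypothetical helper, not part of Phobos: true if some two-character
// slice of `s` occurs again at or after the position where it ends.
// This mirrors what r"(..).*\1" is meant to accept (ASCII input assumed).
bool hasRepeatedPair(string s)
{
    foreach (i; 0 .. s.length)
    {
        // Need two characters for the group plus two for the repeat.
        if (i + 4 > s.length)
            break;
        if (s[i + 2 .. $].canFind(s[i .. i + 2]))
            return true;
    }
    return false;
}

void main()
{
    writeln(hasRepeatedPair("abab"));   // true
    writeln(hasRepeatedPair("xabab"));  // true
    writeln(hasRepeatedPair("abcd"));   // false
}
```

By this reading, "xabab" should match just like "abab" does, which is 
what bmatch reports in your example.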

As far as I know, bmatch uses a backtracking matcher, which is the 
widespread approach, while match/matchAll use a different, less common 
engine. It wouldn't surprise me if that engine simply didn't support 
backreferences.

Backreferences are not documented, as far as I can see, but they do 
work in other patterns. So, yeah, this is possibly a bug.
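
If you want to narrow it down before filing a report, something like 
the following might help. It is only a sketch: the extra inputs and the 
shorter pattern are my own picks, not anything from the docs, and I make 
no claim about what each line prints, since that may differ between 
Phobos versions.

```d
import std.regex, std.stdio;

void main()
{
    // Illustrative inputs: the same repeated pair with zero, one and
    // two unrelated leading characters.
    foreach (input; ["abab", "xabab", "xxabab"])
    {
        // Backreference directly after the group.
        writeln(input, " / (..)\\1: ", matchAll(input, r"(..)\1"));
        // The pattern from the question, with .* in between.
        writeln(input, " / (..).*\\1: ", matchAll(input, r"(..).*\1"));
    }
}
```

Comparing the lines for "abab" against the ones with leading characters 
should show whether the problem is tied to the match not starting at 
the beginning of the input.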


[1] https://en.wikipedia.org/wiki/Regular_expression#Patterns_for_non-regular_languages

