regex - match/matchAll and bmatch - different output
Ivan Kazmenko via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Sat Jan 2 09:33:41 PST 2016
On Friday, 1 January 2016 at 12:29:01 UTC, anonymous wrote:
> On 30.12.2015 12:06, Ivan Kazmenko wrote:
>> As you can see, bmatch (usage discouraged in the docs) gives me the
>> result I want, but match (also discouraged) and matchAll (way to go)
>> don't.
>>
>> Am I misusing matchAll, or is this a bug?
>
> The `\1` there is a backreference. Backreferences are not part
> of regular expressions, in the sense that they allow you to
> describe more than regular languages. [1]
>
> As far as I know, bmatch uses a widespread matching mechanism,
> while match/matchAll use a different, less common one. It
> wouldn't surprise me if match/matchAll simply didn't support
> backreferences.
>
> Backreferences are not documented, as far as I can see, but
> they're working in other patterns. So, yeah, this is possibly a
> bug.
>
>
> [1]
> https://en.wikipedia.org/wiki/Regular_expression#Patterns_for_non-regular_languages

The overview by the module author
(http://dlang.org/regular-expression.html) does mention in the
last paragraph that backreferences are supported. They look like
a common feature in other programming languages, too.

The "\1" part works correctly in other cases: the pattern matches
"abab", "abxab" and "ababx" but not "abac". This means it is
probably intended to work, and handling "xabab" incorrectly is a bug.

Also, as I understand it from the docs, matchAll/matchFirst use
the most appropriate of match/bmatch internally. So if match does
not properly support this particular backreference but bmatch
does, the bug is in choosing the wrong one to handle the pattern.

At any rate, a wrong result with an 8-character pattern produces a
"regex don't work" impression, and I hope something can be done
about it.
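
For anyone who wants to check this on their own compiler, here is a
minimal sketch that compares matchFirst with bmatch on the same inputs.
The pattern `(ab)\1` is a hypothetical stand-in, not the pattern from
the original report, and the output may differ between Phobos versions,
so none is listed here.

```d
import std.regex;
import std.stdio;

void main()
{
    // Hypothetical pattern (an assumption, not the one from the report):
    // group 1 captures "ab", and \1 must match the same text again.
    auto re = regex(`(ab)\1`);

    foreach (s; ["abab", "xabab", "abac"])
    {
        // matchFirst leaves the choice of engine to std.regex;
        // bmatch (discouraged in the docs) uses the backtracking engine.
        bool viaMatchFirst = !matchFirst(s, re).empty;
        bool viaBmatch = !bmatch(s, re).empty;
        writefln("%-6s matchFirst=%s bmatch=%s", s, viaMatchFirst, viaBmatch);
    }
}
```

If the two columns disagree for any input, that input would be a
candidate test case for a bug report.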