[issue std.regex] Fail to match with negative look-ahead assertion when tracking down on a delimiter

k-five via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Tue May 16 08:11:18 PDT 2017


I wanted to post this at:
https://issues.dlang.org/
but even though I registered there, I could not log in.

------------------------------------------

As far as my limited knowledge of RegExp goes, the two patterns 
below are equivalent:

[ 1 ]:
^(?:[ab]|ab)(.)(?:(?!\1).)+\1$

[ 2 ]:
^(?:ab|[ab])(.)(?:(?!\1).)+\1$

but matching with pattern [ 1 ] returns false and with pattern [ 2 ] 
returns true, whereas both should return true:


------------------------------------------------------------------
code:

import std.regex;
import std.stdio;

void main( immutable string[] args ){
	
	immutable string str = "ab some-word ";
	Regex!( char ) rx = regex( `^(?:[ab]|ab)(.)(?:(?!\1).)+\1$` );
	immutable bool     b1 = !matchFirst( str, rx ).empty();

	writeln( b1 );	// false ( should be true )
	
	rx = regex( `^(?:ab|[ab])(.)(?:(?!\1).)+\1$` );
	immutable bool     b2 = !matchFirst( str, rx ).empty();

	writeln( b2 );	// true
	
}
------------------------------------------------------------------
Demo on regex101.com:
https://regex101.com/r/JV9Ju1/1
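
Why both should be true: with [ab] as the first alternative, a backtracking engine consumes 'a', captures 'b' as the delimiter, and then fails at the end of the string because it ends in a space, not 'b'. It then has to go back into the alternation, take "ab" instead, capture the space, and succeed. As a cross-check that does not depend on std.regex, here is a minimal sketch of what the pattern is meant to accept. The function name shouldMatch is hypothetical (it is not in Phobos), and the sketch assumes ASCII input without line breaks:

```d
import std.algorithm.searching : canFind, startsWith;

/// Hypothetical reference check (not part of std.regex): returns true when
/// `s` has the shape  prefix ~ d ~ middle ~ d,  where prefix is "a", "b" or
/// "ab", d is any single character, and middle is one or more characters
/// that all differ from d. Assumes ASCII input without line breaks.
bool shouldMatch(string s)
{
    // The order of the prefixes cannot change the result, because every
    // prefix is tried before giving up.
    foreach (prefix; ["a", "b", "ab"])
    {
        if (!s.startsWith(prefix))
            continue;
        string rest = s[prefix.length .. $];
        if (rest.length < 3) // need d, at least one middle character, and d
            continue;
        immutable d = rest[0];
        if (rest[$ - 1] != d)
            continue;
        if (!rest[1 .. $ - 1].canFind(d))
            return true;
    }
    return false;
}

unittest
{
    assert(shouldMatch("ab some-word "));  // the input used in the code above
    assert(!shouldMatch("ab some-word"));  // no closing delimiter
}
```

Compiling with -unittest runs the two checks. The point of the sketch is only that the order of the alternatives cannot matter for a yes/no match, so both patterns ought to agree on "ab some-word ".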

The main problem is not related to the character class [] itself, 
since matching succeeds with both of the following:

^(?:ab|[ab])(.)-\1$

^(?:[ab]|ab)(.)-\1$

but with (.)(?:(?!\1).) the match fails if the character class 
comes first in the alternation.

I am not sure, but it may be the same bug that GCC had in versions 
below 5.3.0.

Here is my question on Stack Overflow, where I found out about that bug:
http://stackoverflow.com/questions/42627957/the-same-regex-but-different-results-on-linux-and-windows-only-c



