[Issue 15773] New: D's treatment of whitespace in character classes in free-form regexes is not the same as Perl's

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Sun Mar 6 04:10:08 PST 2016


https://issues.dlang.org/show_bug.cgi?id=15773

          Issue ID: 15773
           Summary: D's treatment of whitespace in character classes in
                    free-form regexes is not the same as Perl's
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Linux
            Status: NEW
          Severity: minor
          Priority: P1
         Component: phobos
          Assignee: nobody at puremagic.com
          Reporter: d20160306.20.mlaker at spamgourmet.com

In Perl, whitespace in a character class is always significant, even in /x
extended mode:

msl at james:~$ perl -wE 'say "Matched" if "a b" =~ /[c d]/'
Matched
msl at james:~$ perl -wE 'say "Matched" if "a b" =~ /[c d]/x'
Matched
msl at james:~$

D's std.regex ignores whitespace inside a character class in "x" free-form mode:

msl at james:~$ rdmd --eval='auto rx = regex("[c d]", ""); "a b".matchFirst(rx).writeln'
[" "]
msl at james:~$ rdmd --eval='auto rx = regex("[c d]", "x"); "a b".matchFirst(rx).writeln'
[]
msl at james:~$ rdmd --eval='auto rx = ctRegex!("[c d]", ""); "a b".matchFirst(rx).writeln'
[" "]
msl at james:~$ rdmd --eval='auto rx = ctRegex!("[c d]", "x"); "a b".matchFirst(rx).writeln'
[]
msl at james:~$
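
A possible workaround, sketched here rather than taken from the std.regex
docs: write the space as the hex escape \x20, so that the class contains no
literal whitespace for free-form mode to drop. The helper name hasMatch is
made up for this illustration, and the sketch assumes that std.regex accepts
\xXX escapes inside a character class.

```d
import std.regex : regex, matchFirst;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): true if `pattern`, compiled
// with `flags`, matches anywhere in `input`.
bool hasMatch(string input, string pattern, string flags)
{
    return !input.matchFirst(regex(pattern, flags)).empty;
}

void main()
{
    // Literal space in the class, free-form mode: the case reported above.
    writeln(hasMatch("a b", "[c d]", "x"));

    // Assumed workaround: the space is spelled \x20, so there is no literal
    // whitespace in the pattern for "x" mode to skip.
    writeln(hasMatch("a b", `[cd\x20]`, "x"));
}
```

The escaped form should behave the same with or without the "x" flag, which
makes it a safer spelling for patterns shared between the two modes.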

I wasted an hour's debugging time because I didn't expect this difference: I
thought whitespace would always be significant inside a character class.
Perhaps other developers will share that expectation. I don't suggest that we
change the behaviour of std.regex, because it would break too much existing
code, but could we explicitly mention D's behaviour in the docs?
Many thanks.
