[Issue 13268] New: Implement greedy alternation in std.regex
    via Digitalmars-d-bugs 
    digitalmars-d-bugs at puremagic.com
       
    Thu Aug  7 11:15:14 PDT 2014
    
    
  
https://issues.dlang.org/show_bug.cgi?id=13268
          Issue ID: 13268
           Summary: Implement greedy alternation in std.regex
           Product: D
           Version: D2
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: Phobos
          Assignee: nobody at puremagic.com
          Reporter: hsteoh at quickfur.ath.cx
Currently, the | operator works on a first-match basis, so a pattern like
(ab)|(abcd) will never match the second alternative: wherever (abcd) could
match, (ab) matches first.
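
The first-match behavior is easy to reproduce. The sketch below is only an
illustration and does not use std.regex: it uses Python's re module, which
also takes the first alternative that matches. The variable names are made up
for this example.

```python
import re

# Same pattern as in the report: the shorter alternative is listed first.
pattern = re.compile(r"(ab)|(abcd)")

m = pattern.match("abcd")
matched_text = m.group(0)   # text consumed by the whole match
first_group = m.group(1)    # capture of (ab)
second_group = m.group(2)   # capture of (abcd); None when that branch is not taken

print(matched_text, first_group, second_group)   # ab ab None

# Listing the longer alternative first works around it for this input.
reordered = re.compile(r"(abcd)|(ab)").match("abcd").group(0)
print(reordered)                                  # abcd
```

Reordering helps here only because it is known in advance which alternative
is longer; for general patterns that is not always possible, which is the
motivation for this request.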
It would be nice if there were a way to do greedy matching between
alternatives, such that an alternation a|b|c|... always prefers the longest
match.
This will probably have performance implications, so perhaps a "greedy
alternation" operator distinct from | should be used. Something like |* might
be a possible syntax: (ab)|*(abcd) will capture (abcd) if the input contains
"abcd", and fall back to (ab) only if the input doesn't contain "abcd" but
does contain "ab".
Precedents for greedy alternation include lex / flex, which take a list of
input regexen and always perform longest-match on them. In essence, given a
list of patterns P1, P2, ..., the equivalent of P1 |* P2 |* ... is performed.
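
The longest-match rule can be emulated on top of a first-match engine by
trying each alternative separately and keeping the longest result. The sketch
below is an assumption-laden illustration, not a proposed std.regex
implementation: it is written in Python with the re module, longest_match is
a hypothetical helper name, and it matches at a fixed position rather than
searching the whole input. Ties go to the earliest pattern, which is how
lex / flex resolve them.

```python
import re

def longest_match(patterns, text, pos=0):
    """Return (index, matched_text) for the longest match at pos, or None.

    Each pattern is matched independently with the engine's normal rules;
    only the choice between patterns is longest-first. On equal length the
    earliest pattern wins.
    """
    best = None
    for index, source in enumerate(patterns):
        m = re.compile(source).match(text, pos)
        if m is None:
            continue
        # Strictly greater, so an earlier pattern keeps a tie.
        if best is None or len(m.group(0)) > len(best[1]):
            best = (index, m.group(0))
    return best

print(longest_match(["ab", "abcd"], "abcd"))   # (1, 'abcd')
print(longest_match(["ab", "abcd"], "abxx"))   # (0, 'ab')
print(longest_match(["ab", "abcd"], "xyz"))    # None
```

This runs every alternative on its own, so the cost grows with the number of
patterns. lex / flex avoid that by compiling all patterns into one automaton;
the sketch only shows the selection rule, not that optimization.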