Very Stupid Regex question

H. S. Teoh via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Aug 7 10:42:13 PDT 2014


On Thu, Aug 07, 2014 at 05:33:42PM +0000, Justin Whear via Digitalmars-d-learn wrote:
> On Thu, 07 Aug 2014 10:22:37 -0700, H. S. Teoh via Digitalmars-d-learn
> wrote:
> 
> > 
> > So basically you have a file containing regex patterns, and you want
> > to find the longest match among them?
> 
> > 	// Longer patterns match first
> > 	patterns.sort!((a,b) => a.length > b.length);
> > 
> > 	// Build regex
> > 	string regexStr = "%((%(%c%))%||%)".format(patterns);
> > 	auto re = regex(regexStr);
> 
> This only works if the patterns are simple literals.  E.g. the pattern
> 'a +' might match a longer sequence than 'aaa'.  If you're out for the
> longest possible match, iteratively testing each pattern is probably
> the way to go.

Hmm, you're right. I was a bit disappointed to find out that the |
operator in std.regex (and also in Perl's regex) doesn't do
longest-match but first-match. :-( I had always thought it did
longest-match, like in lex/flex.

I wish we could extend std.regex to allow longest-match for
alternations... but there may be performance consequences.
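
In the meantime, the iterative approach suggested above could look
roughly like the following. This is only a minimal sketch: the helper
name longestMatch is made up for illustration, and it assumes every
pattern should be tried at the start of the input (hence the added ^
anchor) and that the patterns are valid std.regex syntax.

```d
import std.regex : regex, matchFirst;
import std.stdio : writeln;

/// Hypothetical helper: tries every pattern at the start of `input`
/// and returns the longest matched text, or null if nothing matches.
string longestMatch(string input, string[] patterns)
{
    string best;
    foreach (pat; patterns)
    {
        // Wrap in a non-capturing group and anchor it, so that all
        // candidates are compared from the same starting position.
        auto m = matchFirst(input, regex("^(?:" ~ pat ~ ")"));
        if (!m.empty && m.hit.length > best.length)
            best = m.hit;
    }
    return best;
}

void main()
{
    // 'a+' wins over the literal 'aaa' because it consumes more input.
    writeln(longestMatch("aaaaab", ["aaa", "a+"])); // prints "aaaaa"
}
```

This compiles each pattern on every call, so for a large pattern file
it would probably be worth building the Regex objects once up front and
reusing them.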


T

-- 
There's light at the end of the tunnel. It's the oncoming train.

