Why does "*" cause my tiny regextester program to crash?
Alex Folland
lexlexlex at gmail.com
Sun Jan 30 21:50:00 PST 2011
On 2011-01-30 21:47, Vladimir Panteleev wrote:
> On Mon, 31 Jan 2011 03:57:44 +0200, Alex Folland <lexlexlex at gmail.com>
> wrote:
>
>> I wrote this little program to test for regular expression matches. I
>> compiled it in Windows with DMD 2.051 through Visual Studio 2010
>> with Visual D. It crashes if regexbuf is just the single character,
>> "*". Why? Shouldn't it match the entire string?
>
> "*" in regular expressions means 0 or more instances of the previous
> entity:
> http://www.regular-expressions.info/repeat.html
> It doesn't make sense at the start of an expression. ".*" is the regexp
> that matches anything[1].
>
> std.regex probably can't handle invalid regexps very well. Note that
> std.regex is a new module that intends to replace the older std.regexp,
> but still has some problems.
Okay, so that particular regex is invalid. You're right, though: it still
shouldn't crash. How can I keep my program from crashing without fixing
std.regex (code I definitely don't trust myself to touch)? Would the
scope statement be useful? I still can't figure out exactly what it
does. I tried putting scope(exit) writeln("Bad regex."); just before my
foreach loop, but it still crashes. I then tried changing "exit" to
"failure", but that didn't help either; same behavior. Am I using scope
wrong?
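For reference, here is a minimal sketch of how scope guards differ from try/catch. It assumes the failure surfaces as a D Exception, which the thread does not confirm. matchOrThrow, run and safeRun are hypothetical names made up for illustration, and no real std.regex call is used. Scope guards only run extra code while the scope is being left; they do not stop the exception, so something further up still has to catch it.

```d
import std.stdio;

// Hypothetical stand-in for the regex step: throws to simulate a
// library call that rejects the pattern "*".
void matchOrThrow(string pattern)
{
    if (pattern == "*")
        throw new Exception("nothing to repeat");
}

void run(string pattern)
{
    // Runs whenever this scope is left, normally or by exception.
    scope(exit) writeln("scope(exit) ran for: ", pattern);
    // Runs only while an exception unwinds; the exception keeps propagating.
    scope(failure) writeln("scope(failure) ran for: ", pattern);

    matchOrThrow(pattern);
}

// try/catch is what actually stops the exception from ending the program.
string safeRun(string pattern)
{
    try
    {
        run(pattern);
        return "ok";
    }
    catch (Exception e)
    {
        return "Bad regex: " ~ e.msg;
    }
}

void main()
{
    writeln(safeRun(".*"));
    writeln(safeRun("*"));
}
```

If the crash is an Error (such as a failed assertion) or an access violation rather than an Exception, catch (Exception) will not intercept it, and this sketch does not cover that case.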
>> Also, why does it match an unlimited number of times on "$" instead of
>> just once?
>
> Looks like another std.regex bug.
I thought it through and decided that it might not be std.regex's bug. I
mean, there's no way m could have an unlimited number of elements for
foreach to loop through, right? Actually, it probably is std.regex's
bug. Though none of this really matters, since nobody uses just "$" as
a regex: it would match an obvious point in any input. I bet Andrei
would still be irked by it if he knew, though.
>> My debug build is here: http://lex.clansfx.co.uk/projects/regextester.exe
>
> A note for the future: compiled executables aren't very useful when
> source is available, especially considering many people here don't use
> Windows.
Right.
> [1]: A dot in a regular expression may not match newlines, depending on
> the implementation and search options.
Thanks for the extra info. :)