Formal Review of std.regex (FReD)

Rainer Schuetze r.sagitario at gmx.de
Sun Oct 23 00:28:41 PDT 2011



On 22.10.2011 21:05, Dmitry Olshansky wrote:
> On 22.10.2011 20:56, Rainer Schuetze wrote:
>> I haven't followed the discussion closely, and I cannot really comment
>> on the core regex functionality, but I did actually use FReD as a
>> replacement of a buggy std.regex once.
>>
>> In that case I wanted to have a lazily created static regex, but I did
>> not find an official way to test whether a Regex has been initialized:
>>
>> static Regex!char re;
>> if(!isInitializedRE(re))
>> re = regex(r"^(.*)\(([0-9]+)\):(.*)$");
>>
>> So I implemented isInitializedRE() as "re.ir !is null" for std.regex and
>> "re.captures() > 0" for fred, but that fails for being a "drop-in
>> replacement".
>
> Coincidentally, you still can access re.ir property in this way.
> Wow, I wonder how far with backwards compatibility I can go :)
>
> In both cases this relies on undocumented features.
> Even now I can suggest a more portable and entirely generic way:
>
> if(re == Regex!(char).init)
> {
> //create re
> }
>
> Though that risks doing more work than needed.
>
>>
>> I think both versions rely on implementation specifics; maybe there
>> should be a documented way to test for being initialized.
>>
>
> Definitely. How about adding an empty property + opCast to bool, with
> that you'd get:
> if(!re)
> {
> //create re
> }
>
> and a bit more verbose:
> if(re.empty)
> {
> //create re
> }

I think this might be confused with normal usage, like "is this regex
the empty string?" (Is "" a valid regex?). Maybe a more explicit
"valid()" predicate would be fine.

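To make that concrete, here is a minimal sketch of the lazy-initialization
case with an explicit predicate. LazyRegex, valid() and get() are names I
made up for illustration; none of them exist in std.regex or FReD, and the
"initialized" flag is only a stand-in for whatever the library would check
internally.

```d
import std.regex;

// Hypothetical wrapper, not part of std.regex or FReD.
struct LazyRegex
{
    private Regex!char re;
    private bool initialized; // explicit flag instead of inspecting re.ir

    // True once a pattern has been compiled into this object.
    bool valid() const { return initialized; }

    // Compile the pattern on first use, then return the cached regex.
    Regex!char get(string pattern)
    {
        if (!valid())
        {
            re = regex(pattern);
            initialized = true;
        }
        return re;
    }
}

void main()
{
    static LazyRegex lazyRe;
    assert(!lazyRe.valid()); // nothing compiled yet

    auto m = match("foo.d(12): some message",
                   lazyRe.get(r"^(.*)\(([0-9]+)\):(.*)$"));
    assert(lazyRe.valid()); // compiled by the first get()
    assert(!m.empty);
    assert(m.captures[2] == "12");
}
```

With a name like valid(), the check reads as "has this been set up?" and
cannot be mistaken for "does this regex match the empty string?".
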
>
>> I also noticed that "auto match(R, RegEx)(R input, RegEx re);" appears
>> twice in the documentation, same for "bmatch". I guess they should not
>> appear together with the string versions.
>>
>
> I gather that happens because there is another overload specifically for
> C-T regexes. Its docs state just that, but without the template
> constraint the signatures are the same, so it indeed can cause some confusion.
> Maybe it would be better to just combine docs together, and leave one
> overload undocumented.
>

As RegEx is a template argument here, it can stand for both Regex and 
StaticRegex, and that should be mentioned. Whether it has two different 
implementations is an implementation detail that does not need to bother 
the user.

If you want to keep the second entries, I'd recommend renaming the 
argument to StaticRegEx.

