Formal Review of std.regex (FReD)

Rainer Schuetze r.sagitario at gmx.de
Mon Oct 24 22:28:30 PDT 2011



On 23.10.2011 17:46, Dmitry Olshansky wrote:
> On 23.10.2011 11:28, Rainer Schuetze wrote:
>>
>>
>> On 22.10.2011 21:05, Dmitry Olshansky wrote:
>>> On 22.10.2011 20:56, Rainer Schuetze wrote:
>>>> I haven't followed the discussion closely, and I cannot really comment
>>>> on the core regex functionality, but I did actually use FReD as a
>>>> replacement of a buggy std.regex once.
>>>>
>>>> In that case I wanted to have a lazily created static regex, but I did
>>>> not find an official way to test whether a Regex has been initialized:
>>>>
>>>> static Regex!char re;
>>>> if(!isInitializedRE(re))
>>>> re = regex(r"^(.*)\(([0-9]+)\):(.*)$");
>>>>
>>>> So I implemented isInitializedRE() as "re.ir !is null" for std.regex
>>>> and
>>>> "re.captures() > 0" for fred, but that fails for being a "drop-in
>>>> replacement".
>>>
>>> Coincidentally, you still can access re.ir property in this way.
>>> Wow, I wonder how far with backwards compatibility I can go :)
>>>
>>> In both cases this relies on undocumented features.
>>> Even now I can suggest a more portable and entirely generic way:
>>>
>>> if(re == Regex!(char).init)
>>> {
>>> //create re
>>> }
>>>
>>> Though that risks doing more work than needed.
>>>
>>>>
>>>> I think both versions rely on implementation specifics; maybe there
>>>> should be a documented way to test for being initialized.
>>>>
>>>
>>> Definitely. How about adding an empty property + opCast to bool, with
>>> that you'd get:
>>> if(!re)
>>> {
>>> //create re
>>> }
>>>
>>> and a bit more verbose:
>>> if(re.empty)
>>> {
>>> //create re
>>> }
>>
>> I think this might be confused with normal usage, like "is this regex
>> the empty string?" (Is "" a valid regex?). Maybe a more explicit
>> "valid()" predicate would be fine.
>
> "" is a valid regex that matches anywhere; with the global flag it will
> match before any codepoint, plus once at the end.
> I'm not sure 'valid' is a good name; it may mislead users into checking
> it all over the place, e.g.:
> auto r = regex("blah");
> if(r.valid())
> ....
>

You may be right. Maybe 'initialized'; otherwise 'empty' isn't too bad
either. But I think it should be explicit, so I would not add opCast
to bool.
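
Until such a documented predicate exists, the lazy initialization can be made independent of Regex internals by tracking the state separately. The following is only a minimal sketch under that assumption: the function name lineRegex and the flag are made up for illustration and are not part of std.regex or FReD.

```d
import std.regex;

// Hypothetical helper: returns a lazily created regex without
// inspecting any undocumented member of Regex.
Regex!char lineRegex()
{
    static Regex!char re;      // thread-local, default-initialized
    static bool initialized;   // explicit flag instead of re.ir / re.captures()

    if (!initialized)
    {
        // Same pattern as in the example quoted above.
        re = regex(r"^(.*)\(([0-9]+)\):(.*)$");
        initialized = true;
    }
    return re;
}
```

This costs one extra bool per regex, but it behaves the same with either implementation, which is the "drop-in" property the original check was missing.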

