Would there be interest in a SERIOUS compile-time regex parser?

rm roel.mathys at gmail.com
Mon Oct 16 09:07:59 PDT 2006


Don Clugston wrote:
> In the past, Eric and I both developed compile-time regex engines, but
> they were proof-of-concept rather than something you'd actually use in
> production code. I think this also applies to C++ metaprogramming regexp
> engines, too.
> 
> I've had a bit of play around with the regexp code in Phobos, and have
> convinced myself that it would be straightforward to create a
> compile-time wrapper for the existing engine.
> 
> Usage could be something like:
> --------
> void main()
> {
>     char [] s = "abcabcabab";
>          // case insensitive search
>     foreach(m; rexSearch!("ab+", "i")(s))
>     {
>         writefln("%s[%s]%s", m.pre, m.match(0), m.post);
>     }
> }
> --------
> 
> It would behave *exactly* like the existing std.regexp, except that
> compilation into the internal form would happen via template
> metaprogramming, so that
> (1) all errors would be caught at compile time, and
> (2) there'd be a minor speedup because the compilation step would not
> happen at runtime, and
> (3) otherwise it wouldn't be any faster than the existing regexp.
> However, there'd be no template code bloat, either.
> 
> Existing code would be unchanged. You could even write:
> 
> Regexp a = StaticRegExp!("ab?(ab*)+", "g");
> 
> (assign a pre-compiled regular expression to an existing phobos RegExp).
> 
> There's potentially a greater speedup possible, because the Regexp class
> could become a struct, with no need for any dynamic memory allocation;
> but if this was done, mixing runtime and compile-time regexps together
> wouldn't be as seamless. And of course there's loads of room for future
> enhancement.
> 
> BUT...
> 
> The question is -- would this be worthwhile? I'm really not interested
> in making another toy.
> It's straightforward, but tedious, and would double the length of
> std.regexp.
> Would the use of templates be such a turn-off that people wouldn't use it?
> Do the benefits exceed the cost?

I haven't got as far as looking into the current regexp module yet.
On the other hand, I've already done some of the homework:

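// Compile-time search: yields the index of the first occurrence of
// charToFind, or stringToSearch.length if it is not present.
template findChar(char[] stringToSearch, char charToFind)
{
  static if ( stringToSearch.length == 0
           || stringToSearch[0] == charToFind )
    // empty string or match at the front: stop counting here
    const int findChar = 0;
  else
    // skip one character and recurse on the rest of the string
    const int findChar
       = 1 + findChar!( stringToSearch[1..stringToSearch.length]
                      , charToFind);
}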

This gives the position of the first occurrence of charToFind in the
string; if the result equals the length of stringToSearch, charToFind
is not present.
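
To make the "result == length means not found" convention concrete, here is a
small sketch of how the template could be checked at compile time. It assumes
a current D compiler, so it declares the parameter as string rather than
char[] and uses enum rather than const int; those two changes are my
assumptions and not part of the code above.

```d
// Sketch only: same recursion as the template above, adapted so that
// string literals are accepted as template arguments.
template findChar(string stringToSearch, char charToFind)
{
    static if (stringToSearch.length == 0
            || stringToSearch[0] == charToFind)
        enum int findChar = 0;
    else
        enum int findChar
            = 1 + findChar!(stringToSearch[1 .. $], charToFind);
}

// All of these are evaluated by the compiler; nothing runs at runtime.
static assert(findChar!("hello", 'l') == 2); // first 'l' is at index 2
static assert(findChar!("hello", 'z') == 5); // 5 == "hello".length: absent
static assert(findChar!("", 'a') == 0);      // empty string: 0 == length

void main() {}
```

A caller would compare the result against the string's length before using it
as an index.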

I've got some others as well; for instance, I can parse a string literal
into an integer :-)
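
For illustration, one way such a compile-time string-to-integer conversion
might look is sketched below. This is not the template referred to above; the
name parseInt, the accumulator parameter acc, and the digits-only restriction
are all hypothetical, and the code assumes a current D compiler.

```d
// Hypothetical sketch: converts a string of decimal digits to an int
// at compile time by recursing over the characters.
template parseInt(string digits, int acc = 0)
{
    static if (digits.length == 0)
        // no characters left: the accumulator holds the result
        enum int parseInt = acc;
    else
    {
        // reject anything that is not a decimal digit at compile time
        static assert(digits[0] >= '0' && digits[0] <= '9',
                      "parseInt expects decimal digits only");
        // shift the accumulator one decimal place and add the next digit
        enum int parseInt
            = parseInt!(digits[1 .. $], acc * 10 + (digits[0] - '0'));
    }
}

static assert(parseInt!("1234") == 1234);
static assert(parseInt!("007") == 7);
static assert(parseInt!("") == 0); // empty input yields the initial accumulator

void main() {}
```

The accumulator keeps the recursion left-to-right, so no separate power-of-ten
helper is needed. Signs and overflow are deliberately left out of this sketch.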

I'm willing to lend a hand if you want.

roel



