compile-time regex redux

Andrei Alexandrescu (See Website For Email) SeeWebsiteForEmail at erdani.org
Tue Feb 6 23:33:09 PST 2007


Walter Bright wrote:
> String mixins, in order to be useful, need an ability to manipulate 
> strings at compile time. Currently, the core operations on strings that 
> can be done are:
> 
> 1) indexed access
> 2) slicing
> 3) comparison
> 4) getting the length
> 5) concatenation
> 
> Any other functionality can be built up from these using template 
> metaprogramming.
> 
> The problem is that parsing strings using templates generates a large 
> number of template instantiations, is (relatively) very slow, and 
> consumes a lot of memory (at compile time, not runtime). For example, 
> ParseInteger would need 4 template instantiations to parse 5678, and 
> each template instantiation would also include the rest of the input as 
> part of the template instantiation's mangled name.
> 
> At some point, this will prove a barrier to large scale use of this 
> feature.
> 
> Andrei suggested using compile time regular expressions to shoulder much 
> of the burden, reducing parsing of any particular token to one 
> instantiation.

Let's also note for future reference that storing the MD5 hash of the name 
instead of the full name is an additional possibility.
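
To put rough numbers on that, here is a small model of the situation. It is 
written in Python only because that is quick to run; it is not D compiler 
code, and the 'ParseInteger!("...")' naming scheme and both function names 
are invented for illustration (real mangled names look different). It lists 
the names a digit-at-a-time recursive parse would create, each carrying the 
rest of the input, and then the fixed-size MD5 digest that could be stored 
instead.

```python
import hashlib

def instantiation_names(text):
    """Model a recursive template parse: one instantiation per leading digit,
    each named after the input that is still left to parse.
    The 'ParseInteger!("...")' format is hypothetical, not D's real mangling."""
    names = []
    rest = text
    while rest and rest[0].isdigit():
        names.append('ParseInteger!("%s")' % rest)
        rest = rest[1:]
    return names

def hashed_name(name):
    """Stand-in for storing a fixed-size MD5 digest instead of the full name."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()

if __name__ == "__main__":
    for name in instantiation_names("5678"):
        # The full name grows with the input; the digest is always 32 hex digits.
        print(len(name), name, "->", hashed_name(name))
```

Note that hashing only caps the length of each stored name. The number of 
instantiations stays the same, which is the part a single regex match per 
token would address.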

> The last time I introduced core regular expressions into D, it was 
> soundly rejected by the community and was withdrawn, and for good reasons.
> 
> But I think we now have good reasons to revisit this, at least for 
> compile time use only. For example:
> 
>     ("aa|b" ~~ "ababb") would evaluate to "ab"
> 
> I expect one would generally only see this kind of thing inside 
> templates, not user code.

The more traditional way is to mention the string first and pattern 
second, so:

    ("ababb" ~~ "aa|b") // match this guy against this pattern

And I think it returns "b" - juxtaposition has a higher priority than 
"|", so your pattern is "either two a's or one b". :o)
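
The precedence is easy to check in any engine with the usual regex syntax. 
A minimal sketch using Python's re module follows; first_match is just a 
helper name made up for this example, and the proposed ~~ operator itself is 
not involved. The second call groups the alternation, which is what the 
pattern would need in order to yield "ab".

```python
import re

def first_match(pattern, text):
    """Return the leftmost match of pattern in text, or None if there is none."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

if __name__ == "__main__":
    # "aa|b" means "aa" or "b": the leftmost match in "ababb" is the first "b".
    print(first_match("aa|b", "ababb"))      # b
    # Grouping the alternation gives "a" followed by "a" or "b".
    print(first_match("a(?:a|b)", "ababb"))  # ab
```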

One program I highly recommend for playing with regexes is The Regex 
Coach: http://weitz.de/regex-coach/.


Andrei


