compile-time regex redux

Walter Bright newshound at digitalmars.com
Tue Feb 6 21:48:06 PST 2007


To be useful, string mixins need the ability to manipulate strings at 
compile time. Currently, the core string operations that can be done at 
compile time are:

1) indexed access
2) slicing
3) comparison
4) getting the length
5) concatenation

Any other functionality can be built up from these using template 
metaprogramming.
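
As a rough illustration of the five operations listed above, here is a 
minimal sketch in which the compiler checks each one. It assumes 
present-day D syntax (enum manifest constants and the string alias), 
which postdates this post, and the constant s is made up for the example.

```d
// Module-level static asserts are evaluated entirely by the compiler.
// Build with: dmd -main -run ops.d   (the file name is arbitrary)
enum s = "hello world";                      // compile-time string constant

static assert(s[0] == 'h');                  // 1) indexed access
static assert(s[6 .. 11] == "world");        // 2) slicing
static assert(s[0 .. 5] == "hello");         // 3) comparison
static assert(s.length == 11);               // 4) getting the length
static assert(s[0 .. 5] ~ "!" == "hello!");  // 5) concatenation
```

If any of these expressions were false, compilation would fail at the 
corresponding static assert.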

The problem is that parsing strings with templates generates a large 
number of template instantiations, is relatively slow, and consumes a 
lot of memory (at compile time, not at run time). For example, 
ParseInteger would need 4 template instantiations to parse 5678, and 
the mangled name of each instantiation would also include the rest of 
the input.
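
To make the cost concrete, here is one way such a template might be 
written. This is only a sketch: the post does not show ParseInteger, so 
the signature, the acc accumulator parameter, and the present-day D 
syntax are all assumptions.

```d
// Hypothetical ParseInteger: parses the leading decimal digits of s.
// Each recursive step is a separate template instantiation whose
// arguments (and therefore mangled name) contain the remaining input.
template ParseInteger(string s, long acc = 0)
{
    static if (s.length == 0)
        enum long ParseInteger = acc;             // input exhausted
    else static if (s[0] < '0' || s[0] > '9')
        enum long ParseInteger = acc;             // first non-digit: stop
    else
        // consume one digit, then recurse on the rest of the string
        enum long ParseInteger =
            ParseInteger!(s[1 .. $], acc * 10 + (s[0] - '0'));
}

static assert(ParseInteger!("5678") == 5678);
static assert(ParseInteger!("5678 + 1") == 5678);
static assert(ParseInteger!("abc") == 0);
```

Written this way, parsing "5678 + 1" instantiates the template on 
"5678 + 1", "678 + 1", "78 + 1", "8 + 1" and " + 1": one instantiation 
per digit plus a final one that stops the recursion. The exact count 
depends on how the template is written.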

At some point, this will prove a barrier to large-scale use of this feature.

Andrei suggested using compile time regular expressions to shoulder much 
of the burden, reducing parsing of any particular token to one 
instantiation.

The last time I introduced core regular expressions into D, they were 
soundly rejected by the community and withdrawn, and for good reasons.

But I think we now have good reasons to revisit this, at least for 
compile time use only. For example:

	("aa|b" ~~ "ababb") would evaluate to "ab"

I expect one would generally only see this kind of thing inside 
templates, not user code.


