compile-time regex redux

kris foo at bar.com
Wed Feb 7 14:51:32 PST 2007


Walter Bright wrote:
> String mixins, in order to be useful, need an ability to manipulate 
> strings at compile time. Currently, the core operations on strings that 
> can be done are:
> 
> 1) indexed access
> 2) slicing
> 3) comparison
> 4) getting the length
> 5) concatenation
> 
> Any other functionality can be built up from these using template 
> metaprogramming.
> 
> The problem is that parsing strings using templates generates a large 
> number of template instantiations, is (relatively) very slow, and 
> consumes a lot of memory (at compile time, not runtime). For example, 
> ParseInteger would need 4 template instantiations to parse 5678, and 
> each template instantiation would also include the rest of the input as 
> part of the template instantiation's mangled name.
> 
> At some point, this will prove a barrier to large scale use of this 
> feature.
> 
> Andrei suggested using compile time regular expressions to shoulder much 
> of the burden, reducing parsing of any particular token to one 
> instantiation.
> 
> The last time I introduced core regular expressions into D, it was 
> soundly rejected by the community and was withdrawn, and for good reasons.
> 
> But I think we now have good reasons to revisit this, at least for 
> compile time use only. For example:
> 
>     ("aa|b" ~~ "ababb") would evaluate to "ab"
> 
> I expect one would generally only see this kind of thing inside 
> templates, not user code.
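
To make the cost described above concrete, here is a minimal sketch of
a recursive integer-parsing template. It is hypothetical and is not
Walter's actual ParseInteger: the `acc` accumulator parameter is an
assumption made for illustration, and the code assumes a current D
compiler that accepts `string` template value parameters. It uses only
indexing, slicing, comparison and length, and it needs roughly one
instantiation per digit, each carrying the remaining input in its
template arguments.

```d
// Hypothetical sketch: parse the leading decimal digits of `s` at compile time.
// `acc` (an assumed helper parameter) carries the value parsed so far.
template ParseInteger(string s, int acc = 0)
{
    static if (s.length > 0 && s[0] >= '0' && s[0] <= '9')
        // Consume one digit (indexing), then recurse on the rest (slicing).
        // Every step is a separate template instantiation.
        enum int ParseInteger =
            ParseInteger!(s[1 .. $], acc * 10 + (s[0] - '0'));
    else
        // No more digits: the accumulated value is the result.
        enum int ParseInteger = acc;
}

// Checked entirely by the compiler; nothing here runs at runtime.
static assert(ParseInteger!("5678") == 5678);
```

Parsing stops at the first non-digit character, so an input such as
"42abc" would yield 42 under this sketch.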

Compile-time regex is only part of the picture, and a small part at
that. I rather expect we'd wind up finding that the manner in which it
was exposed was too limiting in one way or another. Exposing the full
API of RegExp inside the compiler, as was apparently suggested, sounds
a tad distasteful.

You'll perhaps forgive me if I question whether this is driven primarily
by academic interest?  What I mean is this: if and when D goes
mainstream, perhaps just one in ten thousand developers will actually
use this kind of feature more than 5 times (and still find themselves
limited). Perhaps I'm being generous with those numbers also?

What is wrong with runtime execution anyway? It sure is easier to write
and maintain clean D code than (for many people) complex concepts that
amount to nothing more than runtime optimizations. Isn't that true?

It would seem that adding such features does not address the kind of
things that would be useful to 80% of developers. Surely those should
be far more important?

And, no ... I'm not just pooh-poohing the idea ... I'm really serious
about D getting some realistic market traction, and I don't see how
adding more compile-time 'specialities' can help in any way other than
generating a little bit of 'novelty' interest. Isn't this a good example
of "premature optimization"?

Surely some of the other long-term concerns, such as solid debugging
support, simmering code/dataseg bloat, lib support for templates, etc.,
deserve full attention instead? Surely that is a more successful
approach to getting D adopted in the marketplace?

Lots of questions, and I hope you can give them serious consideration,
Walter.

- Kris


