Compile time regex matching
Philippe Sigaud via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Mon Jul 14 04:42:51 PDT 2014
> I am trying to write some code that uses and matches to regular expressions
> at compile time, but the compiler won't let me because matchFirst and
> matchAll make use of malloc().
>
> Is there an alternative that I can use that can be run at compile time?
You can try Pegged, a parser generator that works at compile-time
(both the generator and the generated parser).
https://github.com/PhilippeSigaud/Pegged
docs:
https://github.com/PhilippeSigaud/Pegged/wiki/Pegged-Tutorial
It's also on dub:
http://code.dlang.org/packages/pegged
It takes a grammar as input, not a single regular expression, but the
syntax is not too different.
    import pegged.grammar;

    mixin(grammar(`
    MyRegex:
        foo <- "abc"* "def"?
    `));

    void main()
    {
        enum result = MyRegex("abcabcdefFOOBAR"); // compile-time parsing

        // everything can be queried and tested at compile-time, if need be.
        static assert(result.matches == ["abc", "abc", "def"]);
        static assert(result.begin == 0);
        static assert(result.end == 9);

        pragma(msg, result.toString()); // parse tree
    }
It probably does not implement all the nifty features of a regex
engine, but it has all the usual powers of Parsing Expression Grammars.
It also gives you an entire parse result: matches, children,
subchildren, etc. As you can see, matches are accessible at the top
level.
One thing to keep in mind, which comes from the language and not from
this library: in the previous code, since 'result' is an enum, it'll be
'pasted' in place every time it's used in code, so all those static
asserts get an entire copy of the parse tree. That's a bit wasteful.
Using 'immutable' directly does not work here, but this is OK:
    enum res = MyRegex("abcabcdefFOOBAR"); // compile-time parsing
    immutable result = res; // to avoid copying the enum value everywhere
The static asserts then work (not the toString, though). Maybe
someone more knowledgeable than me on DMD internals could certify that
it indeed avoids re-allocating those parse results.
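
To illustrate the enum-versus-immutable point without depending on Pegged, here is a minimal sketch in plain D. The function 'squares' and the names 'fromEnum' and 'fromImmutable' are hypothetical stand-ins for the parser and its result; the sketch only shows the general language behaviour and says nothing certain about how DMD treats Pegged's parse trees.

```d
// Hypothetical stand-in for a CTFE-capable parser: any pure function
// the compiler can evaluate at compile time behaves the same way here.
int[] squares(int n) pure
{
    int[] r;
    foreach (i; 0 .. n)
        r ~= i * i;
    return r;
}

// Manifest constant: it has no storage of its own, so the value is
// substituted as a literal wherever the name is used.
enum fromEnum = squares(4);

// Evaluated once by CTFE and stored as a single piece of static data.
static immutable int[] fromImmutable = squares(4);

void main()
{
    // Both forms can be inspected at compile time.
    static assert(fromEnum == [0, 1, 4, 9]);
    static assert(fromImmutable == [0, 1, 4, 9]);

    // Every use of fromImmutable refers to the same stored array.
    const(int)* p = fromImmutable.ptr;
    assert(p is fromImmutable.ptr);
}
```

For array-valued enums like 'fromEnum', each runtime use builds a fresh array literal, which is the waste described above; the 'static immutable' form is the usual way to keep a single copy.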