Template regexes, version 2
Georg Wrede
georg.wrede at nospam.org
Tue Feb 21 05:15:50 PST 2006
Awesome!
Don Clugston wrote:
> I've had another go at doing compile-time regexps. With all the bugfixes
> in recent releases, it's now possible to do it with mixins, generating
> local functions instead of global ones. This provides _considerably_
> more flexibility.
>
> The ultimately generated functions can look like:
>
> char [] firstmatch(char [] s)
> {
> char [][] results;
> int someOtherResult;
>
> void func(char [] s) {
> // this parses the string, using the original regexp string.
> // intermediate results (eg expressions in parentheses) are
> // in the local variables: results[][], someOtherResult, etc.
> }
> func(s);
> return results[0];
> }
>
> Usage is like:
> char [] x = firstmatch!("ab+")(someStr);
Nice!!
> It seems to me that there are 3 types of regexes:
> * pure static -- where the regex string is a string literal, known at
> compile time
> * pure dynamic -- eg, as used in a grep utility.
> * pseudo-static. By this I mean regexps where the structure is constant,
> but some strings are replaced with variables.
Pure static is what I always wanted with regexps in D!
Pure dynamic is less useful to me, but others may disagree. And
that is taken care of by Walter now.
Pseudo-static is cool!! And I believe *immensely* useful.
If I write log analyzers, net statistics programs, serious data mining
frameworks, then pseudo-static regexps are what I do all day long! And,
at the other end, this is the core of even most 'dscript' tasks!
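
For illustration, the three categories can be sketched in present-day D.
This is only a sketch under stated assumptions: it uses the std.regex
module of the current standard library, not Don's mixin code and not the
DMD 0.147 built-ins, and the function names (firstStatic, firstDynamic,
countFor) are made up for the example.

```d
import std.regex;
import std.stdio;

// Pure static: the pattern is a string literal, so ctRegex can
// generate the matcher at compile time.
string firstStatic(string s)
{
    auto m = matchFirst(s, ctRegex!(`ab+`));
    return m.empty ? null : m.hit;
}

// Pure dynamic: the pattern is only known at run time (grep-style),
// so it has to be compiled at run time with regex().
string firstDynamic(string s, string pattern)
{
    auto m = matchFirst(s, regex(pattern));
    return m.empty ? null : m.hit;
}

// Pseudo-static: the structure "^<user> <digits>$" is fixed and only
// the user name varies. ASSUMPTION: user contains no regex
// metacharacters. With std.regex the whole pattern is still compiled
// at run time.
string countFor(string line, string user)
{
    auto m = matchFirst(line, regex("^" ~ user ~ ` (\d+)$`));
    return m.empty ? null : m[1];
}

void main()
{
    writeln(firstStatic("xx abbb yy"));
    writeln(firstDynamic("xx abbb yy", "b+"));
    writeln(countFor("georg 42", "georg"));
}
```

In this sketch the pseudo-static case gains nothing over the dynamic
one, since the pattern is rebuilt on every call. The template approach
discussed in this thread aims to handle the fixed structure at compile
time and leave only the variable strings to run time.
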
> As far as runtime efficiency goes, it's almost ideal,
Amazing!
> ...And then I return to the D newsgroups after a week and find the
> goalposts have moved: regexps are now built into the language.
Awwww, they're just runtime. Mere syntax charades to smoke-and-mirror
away the fact.
> The mixin regexps are only at an early stage of development, but given
> the current discussions I think it's important to know what can be done
> with templates (probably more than most people expect).
Most people couldn't. :-)
But then again, "a language can't be Serious if all its features are
graspable to VB programmers".
> In the case of
> what I've called "pseudo-static" regexps, they are arguably more
> powerful than the built-in regexps of DMD 0.147.
I DEMAND that pseudo-static regexps be in the very next release of DMD.
> I don't know where to go from here. There are a few possibilities:
> 1. use template regexps as a demonstration of the power of D templates.
> --> Implement reduced syntax, keep the code simple and not well
> optimised; more of a proof-of-concept; go for "maximum coolness".
From a "marketing point of view", that may be wise.
> 2. Like 1, but with optimisations; try to be the fastest regexp of any
> language. (cleaner-faster-better ... but we use the built-in regexps
> anyway <g>).
I would probably hardly use the "built-ins" (if that refers to the
current runtime library-cum-syntax stuff) at all.
> 3. Create a standard library. This is what I was aiming for,
> but I don't think it makes sense any more.
Of course it does!
> 4. potential integration into the language. Is this even possible?
Wanna guess my vote on this?!
> Probably the most sensible is to go with 1, to wake up the C++/boost
> crowd.
Why wake 'em up????
Unless... you expect them all to abandon C++ on sight and stampede over
here?
Heh, I wonder what 20 Dons would achieve together!