compile-time regex redux
Andrei Alexandrescu (See Website For Email)
SeeWebsiteForEmail at erdani.org
Wed Feb 7 16:20:02 PST 2007
Walter Bright wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Walter Bright wrote:
>>> But I think we now have good reasons to revisit this, at least for
>>> compile time use only. For example:
>>>
>>> ("aa|b" ~~ "ababb") would evaluate to "ab"
>>>
>>> I expect one would generally only see this kind of thing inside
>>> templates, not user code.
>>
>> The more traditional way is to mention the string first and pattern
>> second, so:
>>
>> ("ababb" ~~ "aa|b") // match this guy against this pattern
>>
>> And I think it returns "b" - juxtaposition has a higher priority than
>> "|", so your pattern is "either two a's or one b". :o)
>
> My bad. Some more things to think about:
>
> 1) Returning the left match, the match, the right match?
Perl does allow that (IIRC it has $` and $' for the substrings to the
left and right of the match), but the recommended style is to use
capturing parens if you need the left and right portions, because merely
using $` or $' anywhere in a program slows down all of its matching code.
So if you want to capture the left and right substrings you say:
("ababb" ~~ "(.*)(aa|b)(.*)")
and you get in return three juicy strings: left, match, and right.
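
As a rough illustration of what those three strings would be, here is a
minimal sketch using Python's re module as a stand-in for the proposed
compile-time operator (an assumption; nothing like this exists in D
yet). It also shows that a greedy left group picks the last possible
match, while a lazy one picks the first:

```python
import re

subject = "ababb"

# Greedy left group: (.*) takes as much as it can, so the LAST "b" is the match.
greedy = re.fullmatch(r"(.*)(aa|b)(.*)", subject).groups()

# Lazy left group: (.*?) takes as little as it can, so the FIRST "b" is the match.
lazy = re.fullmatch(r"(.*?)(aa|b)(.*)", subject).groups()

print(greedy)  # ('abab', 'b', '')
print(lazy)    # ('a', 'b', 'abb')
```

So if "left" is meant to be everything before the leftmost match, the
first group probably wants to be non-greedy.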
> 2) Returning values of parenthesized expressions?
Probably it's easiest to always return const char[][]. If you don't have
capturing parens, you could return const char[].
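
A minimal sketch of that return convention, again in Python rather than
D. The helper name match_op is hypothetical and only stands in for the
proposed ~~ operator; returning None on failure is also my assumption,
since the thread doesn't say what a failed match yields:

```python
import re

def match_op(subject, pattern):
    """Hypothetical stand-in for (subject ~~ pattern)."""
    m = re.search(pattern, subject)
    if m is None:
        return None                # assumption: no match yields nothing
    if m.re.groups > 0:
        return list(m.groups())    # capturing parens: array of strings
    return m.group(0)              # no capturing parens: the match itself

print(match_op("ababb", "aa|b"))              # b
print(match_op("ababb", "(.*)(aa|b)(.*)"))    # ['abab', 'b', '']
```

The first call also confirms the precedence point above: "aa|b" means
"either two a's or one b", so the result is "b".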
> 3) Some sort of sed-like replacement syntax?
Definitely; otherwise it's a pain to express it, particularly because
you can't mutate things during compilation.
("ababb" ~~ s/"(.*)(aa|b)(.*)"/"$1 here was an aa|b $3"/i)
(This doesn't make 's' a keyword; it's just used as punctuation.)
Probably a more D-like syntax could be devised, but that could also be
seen as gratuitous incompatibility with sed, perl etc.
The last "/" is useful because flags could follow it, as is the case
here (i = ignore case).
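
For a feel of the semantics, here is a minimal sketch of such a
replacement with a trailing flag, in Python. The helper sed_replace and
its flag string are hypothetical, and Python spells the references \1
and \3 rather than $1 and $3. Note that with three capturing groups the
right-hand portion is group 3; group 2 is the aa|b match itself.

```python
import re

def sed_replace(subject, pattern, replacement, flags=""):
    """Hypothetical stand-in for (subject ~~ s/pattern/replacement/flags)."""
    re_flags = re.IGNORECASE if "i" in flags else 0
    # count=1 mimics sed's default of replacing only the first match.
    return re.sub(pattern, replacement, subject, count=1, flags=re_flags)

# With "i", the lowercase pattern matches the uppercase subject.
with_i = sed_replace("ABABB", r"(.*?)(aa|b)(.*)", r"\1 here was an aa|b \3", "i")

# Without "i", nothing matches and the subject comes back unchanged.
without_i = sed_replace("ABABB", r"(.*?)(aa|b)(.*)", r"\1 here was an aa|b \3")

print(with_i)     # A here was an aa|b ABB
print(without_i)  # ABABB
```

Since the result is a fresh string, nothing is mutated, which is the
property that matters during compilation.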
> An alternative is to have the compiler recognize std.Regexp names as
> being built-in.
Blech. :o)
Andrei