compile-time regex redux

Wed Feb 7 18:21:38 PST 2007

Andrei Alexandrescu (See Website For Email) wrote:
> Bill Baxter wrote:
> [snip]
>> My opinion about regexps is that they're too dense and full of 
>> abbreviations.  And the typical methods for creating them don't 
>> encourage encapsulation and abstraction, which are the foundations of 
>> software.  For instance, every time you look at the above you have to 
>> re-interpret what [A-Z0-9._%-] really means.  When I'm writing regular 
>> expressions I always have to have that chart next to me to remember 
>> all those \s \b \w \S \W \ codes, and then again when trying to figure 
>> out what the code does later.  There has to be a better way.  
>> Apparently the Perl guys think so too, because they're redoing regular 
>> expressions completely for Perl 6.
> 
> (Well not completely.) That's why we should keep a close eye on those. 
> The Perl community is much more experienced with regex usage than me and 
> possibly yourself. I just want us to not delude ourselves with the idea 
> that we could just sit down and write a better regex syntax just because 
> we don't remember what \s and \b mean. (I happen to remember. :o))

Yes, and I don't want us to go and make Perl5-ish regular expressions 
part of the core D language spec without understanding how and why that 
very expert Perl community is changing their regular expressions in the 
next round.  I haven't followed developments with Perl 6 closely, 
though.  Just glanced at the link someone posted the other day.

I also don't want us to go make regexp part of the language spec without 
thoroughly ruling out the potentially much cooler ability to write that 
regexp parser using more fundamental but yet-to-be-invented building blocks.

--bb