$`, $', $&, $n - sugar or cyclamates? And other topics

James Dunne james.jdunne at gmail.com
Mon Feb 20 12:05:39 PST 2006


Georg Wrede wrote:
> Walter Bright wrote:
> 
>> "Georg Wrede" <georg.wrede at nospam.org> wrote in message 
>> news:43F53BE5.8020900 at nospam.org...
>>
>>> Using regexps in C needs a total change of paradigm. Regexps are
>>> kind of "top down" things, whereas traditionally "peeking into
>>> strings" is bottom-up programming.
>>>
>>> You'd also have to learn regexps. The trivial things are trivial in
>>>  C-style too, and the non-trivial stuff gets avoided because of the
>>>  up-front investment. Folks rather do nested ifs and stuff.
>>>
>>> Conversely, many interpreted languages make it inefficient to do
>>> "peek" kind of programming, as compared to using regexps.
>>
>>
>> There are a lot of cool things you can do in script languages because
>> they are interpreted, and one doesn't care about efficiency. Those
>> things are simply incompatible with D. But I don't see any inherent
>> advantages script languages should have in implementing regex.
> 
> 
> Neither do I.
> 
> But the question was, how come regexps aren't _used_ as much as we'd 
> expect.

My answer is that regular expressions simply aren't powerful enough for 
the kinds of string processing that I need to do regularly (no pun 
intended).  Regular expressions represent regular languages.  Not all 
languages are regular, of course.
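
As a concrete illustration of that limit, consider balanced parentheses, the textbook non-regular language. The Python sketch below is only an example of mine (the names is_balanced, TWO_LEVELS and regex_two_levels are made up for it): a plain counter handles any nesting depth, while a fixed pattern can only hard-code a fixed depth. Some engines add recursion extensions that go beyond regular languages, but that is no longer a regular expression in the formal sense.

```python
import re


def is_balanced(text: str) -> bool:
    """Return True if every '(' in text has a matching ')'."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no '(' left to match
                return False
    return depth == 0


# A fixed pattern can only spell out a fixed nesting depth.
# This one accepts at most two levels, e.g. "()" or "(()())".
TWO_LEVELS = re.compile(r"(?:\((?:\(\))*\))*")


def regex_two_levels(text: str) -> bool:
    """Return True if the whole text matches the two-level pattern."""
    return TWO_LEVELS.fullmatch(text) is not None
```

Three levels of nesting, "((()))", pass the counter but fail the pattern, and extending the pattern by hand only moves the limit one level further.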

<rant>
My other beef with regular expressions is that there are so many 
competing standards for them, and on top of that some are not even 
standardized (e.g. MS Visual Studio .NET 2003).

You never know whether an implementation uses longest-match or 
shortest-match; you never know how newlines are handled; you never know 
if Unicode is supported; you never know the run-time performance of your 
regex; you never know the syntax for selecting match indices (0-based 
or 1-based? Use '\1'? Record a match with {}, with \(\), or with ()?), etc.
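
To show how much of this is per-implementation convention, here is a small sketch of how one particular engine, Python's standard re module, answers a few of those questions. Other engines answer them differently, which is the whole complaint; the sample strings and variable names are just illustrations.

```python
import re

# Greedy vs. non-greedy quantifiers: same engine, different match length.
greedy = re.search(r"<.+>", "<a><b>").group(0)
lazy = re.search(r"<.+?>", "<a><b>").group(0)

# Alternation: Python's re takes the first alternative that succeeds,
# whereas POSIX matching would pick the longest one ("ab").
first_alt = re.match(r"a|ab", "ab").group(0)

# Newlines: '.' does not match '\n' unless re.DOTALL is given.
no_dotall = re.search(r"a.b", "a\nb")
with_dotall = re.search(r"a.b", "a\nb", re.DOTALL)

# Group numbering: group(0) is the whole match, captures start at 1,
# and groups are recorded with plain (), not \(\) or {}.
m = re.match(r"(\w+)@(\w+)", "james@gmail")

print(greedy)       # <a><b>
print(lazy)         # <a>
print(first_alt)    # a
print(no_dotall)    # None
print(m.group(1))   # james
```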

Regular expressions, in all the forms they exist in, simply have too 
many variables to be relied upon.  Finally, they're just plain ugly and 
nearly impossible to debug.
</rant>

Following that rant, I can put a positive spin here and say that the 
Ragel state machine compiler is an excellent model to work from!  One can 
insert custom code between state transitions for debugging and even for 
complex logic!  Why can't we have compiler support for this type of 
power? :)
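
To give a rough idea of what "custom code between state transitions" buys you, here is a hand-written Python sketch of a two-state machine. It is not Ragel syntax and not Ragel output; the machine, the function scan_numbers and its trace hook are all invented for illustration. It recognises unsigned integers separated by single commas and runs an action on each transition.

```python
# States of a tiny machine that recognises unsigned integers separated
# by single commas, e.g. "12,7,300".
START, IN_NUMBER = "start", "in_number"
DIGITS = "0123456789"


def scan_numbers(text, trace=None):
    """Run the machine over text and return the list of parsed integers.

    trace, if given, is called as trace(state, char, next_state) on every
    transition; this is the hook for debugging code or extra logic.
    """
    numbers = []
    current = 0
    state = START
    for ch in text:
        if ch in DIGITS:
            next_state = IN_NUMBER
            current = current * 10 + int(ch)  # action: accumulate a digit
        elif ch == "," and state == IN_NUMBER:
            next_state = START
            numbers.append(current)           # action: emit the number
            current = 0
        else:
            raise ValueError(f"unexpected {ch!r} in state {state}")
        if trace is not None:
            trace(state, ch, next_state)      # user code on the transition
        state = next_state
    if state != IN_NUMBER:
        raise ValueError("input must end with a digit")
    numbers.append(current)
    return numbers
```

Passing trace=print logs every transition as it happens, which is exactly the kind of visibility an opaque regex engine does not give you.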

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O 
M--@ V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e 
h>--->++ r+++ y+++
------END GEEK CODE BLOCK------

James Dunne


