Questions about builtin RegExp
Chris Sauls
ibisbasenji at gmail.com
Sun Feb 19 13:42:35 PST 2006
Andrew Fedoniouk wrote:
> "Walter Bright" <newshound at digitalmars.com> wrote in message
> news:dt9ho8$20e4$3 at digitaldaemon.com...
>
>
>>>>Writing a real lexer takes a lot of effort. That's why people invented
>>>>regex, it'll handle most jobs without having to write a lexer. C's
>>>>strtok() is embarrassingly inadequate.
>>>
>>>Why?
>>
>>I'd like to see strtok() parse an email address out of a body of text.
>>
>
>
> I don't really understand "parse an email address out of a body of text."
>
> Do you mean something like this:
>
> char* pw = text;
> url u;
>
> forever
> {
> pw = strtok( pw, " \t\n\r" ); if( !pw ) return;
> if( !u.parse(pw) ) continue;
> if( u.protocol() == url::MAILTO )
> //found - do something here
> ;
> };
>
> ?
>
> Andrew.
>
>
I think he meant something more like (using MatchExpr, sorry):
# char[] text = ...;
# char[] addr, user, host, tld;
# if (`([_a-z0-9]+)@([_a-z0-9]+)\.([_a-z0-9]+)` ~~ text) {
#   addr = _match[0];
#   user = _match[1];
#   host = _match[2];
#   tld  = _match[3];
#
#   // do something
# }
Granted, I just tossed that together in five seconds flat, so it's probably not quite
right. I've only recently started leaning toward the RegExp camp myself. It's made parsing
Lyra scripts a dream.
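
For anyone who wants to try the same idea without MatchExpr, here is a minimal sketch. It assumes Phobos's std.regex module (matchFirst and regex) in place of the `~~` operator and `_match`. The Email struct and the findEmail function are names made up for illustration, and the pattern is still a toy, not a real address grammar.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper type; the field names mirror the variables above.
struct Email
{
    string addr, user, host, tld;
}

// Returns true and fills `result` if something shaped like user@host.tld
// appears anywhere in `text`. Assumption: a toy pattern is good enough here.
bool findEmail(string text, out Email result)
{
    // "i" makes the match case-insensitive; `\.` matches a literal dot.
    auto re = regex(`([_a-z0-9]+)@([_a-z0-9]+)\.([_a-z0-9]+)`, "i");
    auto m = matchFirst(text, re);
    if (m.empty)
        return false;

    // m[0] is the whole match, m[1]..m[3] are the capture groups.
    result = Email(m[0], m[1], m[2], m[3]);
    return true;
}

void main()
{
    Email e;
    if (findEmail("Write to someone@example.com for details.", e))
        writeln(e.user, " at ", e.host, " dot ", e.tld);
}
```

The address does not have to be whitespace-delimited or sit at the start of a token, which is the part that is awkward to get out of strtok() alone.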
One thing I miss from scripting languages when doing the above is PHP's lovely list()
construct. Pretending we had this in D:
# char[] text = ...;
# char[] addr, user, host, tld;
# if (`([_a-z0-9]+)@([_a-z0-9]+)\.([_a-z0-9]+)` ~~ text) {
#   list(addr,user,host,tld) = _match;
#   // do something
# }
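
Something close to list() can probably be faked with a variadic template. The following is only a sketch under assumptions: the `list` function below is hypothetical (it is not part of D or Phobos), it takes the source as its first argument instead of using assignment syntax, and it relies on std.regex's matchFirst standing in for `_match`.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical stand-in for PHP's list(): copies src[0], src[1], ... into
// the variables passed by reference. Anything indexable works as `src`.
void list(Src, Targets...)(Src src, ref Targets targets)
{
    // This foreach is unrolled at compile time, one step per target variable.
    foreach (i, _; targets)
        targets[i] = src[i];
}

void main()
{
    string text = "Write to someone@example.com for details.";
    string addr, user, host, tld;

    auto m = matchFirst(text, regex(`([_a-z0-9]+)@([_a-z0-9]+)\.([_a-z0-9]+)`, "i"));
    if (!m.empty)
    {
        // Source first, then the variables to fill, in capture order.
        list(m, addr, user, host, tld);
        writeln(addr, ": ", user, " / ", host, " / ", tld);
    }
}
```

It does no bounds checking beyond what indexing the source already does, so passing more variables than there are captures is the caller's problem.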
-- Chris Nicholson-Sauls