hyperlink regular expression pattern

John C johnch_atms at hotmail.com
Fri May 9 01:52:32 PDT 2008


I want to split an HTML anchor tag into its constituent parts. I have a regular expression pattern that works with .NET's Regex class, but not with std.regexp, which errors out with "*+? not allowed in atom". I take this to mean that the pattern uses syntax std.regexp doesn't support.

Here's my code:

if (auto m = std.regexp.search(
  "<a href=\"www.google.com\">Google</a>", 
  r"<a.*?href=[""'](?<url>.*?)[""'].*?>(?<name>.*?)</a>")) {
  string url = m.match(1);
  string name = m.match(2);
}

The problematic parts seem to be the named groups, "(?<url>...)" and "(?<name>...)", but not being a whiz with regular expressions, I don't know what to use instead.
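
For illustration, here is a minimal, untested sketch of the same search with plain numbered groups in place of the named ones. It assumes std.regexp accepts ordinary capturing groups and the lazy ".*?" quantifier, and that match(1) and match(2) return the first and second group. The pattern is written as a backquoted string because an r"..." literal in D ends at the first double quote, so the .NET-style "" does not produce a quote character inside the pattern.

```d
import std.regexp;
import std.stdio;

void main()
{
    // Backquoted (WYSIWYG) literal: the double quotes inside the character
    // classes are kept as written, with no escaping needed.
    string pattern = `<a.*?href=["'](.*?)["'].*?>(.*?)</a>`;

    if (auto m = std.regexp.search(`<a href="www.google.com">Google</a>`, pattern))
    {
        string url = m.match(1);   // first group: the href value (assumed numbering)
        string name = m.match(2);  // second group: the link text (assumed numbering)
        writefln("%s -> %s", url, name);
    }
}
```

The groups are then referred to by position rather than by name, so the order of the parentheses in the pattern matters.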

Perhaps someone's got a better pattern they could post?

John.

