String Manipulation

okibi okibi at ratedo.com
Tue Jul 10 07:26:15 PDT 2007


That's exactly what I wanted! I tried using regexes, but I've never really understood how they work. It's one of those things I need someone to spell out for me, lol.

Thanks!

Frits van Bommel Wrote:

> [fixed upside-down reply]
> okibi wrote:
> > okibi Wrote:
> > 
> >> I have a question for you all.
> >>
> >> If I have the following string or char[], how would I get the xsl filename out of it?
> >>
> >> char[] myStr = "...<?xml-stylesheet type=\"text/xsl\" href=\"example.xsl\"?>...";
> >>
> >> Is there a way to get it to return just example.xsl?
> >>
> > Well, I ended up just doing some splits to get what the location of the xsl file will be. Still, I'd like someone to tell me if there is an easier way.
> 
> Did you try regexes (regular expressions)? (see 
> http://www.digitalmars.com/d/1.0/phobos/std_regexp.html)
> 
> (I see Gilles G. has already suggested regexes since I started this 
> post, but I'll post it anyway since I think my suggested regex is better 
> :) )
> 
> For example:
> ---
> import std.regexp;
> import std.stdio;
> 
> void main() {
>      char[] myStr = "...<?xml-stylesheet type=\"text/xsl\" href=\"example.xsl\"?>...";
> 
>      if (auto m = search(myStr, `<\?.*href="(.*)".*\?>`)) {
>          writefln("Match: '%s'", m.match(1));
>      } else {
>          writefln("No match found.");
>      }
> }
> ---
> This picks out the text between (double) quotes after 'href=' in a 
> '<?'-'?>' block. You'll need to be a bit more tricky if you want to 
> handle single quotes as well (or is that an HTML-only thing?). Perhaps 
> `<\?.*href=(["'])(.*)\1.*\?>`: The part between the first parentheses 
> captures the opening quote after 'href=', the \1 says to match the same 
> quote there. The .xsl file name is then m.match(2), since m.match(1) is 
> now the opening quote.
> 
> By the way: note the use of a backquoted string (``s) to avoid escaping 
> the '\'s and '"'s, otherwise the original regexp would be 
> "<\\?.*href=\"(.*)\".*\\?>" which is equivalent but uglier (IMHO). You 
> could also use this for your XML string to avoid '\"' all over the place.
> 
> Also notice the quoting of '?' as '\?' since '?' is a special character 
> in regexes.
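
One caveat on the pattern quoted above: `(.*)` is greedy, so if another quoted attribute follows href inside the same '<?'-'?>' block, the capture would likely run on to the last closing quote instead of stopping after the file name. A minimal sketch of one way around that, using a negated character class (`[^"]*`) in place of `.*`. Note the assumptions: this is written against the newer std.regex module (matchFirst, regex) from D2, not the std.regexp module used in the post, and the input string with href placed first is a made-up example, not the original poster's data.

```d
import std.regex;
import std.stdio;

void main() {
    // Hypothetical input: href comes first, so another quoted attribute follows it.
    string myStr = `...<?xml-stylesheet href="example.xsl" type="text/xsl"?>...`;

    // Greedy capture, same pattern as in the quoted post.
    auto greedy = matchFirst(myStr, regex(`<\?.*href="(.*)".*\?>`));

    // Negated character class: the capture cannot cross a double quote,
    // so it ends at the first closing quote after href=.
    auto strict = matchFirst(myStr, regex(`<\?.*href="([^"]*)".*\?>`));

    if (!greedy.empty) writefln("Greedy: '%s'", greedy[1]);
    if (!strict.empty) writefln("Strict: '%s'", strict[1]);
}
```

For the string in the original question, where href is the last attribute, both patterns should pick out the same file name; the difference only shows up when something quoted comes after it.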


