regex issue

Dmitry Olshansky dmitry.olsh at gmail.com
Mon Mar 19 06:50:13 PDT 2012


On 19.03.2012 17:27, Jay Norwood wrote:
> On Monday, 19 March 2012 at 08:05:18 UTC, Dmitry Olshansky wrote:
>> As I said in the main D group, that's wrong: regex doesn't just count
>> matches. It finds the slices that match.
>> So, to be more efficient, it returns a lazy range that performs
>> searches on request. "g" means global :)
>> Then code like this is clean and fast:
>> foreach(m; match(input, ctr))
>> {
>>     if(m.hit == "magic we are looking for")
>>         break; // <<< ---- no greedy find it all syndrome
>> }
>>
>
> ok, global. So the document implies that I should be able to get a
> single match object with a count of the submatches. So I think maybe
> I've jumped to the wrong conclusion about how to use it, thinking I
> could just use "\n" and the "g" flag to get all the matches for the range
> of "\n". So it looks like instead that the term "submatches" needs more
> explanation. What exactly constitutes a submatch? I inferred it just
> meant any single match among many.

Maybe replacing "submatch" with "capture" would help. But I thought it
was easy to see that any parenthesized subexpression in a regex, e.g.
"(\w+)", is captured into a submatch. Are you aware that the text matched
by each sub-expression is also extracted?
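
To make the distinction concrete, here is a minimal sketch that contrasts the two ideas: counting separate matches of "\n" by walking the lazy range, versus reading the captures of a single match. The helper name countMatches and the sample strings are made up for illustration, and it assumes a Phobos version where std.range.walkLength accepts the range returned by match.

```d
import std.range : walkLength;
import std.regex : match, regex;

/// Hypothetical helper: counts every non-overlapping match of `pattern`
/// in `text` by walking the lazy range produced by a global ("g") search.
size_t countMatches(string text, string pattern)
{
    return match(text, regex(pattern, "g")).walkLength;
}

void main()
{
    // Three line breaks, so the global search yields three separate matches.
    assert(countMatches("a\nb\nc\n", "\n") == 3);

    // A single match with two parenthesized groups: captures holds the
    // full match at index 0, followed by one entry per group.
    auto m = match("name=Jay", regex(`(\w+)=(\w+)`));
    assert(!m.empty); // check for a match before looking at captures
    auto c = m.captures;
    assert(c.length == 3);
    assert(c[0] == "name=Jay");
    assert(c[1] == "name");
    assert(c[2] == "Jay");
}
```

So the number of captures depends on the number of groups in the pattern, not on how many times the pattern matches in the input.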

>
> //create static regex at compile-time, contains fast native code
> enum ctr = ctRegex!(`^.*/([^/]+)/?$`);
>
> //works just like normal regex:
> auto m2 = match("foo/bar", ctr); //first match found here if any
> assert(m2); // be sure to check if there is a match before examining contents!
> assert(m2.captures[1] == "bar"); // captures is a range of submatches, 0 - full match

BTW, the example above should make it clear what captures are:
captures[0] is the full match and captures[1] is the text matched by the
first parenthesized group.

>
>
> btw, I couldn't get this \p option to work for the uni properties. Can
> you provide some example of that which works?
>
> \p{PropertyName} Matches character that belongs to Unicode PropertyName
> set. Single letter abbreviations could be used without surrounding {,}.
>

Ouch, I see that the docs are no good :)
But they are reference-style anyway; for a longer and more thorough
overview you might want to take a look at:
http://blackwhale.github.com/regular-expression.html
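
As a rough sketch of the \p syntax quoted above, the following uses Unicode general category names (L for letters, N for numbers). The helper firstHit and the sample text are hypothetical, and the behaviour shown assumes a reasonably recent std.regex, so treat it as an illustration rather than a reference.

```d
import std.regex : match, regex;

/// Hypothetical helper: returns the first slice of `text` that matches
/// `pattern`, or null when there is no match.
string firstHit(string text, string pattern)
{
    auto m = match(text, regex(pattern));
    return m.empty ? null : m.hit;
}

void main()
{
    auto text = "Привет, World 42";

    // \p{L} matches any Unicode letter, not only ASCII ones.
    assert(firstHit(text, `\p{L}+`) == "Привет");

    // Single-letter property names may drop the braces.
    assert(firstHit(text, `\pL+`) == "Привет");

    // \pN matches characters in the Number category.
    assert(firstHit(text, `\pN+`) == "42");
}
```

Note the backquoted (WYSIWYG) string literals: inside a regular double-quoted D string the backslash would have to be doubled.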


-- 
Dmitry Olshansky

