regex issue

Dmitry Olshansky dmitry.olsh at gmail.com
Mon Mar 19 01:05:16 PDT 2012


On 19.03.2012 6:50, Jay Norwood wrote:
> On Friday, 16 March 2012 at 03:36:12 UTC, Joshua Niehus wrote:
>> Hello,
>>
>> Does anyone know why I would get different results between
>> ctRegex and regex in the following snippet?
>>
>> Thanks,
>> Josh
>>
>>
>
> I'm also having questions about the matchers. From what I understand in
> the docs, if I use this greedy matcher to count lines, it should have
> counted all the lines in the first match (when I had it outside the
> foreach).

As I said in the main D group, that expectation is wrong: regex doesn't
just count matches, it finds the slices that match.
To keep that efficient, match returns a lazy range that runs each search
on request. The "g" flag means global (not greedy) :)
So code like this is cheap and fast:
foreach(m; match(input, ctr))
{
	if(m.hit == "magic we are looking for")
		break; // <<< ---- no greedy find it all syndrome
}

> In that case, I should have been able to do something like:
>
> matches=match(input,ctr);
> l_cnt = matches.length();
>
> But I only get length=1, and so I'm a bit concerned that greedy is not
> really working. In fact, it is about 3x faster to just run the second
> piece of code, so I think something must be wrong...
>
>
> void wcp_ctRegex(string fn)
> {
> string input = cast(string)std.file.read(fn);
> enum ctr = ctRegex!("\n","g");
> ulong l_cnt;
> foreach(m; match(input,ctr))
> {
> l_cnt ++;
> }
> }
>
>
> void wcp_char(string fn)
> {
> string input = cast(string)std.file.read(fn);
> ulong l_cnt;
> foreach(c; input)
> {
> if (c == '\n')
> l_cnt ++;
> }
> }
>
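
If a count is all that's needed, the lazy range still has to be walked
to the end one way or another. A minimal sketch, assuming std.range's
walkLength is acceptable for that; the function names countLinesRegex
and countLinesPlain are made up for illustration and are not part of
the code quoted above:

```d
import std.algorithm : count;
import std.range : walkLength;
import std.regex : match, regex;

// Hypothetical helper: counts newlines by consuming the lazy range of
// matches. Each match is searched for on demand as walkLength advances.
size_t countLinesRegex(string input)
{
    auto re = regex("\n", "g"); // "g" = global: iterate over all matches
    return match(input, re).walkLength;
}

// Hypothetical helper: the same count without regex, for comparison.
size_t countLinesPlain(string input)
{
    return input.count('\n');
}
```

Both should return the same number for a given input. The plain version
skips the matcher machinery, so it would not be surprising for it to
stay the faster of the two when searching for a single character.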


-- 
Dmitry Olshansky


More information about the Digitalmars-d-learn mailing list