How is std.regex.replaceAllInto more efficient?

Dmitry Olshansky via Digitalmars-d digitalmars-d at puremagic.com
Sun Oct 18 09:01:16 PDT 2015


On 18-Oct-2015 17:26, Shriramana Sharma wrote:
> Dmitry Olshansky wrote:
>
>> Listing code that is not routinely tested on each build means it may
>> break someday. Anyway, just issue a pull request; we can figure out
>> the details in the GitHub discussion.
>
> Hmmm... AFAICS the *Into function is most useful when you don't know how
> many items of input you are going to need to process and want to avoid an
> allocation at each item. The command-line byLine()/writeln() number
> delimiter makes a very sensible example there, so I would recommend you put
> that in *somewhere*, though it cannot be a unittest.

Yes, e.g. add it to the std.regex synopsis to showcase this ability.
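
A minimal sketch of what such a synopsis example might look like, assuming the input arrives on stdin one item per line. The helper name delimit is hypothetical (it is not part of Phobos) and exists only so the reuse of a single sink is easy to see; the pattern is the one from the quoted unittest.

```d
import std.array : appender, Appender;
import std.regex : regex, replaceAllInto, Regex;
import std.stdio : stdin, writeln;

// Hypothetical helper (not part of Phobos): clears the sink, then writes
// `line` into it with a comma between each group of three digits.
const(char)[] delimit(ref Appender!(char[]) sink, const(char)[] line,
                      Regex!char re)
{
    sink.clear();                        // drop old contents, keep the buffer
    replaceAllInto(sink, line, re, ","); // append the replaced text to sink
    return sink.data;
}

void main()
{
    // "g" makes the replacement apply to every match, not just the first.
    auto re = regex(`(?<=\d)(?=(\d\d\d)+\b)`, "g");
    // One buffer for the whole run instead of a fresh string per line.
    auto sink = appender!(char[])();

    foreach (line; stdin.byLine())
        writeln(delimit(sink, line, re));
}
```

The point of the design is that the appender's storage is reused across iterations, so a long input stream should not trigger a new allocation for every line once the buffer has grown large enough.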

>
> For a unittest, how about:

Looks good.

>
> static auto re = ctRegex!(`(?<=\d)(?=(\d\d\d)+\b)`, "g");

Just static will do here; auto is redundant.

> auto sink = appender!(char [])();
> enum ulong min = 10UL ^^ 10, max = 10UL ^^ 19;
> foreach (i; 0 .. 50)
> {
>      sink.clear();
>      replaceAllInto(sink, text(uniform(min, max)), re, ",");
>      // NOTE: size_t is an unsigned type, so it wraps around when going
>      // below zero
>      for (size_t pos = sink.data.length - 4; pos < sink.data.length; pos -= 4)
>          assert(sink.data[pos] == ',');
> }

I guess the idiomatic way is:

foreach (pos; iota(sink.data.length - 4, 0, -4))
{
...
}
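
Put together, the test might look like the sketch below. Two assumptions are worth flagging: a runtime regex() stands in for ctRegex to keep compile times short, and the iota range starts at length - 4 and steps by -4 so that the visited positions are exactly where the commas should be.

```d
import std.array : appender;
import std.conv : text;
import std.random : uniform;
import std.range : iota;
import std.regex : regex, replaceAllInto;

void main()
{
    // Assumption: runtime regex instead of ctRegex, same pattern and flags.
    auto re = regex(`(?<=\d)(?=(\d\d\d)+\b)`, "g");
    auto sink = appender!(char[])();
    enum ulong min = 10UL ^^ 10, max = 10UL ^^ 19;

    foreach (i; 0 .. 50)
    {
        sink.clear();
        replaceAllInto(sink, text(uniform(min, max)), re, ",");
        // Commas sit at length - 4, length - 8, and so on. Index 0 is
        // always a digit, so an exclusive lower bound of 0 skips nothing.
        foreach (pos; iota(sink.data.length - 4, 0, -4))
            assert(sink.data[pos] == ',');
    }
}
```

Every generated number has at least 11 digits, so sink.data.length - 4 never underflows here; shorter inputs would need a length check first.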

>
> I didn't intentionally do that wrap-around trick, but I can't implicitly
> convert an unsigned type to a signed one either, and using to!long is
> equally awkward, not to mention that it would not work for overly long
> strings (though that doesn't apply in the present case).

See above.
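
For reference, the wrap-around that the quoted for loop relies on can be seen in a small sketch. The sample string is only an illustration of a 10-digit number after formatting.

```d
void main()
{
    enum sample = "1,234,567,890";

    size_t pos = 1;   // index of the first comma in the sample
    pos -= 4;         // would be -3 if signed; unsigned arithmetic wraps
    assert(pos == size_t.max - 2);

    // The wrapped value is far larger than the string length, which is
    // why the condition `pos < length` terminates the quoted loop.
    assert(!(pos < sample.length));
}
```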

-- 
Dmitry Olshansky

