How is std.regex.replaceAllInto more efficient?
Dmitry Olshansky via Digitalmars-d
digitalmars-d at puremagic.com
Fri Oct 16 08:03:45 PDT 2015
On 16-Oct-2015 04:50, Shriramana Sharma wrote:
> Hello. The doc on std.regex.replace{First,All}Into reads:
>
> A variation on replaceAll that instead of allocating a new string on
> each call outputs the result piece-wise to the sink. In particular this
> enables efficient construction of a final output incrementally.
>
> Example:
>
> //swap all 3 letter words and bring it back
> string text = "How are you doing?";
> auto sink = appender!(char[])();
> replaceAllInto!(cap => retro(cap[0]))(sink, text, regex(`\b\w{3}\b`));
> auto swapped = sink.data.dup; // make a copy explicitly
> assert(swapped == "woH era uoy doing?");
> sink.clear();
> replaceAllInto!(cap => retro(cap[0]))(sink, swapped,
> regex(`\b\w{3}\b`));
> assert(sink.data == text);
>
> Now IIUC the code, there are only two calls to replaceAllInto, and after the
> first call the string is dupped,
It is dupped only because we are going to use it as _input_ for the
second call. If the second call used some other source instead of
swapped, the intent of the example would be clearer. I agree it's not a
great example.
> in which case there is an allocation
> nevertheless, so how does using *into make for efficiency?
>
Consider that the appender is being reused in 2 replaceAllInto calls,
so the same allocated memory is reused. Otherwise one would have
to allocate on each call to replace.
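To make the reuse concrete, here is a minimal sketch, not taken from the docs. The helper name maskEach, the `\d+` pattern and the "#" replacement are all made up for illustration. One appender serves every input line, and the result is consumed in place rather than copied:

```d
import std.array : appender;
import std.regex : regex, replaceAllInto;

// Hypothetical helper: masks digit runs in each line and hands the result
// to `use` without keeping it, so one buffer serves every line.
void maskEach(string[] lines, scope void delegate(const(char)[]) use)
{
    auto re = regex(`\d+`);
    auto sink = appender!(char[])();      // created once, before the loop
    foreach (line; lines)
    {
        sink.clear();                     // drop old contents, keep the buffer
        replaceAllInto(sink, line, re, "#");
        use(sink.data);                   // only valid until the next clear()
    }
}

void main()
{
    size_t total;
    maskEach(["id 42", "no digits", "7 and 1000"],
             (const(char)[] s) { total += s.length; });
    assert(total == 20);                  // "id #" + "no digits" + "# and #"
}
```

A copy (dup/idup) is needed only if a result has to outlive the next clear(), which is exactly the situation in the doc example.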
> Or is it that "each call" means each match of the regex? So there aren't
> just two calls, and that without the sink and *Into, each match of the regex
> will cause a new string allocation and this is what is being avoided?
No. Conceptually, replace does what replaceInto does but creates a new
appender on each call. That is the overhead replaceInto is meant to avoid.
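As a rough sketch of that relationship (replaceAllSketch is a hypothetical name, and the real Phobos code may differ in detail), a non-Into variant can be pictured as a thin wrapper that pays for a new buffer every time:

```d
import std.array : appender;
import std.regex : regex, replaceAll, replaceAllInto;

// Hypothetical stand-in for what replaceAll does conceptually;
// this is an illustration, not the library's actual implementation.
string replaceAllSketch(RegEx)(string input, RegEx re, string format)
{
    auto sink = appender!string();        // fresh buffer on every call
    replaceAllInto(sink, input, re, format);
    return sink.data;                     // the buffer becomes the result
}

void main()
{
    auto re = regex(`\d+`);
    // Same output as the library function, for this input at least.
    assert(replaceAllSketch("a1b22", re, "#") == replaceAll("a1b22", re, "#"));
    assert(replaceAllSketch("a1b22", re, "#") == "a#b#");
}
```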
> If so, why can't the default implementation of replace{First,All} itself use
> an internal sink instead of needing the user to manually specify it?
It does. replaceFirst is replaceFirstInto + allocation of an appender.
If you already have an appender you can reuse it. Even better, if you
only mean to print out the result, you can pass
File("some-file", "w").lockingTextWriter as the sink, bypassing the
allocation completely.
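A small sketch of the file-as-sink idea follows. The helper maskToFile, the path "masked.txt" and the "***" replacement are invented for the example. Note that the file has to be opened for writing:

```d
import std.regex : regex, replaceAllInto;
import std.stdio : File;

// Hypothetical helper; the path and the "***" replacement are illustrative.
void maskToFile(string path, string text)
{
    auto file = File(path, "w");          // "w" so the file is writable
    auto re = regex(`\b\w{3}\b`);
    // The writer is the sink: each piece of output goes to the file
    // as it is produced, and no result string is built.
    replaceAllInto(file.lockingTextWriter, text, re, "***");
}   // the file is closed once the last reference to it is gone

void main()
{
    import std.file : readText, remove;
    maskToFile("masked.txt", "How are you doing?");
    assert(readText("masked.txt") == "*** *** *** doing?");
    remove("masked.txt");                 // clean up the demo file
}
```

The same call should work with stdout.lockingTextWriter if the output is meant for the terminal.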
--
Dmitry Olshansky