How is std.regex.replaceAllInto more efficient?

Dmitry Olshansky via Digitalmars-d digitalmars-d at puremagic.com
Fri Oct 16 08:03:45 PDT 2015


On 16-Oct-2015 04:50, Shriramana Sharma wrote:
> Hello. The doc on std.regex.replace{First,All}Into reads:
>
>      A variation on replaceAll that instead of allocating a new string on
> each call outputs the result piece-wise to the sink. In particular this
> enables efficient construction of a final output incrementally.
>
>      Example:
>
>      //swap all 3 letter words and bring it back
>      string text = "How are you doing?";
>      auto sink = appender!(char[])();
>      replaceAllInto!(cap => retro(cap[0]))(sink, text, regex(`\b\w{3}\b`));
>      auto swapped = sink.data.dup; // make a copy explicitly
>      assert(swapped == "woH era uoy doing?");
>      sink.clear();
>      replaceAllInto!(cap => retro(cap[0]))(sink, swapped,
> regex(`\b\w{3}\b`));
>      assert(sink.data == text);
>
> Now IIUC the code, there are only two calls to replaceAllInto, and after the
> first call the string is dupped,

It is dupped only because we are going to use it as _input_ for the 
second call. If the second call used some other source instead of 
swapped, the intent of the example would be clearer. I agree it's not a 
great example.

> in which case there is an allocation
> nevertheless, so how does using *into make for efficiency?
>

Consider that the appender is reused across the 2 replaceAllInto calls, 
so the same allocated memory is reused. Otherwise one would have 
to allocate on each call to replace.
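
To make the reuse concrete, here is a minimal sketch that runs one 
appender over several inputs. The helper name eachMasked, the pattern and 
the sample strings are made up for illustration; only appender, clear and 
replaceAllInto come from Phobos.

```d
import std.array : appender;
import std.regex : regex, replaceAllInto;
import std.stdio : writeln;

// Hypothetical helper: replaces every run of digits with '#', handing each
// result to `use`. The slice passed to `use` is only valid until the next
// iteration, because the same buffer is overwritten.
void eachMasked(string[] lines, scope void delegate(const(char)[]) use)
{
    auto re = regex(`\d+`);
    auto sink = appender!(char[])();   // created once, reused for every line

    foreach (line; lines)
    {
        sink.clear();                        // drops the contents, keeps the memory
        replaceAllInto(sink, line, re, "#"); // writes into the existing buffer
        use(sink.data);
    }
}

void main()
{
    eachMasked(["a1b22", "c333d", "no digits"],
               delegate(const(char)[] s) { writeln(s); });
}
```

Once the buffer has grown large enough for the longest result, later 
iterations should not need to allocate a new one. Anything that must 
outlive the loop still has to be copied, as with the .dup in the doc example.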

> Or is it that "each call" means each match of the regex? So there aren't
> just two calls, and that without the sink and *Into, each match of the regex
> will cause a new string allocation and this is what is being avoided?

No. Conceptually, replace does what replaceInto does but creates a new 
appender on each call. This is the overhead replaceInto is meant to avoid.
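
As a rough sketch of that relationship (not the actual Phobos source, and 
myReplaceAll is a made-up name), the allocating variant can be pictured as 
a thin wrapper around the *Into one:

```d
import std.array : appender;
import std.regex : Regex, regex, replaceAllInto;

// Hypothetical wrapper, for illustration only: a fresh appender is created
// on every call, which is the per-call cost that the *Into variants let the
// caller avoid by supplying a sink of their own.
string myReplaceAll(string input, Regex!char re, string fmt)
{
    auto sink = appender!string();   // new buffer for each call
    replaceAllInto(sink, input, re, fmt);
    return sink.data;
}

void main()
{
    auto re = regex(`\d+`);
    assert(myReplaceAll("a1b22", re, "#") == "a#b#");
}
```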

> If so, why can't the default implementation of replace{First,All} itself use
> an internal sink instead of needing the user to manually specify it?

It does. replaceFirst is replaceFirstInto + allocation of an appender.
If you already have an appender you can reuse it. Even better, if you 
only meant to print out the result, you can pass it 
File("some-file", "w").lockingTextWriter, bypassing allocation of the 
result string completely.
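
A small sketch of the streaming case follows. The helper maskShortWords, 
the "XXX" replacement and the use of stdout rather than a named file are 
assumptions made for the example; the sink can be any output range of 
characters.

```d
import std.regex : regex, replaceAllInto;
import std.stdio : stdout;

// Hypothetical helper: writes `text` to `sink` with every 3-letter word
// replaced by "XXX". No result string is built; the pieces go straight
// to whatever output range the caller supplies.
void maskShortWords(Sink)(Sink sink, string text)
{
    replaceAllInto(sink, text, regex(`\b\w{3}\b`), "XXX");
}

void main()
{
    // lockingTextWriter is an output range over the file, so the
    // replaced text is written directly to the stream.
    maskShortWords(stdout.lockingTextWriter(), "How are you doing?\n");
}
```

Because the sink is a template parameter, the same function works with an 
appender when the result is needed in memory instead.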


-- 
Dmitry Olshansky

