splitter trouble

John Colvin via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Tue Nov 1 03:59:22 PDT 2016


On Sunday, 30 October 2016 at 23:57:11 UTC, Ali Çehreli wrote:
> While working on a solution for Alfred Newman's thread, I came 
> up with the following interim solution, which compiled but 
> failed:
>
> auto parse(R, S)(R range, S separators) {
>     import std.algorithm : splitter, filter, canFind;
>     import std.range : empty;
>
>     static bool pred(E, S)(E e, S s) {
>         return s.canFind(e);
>     }
>
>     return range.splitter!pred(separators).filter!(token => 
> !token.empty);
> }
>
> unittest {
>     import std.algorithm : equal;
>     import std.string : format;
>     auto parsed = parse("_My   input.string", " _,.");
>     assert(parsed.equal([ "My", "input", "string" ]), 
> format("%s", parsed));
> }
>
> void main() {
> }
>
> The unit test fails and prints
>
> ["put", "ing"]
>
> not the expected
>
> ["My", "input", "string"].
>
> How is that happening? Am I unintentionally hitting a weird 
> overload of splitter?
>
> Ali

As usual, auto-decoding has plumbed the sewage line straight into 
the drinking water...

Splitter needs to know how far to skip when it hits a match. 
For the pred(r.front, s) overload that you're using here, the 
answer is normally always 1. The exception is narrow strings, 
where it's the encoded length of the separator in the encoding 
of the source range (in this case UTF-8), so that it can skip 
e.g. a big dchar.* In your case the separator is the whole 
string " _,.", which is 4 code units long, so every match skips 
4 code units instead of 1. That's how "_My   input.string" ends 
up as ["put", "ing"]. You only want to skip forward one, because 
your separator isn't really a separator, it's a set of them.

* see 
https://github.com/dlang/phobos/blob/d6572c2a44d69f449bfe2b07461b2f0a1d6503f9/std/algorithm/iteration.d#L3710
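
The skip arithmetic described above can be reproduced by hand. The 
following is only a sketch: splitSkipping is a hypothetical helper 
written for illustration, not Phobos code, and it assumes ASCII 
separators so that one code unit equals one character. It splits at 
any separator, drops empty tokens, and then skips a caller-chosen 
number of code units past each match.

```d
import std.algorithm.comparison : min;
import std.algorithm.searching : canFind;

// Hypothetical helper (not part of Phobos): split `input` at any code
// unit found in `separators`, skipping `skip` code units past each match.
string[] splitSkipping(string input, string separators, size_t skip)
{
    string[] tokens;
    while (input.length)
    {
        // Index of the first code unit that is one of the separators.
        size_t i = 0;
        while (i < input.length && !separators.canFind(input[i]))
            ++i;

        // Keep only non-empty tokens, mirroring the filter in parse.
        if (i > 0)
            tokens ~= input[0 .. i];

        // No separator left: the rest of the input was the last token.
        if (i == input.length)
            break;

        // Skip past the match, clamped to the end of the input.
        input = input[min(i + skip, input.length) .. $];
    }
    return tokens;
}

unittest
{
    enum separators = " _,.";

    // Skipping the encoded length of the whole separator string
    // (4 code units) reproduces the output reported in the question.
    assert(splitSkipping("_My   input.string", separators, separators.length)
           == ["put", "ing"]);

    // Skipping a single code unit gives the expected tokens.
    assert(splitSkipping("_My   input.string", separators, 1)
           == ["My", "input", "string"]);
}
```

Compile with -unittest -main to run the checks. The only difference 
between the two calls is the skip distance, which is the whole bug.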

Basically, what you're doing isn't going to work. A separator is 
considered to be a separator, i.e. something to be skipped over, 
and twisting the definition causes problems.

This will work, but I can't see any way to make it @nogc:

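auto parse(R, S)(R range, S separators) {
    import std.algorithm : splitter, filter, canFind;
    import std.range : save, empty;

    return range
        // Unary-predicate overload: an element is a terminator if it
        // is any one of the separators, so exactly one element is
        // skipped per match.
        .splitter!(e => separators.save.canFind(e))
        // Consecutive separators produce empty tokens; drop them.
        .filter!(token => !token.empty);
}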
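
As a quick check, the unit test from the original question can be 
run against this version. The sketch below repeats parse so that it 
is self-contained. The second assertion is an extra case added here 
for illustration, not something from the thread.

```d
import std.algorithm : canFind, equal, filter, splitter;
import std.range : empty, save;

// Same approach as above: a unary predicate decides whether a single
// element is a terminator, so the separator set is never treated as
// one multi-character separator.
auto parse(R, S)(R range, S separators)
{
    return range
        .splitter!(e => separators.save.canFind(e))
        .filter!(token => !token.empty);
}

unittest
{
    // The input and expectation from the original question.
    assert(parse("_My   input.string", " _,.")
           .equal(["My", "input", "string"]));

    // Illustrative extra case: two different single-character separators.
    assert(parse("a,b;c", ",;").equal(["a", "b", "c"]));
}
```

Compile with -unittest -main to run the checks.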

