Does something like std.algorithm.iteration:splitter with multiple separators exist?
ParticlePeter via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Wed Mar 23 11:10:05 PDT 2016
On Wednesday, 23 March 2016 at 15:23:38 UTC, Simen Kjaeraas wrote:
> Without a bit more detail, it's a bit hard to help.
>
> std.algorithm.splitter has an overload that takes a function
> instead of a separator:
>
> import std.algorithm;
> auto a = "a,b;c";
> auto b = a.splitter!(e => e == ';' || e == ',');
> assert(equal(b, ["a", "b", "c"]));
>
> However, not only are the separators lost in the process, it
> only allows single-element separators. This might be good
> enough given the information you've divulged, but I'll hazard a
> guess it isn't.
>
> My next stop is std.algorithm.chunkBy:
>
> auto a = ["a","b","c", "d", "e"];
> auto b = a.chunkBy!(e => e == "a" || e == "d");
> auto result = [
>     tuple(true, ["a"]), tuple(false, ["b", "c"]),
>     tuple(true, ["d"]), tuple(false, ["e"])
> ];
>
> No assert here, since the ranges in the tuples are not arrays.
> My immediate concern is that two consecutive tokens with no
> intervening values will mess it up. Also, the result looks a
> bit messy. A little more involved, and according to
> documentation not guaranteed to work:
>
> bool isToken(string s) {
>     return s == "a" || s == "d";
> }
>
> bool tokenCounter(string s) {
>     static string oldToken;
>     static bool counter = true;
>     if (s.isToken && s != oldToken) {
>         oldToken = s;
>         counter = !counter;
>     }
>     return counter;
> }
>
> unittest {
>     import std.algorithm;
>     import std.stdio;
>     import std.typecons;
>     import std.array;
>
>     auto a = ["a","b","c", "d", "e", "a", "d"];
>     auto b = a.chunkBy!tokenCounter.map!(e=>e[1]);
>     auto result = [
>         ["a", "b", "c"],
>         ["d", "e"],
>         ["a"],
>         ["d"]
>     ];
>     writeln(b);
>     writeln(result);
> }
>
> Again no assert, but b and result have basically the same
> contents. Also handles consecutive tokens neatly (but
> consecutive identical tokens will be grouped together).
>
> Hope this helps.
>
> --
> Simen
Thanks Simen,
your tokenCounter is inspirational; for the rest I'll need some
time for testing.
But some additional thoughts from my side:
I get all the lines of the file into one range. Calling array on
it should give me an array, but how would I use find to get an
index into this array?
With the indices I could slice up the array into four slices, no
allocation required. If there is no easy way to just get an index
instead of a range, I would try to use something like the
tokenCounter to find all the indices.
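
A minimal sketch of the index-and-slice idea, for illustration only. It
assumes the sections are introduced by marker lines; the markers "#A",
"#B", "#C", the sample lines, and the helper indexOfLine are all
hypothetical. std.algorithm.searching.countUntil returns the index of
the first match (or -1), and an index can also be recovered from find
by comparing lengths:

```d
import std.algorithm.searching : countUntil, find;

/// Hypothetical helper: index of the first line equal to `marker`, or -1.
ptrdiff_t indexOfLine(const(string)[] lines, string marker)
{
    return lines.countUntil!(l => l == marker);
}

void main()
{
    // Made-up stand-in for the lines read from the file.
    string[] lines = ["h1", "h2", "#A", "a1", "#B", "b1", "b2", "#C", "c1"];

    auto ia = indexOfLine(lines, "#A");
    auto ib = indexOfLine(lines, "#B");
    auto ic = indexOfLine(lines, "#C");
    assert(ia == 2 && ib == 4 && ic == 7);

    // Four slices into the same array; slicing does not copy the elements.
    auto head = lines[0 .. ia];
    auto secA = lines[ia + 1 .. ib];
    auto secB = lines[ib + 1 .. ic];
    auto secC = lines[ic + 1 .. $];
    assert(head == ["h1", "h2"]);
    assert(secA == ["a1"]);
    assert(secB == ["b1", "b2"]);
    assert(secC == ["c1"]);
    assert(secA.ptr == lines.ptr + ia + 1); // same memory, no allocation

    // find returns the rest of the range starting at the match, so the
    // index is the difference between the two lengths.
    auto rest = lines.find("#B");
    assert(lines.length - rest.length == ib);
}
```

A real version would have to check for -1 (marker not found) before
slicing.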