Does something like std.algorithm.iteration:splitter with multiple separators exist?

Simen Kjaeraas via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Wed Mar 23 19:15:21 PDT 2016


On Wednesday, 23 March 2016 at 18:10:05 UTC, ParticlePeter wrote:
> Thanks Simen,
> your tokenCounter is inspirational, for the rest I'll take some 
> time for testing.

My pleasure. :) Testing it on your example data shows it to work 
there. However, as stated above, the documentation says the 
behavior is undefined when the predicate is not an equivalence 
relation, so future changes (even optimizations and bugfixes) to 
Phobos could make it stop working:

"This predicate must be an equivalence relation, that is, it must 
be reflexive (pred(x,x) is always true), symmetric (pred(x,y) == 
pred(y,x)), and transitive (pred(x,y) && pred(y,z) implies 
pred(x,z)). If this is not the case, the range returned by 
chunkBy may assert at runtime or behave erratically."
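
For comparison, here is a minimal sketch of a predicate that does 
satisfy those requirements: two lines count as "equal" when they 
agree on whether they are separators. Comparing the results of one 
function with == is reflexive, symmetric and transitive. Note that 
isSeparator, its marker strings and splitOnSeparators are 
hypothetical names made up for illustration; they are not from 
your code or from Phobos.

```d
import std.algorithm.iteration : chunkBy, filter, map;
import std.array : array;

// Hypothetical separator test; the marker strings are assumptions.
bool isSeparator(string line)
{
    return line == "---" || line == "===";
}

// Groups consecutive lines by "is it a separator?", then drops the
// groups that consist of separator lines.
string[][] splitOnSeparators(string[] lines)
{
    return lines
        // Equivalence relation: both lines map to the same bool.
        .chunkBy!((a, b) => isSeparator(a) == isSeparator(b))
        // A chunk is never empty, so looking at its front is safe.
        .filter!(chunk => !isSeparator(chunk.front))
        // Materialize each chunk; this step allocates.
        .map!(chunk => chunk.array)
        .array;
}
```

With this grouping, a run of adjacent separator lines collapses 
into a single chunk, so no empty sections are produced. The two 
array calls are only there to return a plain string[][]; leaving 
them out keeps the whole pipeline lazy.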

> But some additional thoughts from my side:
> I get all the lines of the file into one range. Calling array 
> on it should give me an array, but how would I use find to get 
> an index into this array?
> With the indices I could slice up the array into four slices, 
> no allocation required. If there is no easy way to just get an 
> index instead of a range, I would try to use something like 
> the tokenCounter to find all the indices.

The chunkBy example should not allocate. chunkBy itself is lazy, 
as are its sub-ranges. No copying of string contents is 
performed. So unless you have very specific reasons to use 
slicing, I don't see why chunkBy shouldn't be good enough.
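
If you do want indices, find is not the right tool, since it 
returns the remaining range. countUntil from 
std.algorithm.searching returns the position instead. A minimal 
sketch, assuming the lines are already in a string[]; splitAt and 
Halves are hypothetical names for illustration:

```d
import std.algorithm.searching : countUntil;

// Hypothetical helper type holding the two slices.
struct Halves
{
    string[] before;
    string[] after;
}

// Splits `lines` around the first line equal to `marker`.
// Both results are slices of `lines`; no line is copied.
Halves splitAt(string[] lines, string marker)
{
    // countUntil returns -1 when nothing matches.
    immutable idx = lines.countUntil(marker);
    if (idx < 0)
        return Halves(lines, null);
    return Halves(lines[0 .. idx], lines[idx + 1 .. $]);
}
```

Applying the same step again to the `after` slice would give you 
the next section, and so on.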

Full disclosure:
There is a malloc call in RefCounted, which is used for 
optimization purposes when chunkBy is called on a forward range. 
When chunkBy is called on an array, that's a 6-word allocation 
(24 bytes on 32-bit, 48 bytes on 64-bit), happening once. There 
are no other dependencies that allocate.

Such is the beauty of D. :)

--
   Simen

