Grokking ranges: some new algorithms and ranges

Philippe Sigaud philippe.sigaud at gmail.com
Sun Nov 1 15:23:32 PST 2009


Hello,

(Uh, first time poster, so hi to all!)

I discovered D2 ranges a few months ago and decided to get a grip on them for the past
few weeks by coding some new ranges. As I'm also reading on Haskell, Clojure and Scala,
I tried to code in D some functions seen elsewhere.

All in all, I'm having a lot of fun and feel like I'm beginning to grok templates and ranges. I've
a module with some new functions inspired by those of std.range and std.algorithm and was wondering if they could be interesting for someone else.

Here is what I have right now (I've a few other small functions I'll write this week). I hope the list is not too indigest... If someone is interested by discussing them, I'm all ears. I also don't know if it's exactly kasher to post a list like this (at least I didn't attach the module), so tell me if I must move the discussion somewhere else.

Algorithms:

* bimap: an extension of map for mapping binary function on a range -> bimap!"a+b"(r) (functions can be standard functions or strings, like in std.algorithm). Application: difference lists (bimap!"b-a"(r))
* nmap : an extension of bimap to map a n-ary function on a range -> nmap!("a*b+sin(c)", 3)(r). Functions can be standard funs or strings. For strings, you must indicate the desired arity (unary, binary, whatever). Application: moving average: nmap!("(a+b+c+d)*0.25", 4)(r)

btw, this uses a template called naryFun(alias fun, size_t arity) which is the generalization of std.functional.unaryFun and .binaryFun: it produces a templated n-args function from a string, the arguments being in the 'a'-'z' range. I've yet to code a CT heuristics determining the probable arity from the string (I've a runtime one, it works, I just have to translate it into templates).

* bifilter: same idea, extending std.algorithm.filter to work with binary predicates -> filter"a>b"(r) -> returns a lazy range whose elements are the pairs verifying the predicate.
* nfilter: as before, a generalization of this idea: filtering data with a n-ary (n-args) predicate. Returns arrays of length n satisfying the predicate -> nfilter!("a<b && b<c",3)(r)

For all these functions, there is also an optional argument: the numbers of steps taken on the range at each popFront. That is : 
int[] r1 = [0,1,2,3,4,5,6,7];
bfilter!"a<b"(r1) : do you want to apply "a<b" on [0,1], [1,2], [2,3] (overlapping segments), ... or on [0,1], [2,3], [4,5] (disjoint segments)?

* tmap: maps a n-args function on n different ranges in parallel (in lockstep) -> nmap!"max"(r1, r2,r3) -> will lazily produce a range with the max of each tuple. For a 'string' function (tmap!"a+b"(r1, r2)) 'a' means elements from the first range, 'b'from  the second, etc.
* tfilter: filters n ranges in parallel with a n-args predicate.
In both cases, the range can have different element types, as long as the predicate or the mapped function is compatible.
Example: say I've r1 = [0,1,2,3], r2 = "abcdef". tmap!"replicate(a+1,b)"(r1, r2) will produce "a", "bb", "ccc", "dddd" and then stops, as r1 is exhausted.

* comprehension: a first try, maybe watered-down, list comprehension for D. The syntax is comprehension!(gen, pred)(ranges). It internally (and lazily) produces the combinations of all input ranges' elements, filter them with n-args pred and apply n-args gen on them.
For example:

// find all pythagorean triplets (triplets (a,b,c) verifying a*a+b*b==c*c) for a,b, and c between 1 and 9. 
// numbers(1,10) is just a small range, akin to iota.
auto lc = comprehension!("tuple(a,b,c)", "a*a + b*b == c*c")(numbers(1,10), numbers(1,10), numbers(1,10))
Output : tuple(3,4,5), tuple(4,3,5)

* setComprehension: as before, but elements are given only once. Uses a range wrapper AsSet!R which filter values already produced.
* parallelComprehension: as before, inspired by Haskell. Just iterates the input ranges in parallel (in lockstep), filters them and generates the values.

* flatMap: maps a range-returning function on elements of a range, and flatten the resulting ranges. If I understand 'Programming in Scala' correctly, that may be the last necessary element to have full-blown list comprehension in D...

* reduceR: well, Haskell has fold, foldr, so I guess D needed it also. Reduces a range 'from the right'. Useful for some problems (for example, computing a real given its continued fraction).
* iterate!(fun)(seed): repeatedly apply fun to seed, produces a range of values.
Example: 
	auto powersOf2 = iterate!"2*a"(1L); // 1, 2, 4, 8, 16, ... (as longs)
	auto sqrt2 = iterate!"(a+2/a)/2"(1.0); // converges to sqrt(2) -> 1.41421...
* scan / scanR: produces a lazy range with all intermediate values of reduce/reduceR. Useful to get, well, intermediate values.
Example: auto factorials = scan!"a*b"(1L, numbers(1, int.max)) -> 1, 1, 2, 6, 24, 120, ...
* unfold: equivalent to Haskell unfold. It's the opposite of reduce: given a seed value and a function, it will generates an infinite range.
I've some nice examples using it to lazily calculate prime numbers or the continued fraction of a real. It's a cousin of std.range.recurrence.

operation on ranges:

* some!pred(range) returns true iff some element in range verify predicate pred
* all!pred(range) returns true iff all elements in range verify pred.
* drop(n, range) returns a copy of range with the n first values dropped
* dropWhile!pred(range) drops value as long as pred if verified -> dropWhile!"a<3"(r). It works also with a n-args predicate.
* popFrontWhile!pred(range). As st.range.popFrontN, it affects the target range.
* cutAt(n, range): cuts a range in two parts at index n. Returns a tuple.
* cutWhen!pred(range): cuts a range in two parts when pred is verified.
* takeWhile!pred(r): takes value in a range as long as pred is verified. Exists for unary predicates, but also for n-ary predicates.
* contains(r1, r2): returns true iff r1 contains all r2 (as one block)
* rotate(n, range): takes the first n elements, chain them at the end.

new ranges:

* replicateRange(n, range): returns a range copied n times.
* cartesianProduct(r1, r2): a range whose elements are all possible pairs of elements (as tuples).
* combinations(r1, r2, ...) a generalization of cartesianProduct, for n ranges. Lazily produces all possible combinations of elements by taking
one in each input range.
* flatten!level(range): flatten a range of ranges (of rank as deep as you want). Default behavior is maximum flattening:
int[] r1 = [0,1,2,3];
int[][] r2 = [r1, [4], [5,6]];
int[][][] r3 = [r2, [[7], [8]], r2];
assert(equal(flatten(r3), [0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6][])); // Max flat
* concat: just one level of flattening. Akin the Haskell concat.
* pairRange: a small zip-like range taking a two ranges. Has for values the 2-tuples created by the input ranges values.
* tupleOfRanges: the same, but generalized to n ranges. I coded this even as std.range.zip exists, as Zip produce Proxy tuples (to allow iteration  beyond the shortest range and to allow swapping/sorting), but I needed something with std.typecons.Tuples as element types. That way interoperation between ranges is easier.
* unzip : extracts a range from a PairRange/TupleRange
* permutations. As the name says, it's a range producing all possible permutations of another range's elements. Don't use it on ranges with more than 17 elements, as 18! > size_t.max...
* stutter!n(range): takes a range and repeat n times its elements. Sorry for the name, but repeat and replicate were already taken.
Eg: stutter!3([0,1,2,3][]) -> 0,0,0,1,1,1,2,2,2,3,3,3.
* extremities(range): alternates between range's two extremities. A cousin of std.range.Radial
extremities([0,1,2,3,4,5][]) -> 0,5,1,4,2,3.
* transverse(r1, r2, r3, ...): like std.range.Transversal, but iterates on all 'columns', stopping when one of the ranges is exhausted. Transverse has for element type the common types of the input ranges.
* segment!(segmentLength, segmentStep)(range): segments a range into segmentLength elements segments, advancing by segmentStep steps.
segment!!3([0,1,2,3,4,5,6][]) -> [0,1,2], [1,2,3], [2,3,4], ... (step defaults to 1).
* without(r1, r2): a range like r1, but with elements of r2 dropped.
 "banana".without("aaa") -> "bnn".

Ok, that's about it. Thanks for those who read it all.
I've some other things, but more minor. If someone is interested, I can send them the module (it's only one module for now). I'm no professional coder, so my code may well be awful, I honestly don't know.

I've also some ideas about extending some std.range/algorithm: for example Map could be a bidirectional range if the mapped-on range is bidir, and also have a length or give random-access if possible. Chain has some bugs with infinite ranges which are easily squashed, and so on.

Ah, and I also have a Chainable!() template to inject some templated opCat capabilities in ranges, as long as Chain has some also. It allows you to write delicious-looking code like this:
auto p = [2,3] ~ unfold!nextPrime([2,3]);
auto p2 = filter!"a<100"(p) ~ map!"a%3"(take(100, p)); // Or whatever...


Bye,

Philippe





More information about the Digitalmars-d mailing list