[Issue 9674] New: std.algorithm.filter problems with non-deterministic _input.front
d-bugmail at puremagic.com
d-bugmail at puremagic.com
Sat Mar 9 08:05:15 PST 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9674
Summary: std.algorithm.filter problems with non-deterministic
_input.front
Product: D
Version: D2
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Phobos
AssignedTo: nobody at puremagic.com
ReportedBy: bearophile_hugs at eml.cc
--- Comment #0 from bearophile_hugs at eml.cc 2013-03-09 08:05:14 PST ---
This Python2.6 program:
from itertools import imap, ifilter
from sys import stdout
def fun(x):
stdout.write("*")
return x
list(ifilter(lambda x: True, imap(fun, xrange(5))))
Prints:
*****
While this similar D program:
import std.stdio: write;
import std.array: array;
import std.range: iota;
import std.algorithm: map, filter;
void main() {
iota(5).map!((x) {
write("*");
return x;
}).filter!(_ => true).array;
}
Prints:
**********
This is a problem if the function given to map is not deterministic, or if it's
costly to evaluate.
Removing both the iota() and the array() from the D code, using a counter, but
keeping both map() and filter():
import std.algorithm: map, filter;
int count = 0;
void main() {
auto r1 = [0, 0, 0, 0, 0];
auto r2 = map!((int x){ count++; return 0; })(r1);
auto r3 = filter!((int y){ return true; })(r2);
foreach (z; r3) {}
assert(count == 10); // passes.
}
To better show what's going on, this is a shortened version of the filter(), it
calls pred() only n times, and it calls _input.front more often:
struct FilterResult(alias pred, Range) {
Range _input;
this(Range r) {
_input = r;
while (!_input.empty && !pred(_input.front)) {
_input.popFront();
}
}
@property bool empty() { return _input.empty; }
void popFront() {
do {
_input.popFront();
} while (!_input.empty && !pred(_input.front));
}
@property auto ref front() {
return _input.front;
}
}
Calling _input.front more often means it assumes _input.front is _like_ a pure
function. But generally in D (and Python) this is not true.
I think at least it should be documented in std.algorithm ddocs that filter()
assumes the front() of its argument range to act like a pure function (despite
there isn't type sytem enforcement of this). But maybe it's better to change
std.algorithm.filter.
This modified filter() caches _input.front and it works with not deterministic
_input.front functions too:
import std.stdio, std.range, std.algorithm, std.traits, std.random;
struct FilterResult2(alias pred, Range) {
import std.traits: ForeachType;
Range _input;
ForeachType!Range rFront;
this(Range r) {
_input = r;
while (!_input.empty) {
this.rFront = _input.front;
if (pred(this.rFront))
break;
_input.popFront();
}
}
@property bool empty() { return _input.empty; }
void popFront() {
do {
_input.popFront();
if (_input.empty)
break;
this.rFront = _input.front;
if (pred(this.rFront))
break;
} while (true);
}
@property auto ref front() {
return this.rFront;
}
}
FilterResult2!(pred, Range) filter2(alias pred, Range)(Range rs) {
return FilterResult2!(pred, Range)(rs);
}
// Follows demo code------
auto distinct(Range)(Range r)
if (isInputRange!Range) {
bool[ForeachType!Range] mySet;
return r.filter!((k) {
if (k in mySet)
return false;
mySet[k] = true;
return true;
});
}
/**
This is a useful function, it yields lazily the distinct items
from the input iterable.
It is similar to uniq, but doesn't need the input range to be sorted.
*/
auto distinct2(Range)(Range r)
if (isInputRange!Range) {
bool[ForeachType!Range] mySet;
return r.filter2!((k) {
if (k in mySet)
return false;
mySet[k] = true;
return true;
});
}
void main() {
100.iota.map!(_ => uniform(0, 10))
.distinct.array.sort.writeln; // Bad output.
100.iota.map!(_ => uniform(0, 10))
.distinct2.array.sort.writeln; // Good output.
}
What are the downsides of replacing std.algorithm.filter() with something
similar to this filter2() (with the missing methods, Unqual, etc)?
This topic was discussed at length in this thread:
http://forum.dlang.org/thread/fcbtejylrwxfploajuto@forum.dlang.org
--
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
More information about the Digitalmars-d-bugs
mailing list