Improving std.algorithm.find
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Sat Jul 17 15:55:26 PDT 2010
I was thinking of improving std.find. We have this bug report:
http://d.puremagic.com/issues/show_bug.cgi?id=3923
which is pretty vague but does have a point.
For starters, it should be told that std.algorithm.find _does_ a lot,
which at least partially justifies its complexity. One thing that I
haven't seen other find()s doing is to be able to search in one pass a
given range for multiple other ranges, like this:
int[] a = [ 1, 4, 2, 3 ];
assert(find(a, 5, 4, 3) == tuple([ 4, 2, 3 ], 2));
When passed more than two arguments, find returns a tuple continuing the
searched ranged positioned on the element found and the 1-based index of
the parameter that was found. The trick is that find() makes exactly one
pass through the searched range, which is often more efficient than
searching the same range for each element in turn. Also the one-pass
approach works with input ranges so it doesn't put pressure on the range
capabilities.
However the simplified find() looks like, I'd like to keep this feature
unless it brings serious aggravation. Right now it's the #1 factor that
complicates find's signature and implementation.
Another aspect I'd like to discuss is use of Boyer-Moore and related
fast finding techniques. Currently the use of B-M is explicit and a bit
awkward. I was thinking instead to automatically detect its
appropriatedness and use it transparently. Bearophile posted at a link
to such a blend string search routine
(http://effbot.org/zone/stringlib.htm) that I think can be generalized
quite easily.
Any ideas - please chime in.
Andrei
More information about the Digitalmars-d
mailing list