Is str ~ regex the root of all evil, or the leaf of all good?
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Thu Feb 19 07:46:47 PST 2009
Derek Parnell wrote:
> On Thu, 19 Feb 2009 07:01:56 -0800, Andrei Alexandrescu wrote:
>
>> These all put the regex before the string, something many people would
>> find unsavory.
>
> I don't. To me the regex is what you are looking for so it's like saying
> "find this pattern in that string".
Yah, but to most others it's "match this string against that pattern".
Again, regexes have a long history behind them. So probably we need to
have both "find" and "match", each with its own order of arguments.
Anyway, std.algorithm defines find() like this:
find(haystack, needle)
In the least structured case, the haystack is a range and the needle is
either an element or another range. But then, hey, we can get more
efficient finds by using a more structured haystack and/or a more
structured needle. So then:
string a = "conoco", b = "co";
// linear find
auto r1 = find(a, b[0]);
// quadratic find
auto r2 = find(a, b);
// organize a in a Boyer-Moore structure; sublinear find
auto r3 = find(boyerMoore(a), b);
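For comparison, here is a minimal sketch of the same three finds written against present-day Phobos. It rests on assumptions: the helper names findElement, findSubstring and findBoyerMoore are hypothetical, and boyerMooreFinder is the spelling std.algorithm.searching offers today, which builds its tables from the needle rather than from the haystack as boyerMoore(a) suggests. It is not the implementation proposed in this post.

```d
import std.algorithm.searching : boyerMooreFinder, find;
import std.stdio : writeln;

// Hypothetical helper names, used only for this illustration.

// Element needle: a single linear scan of the haystack.
string findElement(string haystack, char needle)
{
    return find(haystack, needle);
}

// Range needle: plain substring search with no preprocessing.
string findSubstring(string haystack, string needle)
{
    return find(haystack, needle);
}

// Structured needle: the Boyer-Moore tables are built from the needle,
// then reused while scanning the haystack.
string findBoyerMoore(string haystack, string needle)
{
    return find(haystack, boyerMooreFinder(needle));
}

void main()
{
    string a = "conoco", b = "co";
    writeln(findElement(a, b[0]));
    writeln(findSubstring(a, b));
    writeln(findBoyerMoore(a, "no"));
}
```

In all three calls the haystack stays in first position; only the amount of structure attached to the needle changes.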
I'll actually implement the above, it's pretty nice. Now the question
is, what's the haystack and what's the needle in a regex find?
auto r3 = find("conoco", regex("c[a-z]"));
or
auto r3 = find(regex("c[a-z]"), "conoco");
?
The argument could go both ways:
"Organize the set of 2-char strings starting with 'c' and ending with
'a' to 'z' into a structured haystack, then look for substrings of
"conoco" in that haystack."
versus
"Given the unstructured haystack conoco, look for a structured needle in
it that is any 2-char string starting with 'c' and ending with 'a' to 'z'."
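To make the two orders concrete, here is a small sketch assuming today's std.regex (regex, matchFirst, and the pre property of the returned captures). The wrappers findIn and matchAgainst are hypothetical names invented for this illustration; neither is a Phobos function, and neither settles which order is more natural.

```d
import std.regex : matchFirst, regex, Regex;
import std.stdio : writeln;

// Hypothetical wrapper, haystack first: "find this pattern in that string".
// Like find(), it returns the haystack advanced to the first match,
// or null when nothing matches.
string findIn(string haystack, Regex!char needle)
{
    auto m = matchFirst(haystack, needle);
    return m.empty ? null : haystack[m.pre.length .. $];
}

// Hypothetical wrapper, pattern first: same search, arguments swapped.
string matchAgainst(Regex!char pattern, string subject)
{
    return findIn(subject, pattern);
}

void main()
{
    auto re = regex("c[a-z]");
    writeln(findIn("conoco", re));       // haystack, then needle
    writeln(matchAgainst(re, "conoco")); // pattern, then subject
}
```

Both calls do the same work, so the choice between them is purely about how the call reads.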
What is the most natural way?
Andrei