Why do the same work about 'IndexOfAny' and 'indexOf' function?

ketmar via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Jan 9 07:57:09 PST 2015


On Fri, 09 Jan 2015 15:36:21 +0000
FrankLike via Digitalmars-d-learn <digitalmars-d-learn at puremagic.com>
wrote:

> On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via 
> Digitalmars-d-learn wrote:
> > On Fri, 09 Jan 2015 13:54:00 +0000
> > Robert burner Schadek via Digitalmars-d-learn
> > <digitalmars-d-learn at puremagic.com> wrote:
> >
> >> On Friday, 9 January 2015 at 13:25:17 UTC, ketmar via 
> >> Digitalmars-d-learn wrote:
> >> > if you are *really* concerned with speed here, you'd better consider
> >> > using regular expressions, as a regular expression can be precompiled
> >> > and then search for multiple words with only one pass over the source
> >> > string. i believe that std.regex will use a variation of the Thompson
> >> > algorithm for regular expressions when it is able to do so.
> >> 
> >> IMO that is not sound advice. Creating the state machine and running
> >> it will be more costly than using canFind or indexOf, which basically
> >> only compare char by char.
> >> 
> >> If speed is really needed, use strstr and check whether it uses SSE to
> >> compare multiple chars at a time. Anyway, benchmark and then benchmark
> >> some more.
> > std.regex can use CTFE to compile regular expressions (yet it is
> > sometimes slower than the non-CTFE variant), and i mean that we compile
> > the regexp before doing a lot of searches, not before each single
> > search. if you have a lot of words to match or a lot of strings to
> > check, regexp can give a huge boost.
> >
> > sure, it all depends on code patterns.
> import std.regex;
> auto ctr = ctRegex!(`(home|office|sea|plane)`);
> auto c2 = !matchFirst("He is in the sea.", ctr).empty;
> ----------------------------------------------------------
> Tested with: auto r = benchmark!(f0, f1, f2, f3, f4, f5)(10_0000);
> 
> Results:
> filter:           42ms 85us
> findAmong:        37ms 268us
> foreach indexOf:  37ms 841us
> canFind:          13ms
> canFind indexOf:  39ms 455us
> ctRegex:          138ms
1. stop doing captures in the regexp; this will speed up the comparison.
2. your sample is very artificial. i was talking about a lot more
keywords and a lot longer strings. sorry, i didn't make that clear
enough.
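
A minimal sketch of point 1, assuming the same four keywords as the quoted
sample. Dropping the outer parentheses leaves a plain alternation with no
capture group; if grouping is needed, std.regex also accepts the
non-capturing form (?:...). The helper name hasKeyword and the second test
string are made up for illustration, and whether this actually beats
canFind still has to be benchmarked on real data.

```d
import std.regex : ctRegex, matchFirst;
import std.stdio : writeln;

// hypothetical helper: true if any of the keywords occurs in `text`
bool hasKeyword(string text)
{
    // same keywords as the quoted sample, but without the outer
    // parentheses, so no capture group is recorded
    auto re = ctRegex!(`home|office|sea|plane`);
    return !matchFirst(text, re).empty;
}

void main()
{
    // the pattern is built at compile time and reused for every string
    foreach (s; ["He is in the sea.", "He is at the desk."])
        writeln(s, " -> ", hasKeyword(s));
}
```

The regexp is meant to pay off only when the keyword list and the input
strings get much larger than in this toy case.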