Why do the same work about 'IndexOfAny' and 'indexOf' function?

FrankLike via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Jan 9 08:28:01 PST 2015


On Friday, 9 January 2015 at 15:57:21 UTC, ketmar via 
Digitalmars-d-learn wrote:
> On Fri, 09 Jan 2015 15:36:21 +0000
> FrankLike via Digitalmars-d-learn 
> <digitalmars-d-learn at puremagic.com>
> wrote:
>
>> On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via 
>> Digitalmars-d-learn wrote:
>> > On Fri, 09 Jan 2015 13:54:00 +0000
>> > Robert burner Schadek via Digitalmars-d-learn
>> > <digitalmars-d-learn at puremagic.com> wrote:
>> >
>> >> On Friday, 9 January 2015 at 13:25:17 UTC, ketmar via 
>> >> Digitalmars-d-learn wrote:
>> >> > if you are *really* concerned with speed here, you'd better
>> >> > consider using regular expressions, as a regular expression can
>> >> > be precompiled and then search for multiple words with only one
>> >> > pass over the source string. i believe that std.regex will use a
>> >> > variation of the Thompson algorithm for regular expressions when
>> >> > it is able to do so.
>> >> 
>> >> IMO that is not sound advice. Creating the state machine
>> >> and running it will be more costly than using canFind or
>> >> indexOf, which basically only compare char by char.
>> >> 
>> >> If speed is really needed, use strstr and check whether it uses
>> >> SSE to compare multiple chars at a time. Anyway, benchmark and
>> >> then benchmark some more.
>> > std.regex can use CTFE to compile regular expressions (yet it is
>> > sometimes slower than the non-CTFE variant), and i mean that we
>> > compile the regexp before doing a lot of searches, not before each
>> > single search. if you have a lot of words to match or a lot of
>> > strings to check, regexp can give a huge boost.
>> >
>> > sure, it all depends on code patterns.
>> import std.regex;
>> auto ctr = ctRegex!(`(home|office|sea|plane)`);
>> auto c2 = !matchFirst("He is in the sea.", ctr).empty;
>> ----------------------------------------------------------
>> Test by  auto r = benchmark!(f0,f1, f2, f3,f4,f5)(10_0000);
>> 
>> Result is :
>> filter is          42ms 85us
>> findAmong is       37ms 268us
>> foreach indexOf is 37ms 841us
>> canFind is         13ms
>> canFind indexOf is 39ms 455us
>> ctRegex is         138ms
> 1. stop doing captures in regexp, this will speed up the
> comparison.
> 2. your sample is very artificial. i was talking about a lot more
> keywords and a lot longer strings. sorry, i didn't say that clearly
> enough.

Yes, regex should do better with 'a lot more keywords and a lot longer
strings'.
Thank you.
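
For reference, here is a minimal sketch of the two approaches discussed in this thread, assuming a current Phobos (`std.regex` and `std.algorithm.searching`). The helper names `containsKeywordPlain` and `containsKeywordRegex` and the second sample sentence are made up for illustration. The pattern drops the capture group, as suggested in point 1, and the regex is built once and reused. Which variant is faster depends on the number of keywords and the length of the input, so it is not a substitute for benchmarking with real data.

```d
import std.algorithm.searching : canFind;
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

// Keyword list taken from the ctRegex sample in this thread.
immutable string[] keywords = ["home", "office", "sea", "plane"];

// Plain substring search: one canFind pass over the text per keyword.
bool containsKeywordPlain(string text)
{
    foreach (string word; keywords)
        if (text.canFind(word))
            return true;
    return false;
}

// Regex search: the caller compiles the pattern once and passes it in,
// so the compilation cost is not paid on every search.
bool containsKeywordRegex(string text, Regex!char re)
{
    return !matchFirst(text, re).empty;
}

void main()
{
    // Alternation without parentheses, so no capture group is recorded.
    auto re = regex(`home|office|sea|plane`);

    foreach (text; ["He is in the sea.", "He is in the garden."])
        writeln(text, " -> plain: ", containsKeywordPlain(text),
                ", regex: ", containsKeywordRegex(text, re));
}
```

With only four short keywords and a short sentence, the plain search is likely to win, which matches the timings quoted above; the regex variant is meant for the case of many keywords and long inputs.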