Why do 'IndexOfAny' and 'indexOf' do the same work?

FrankLike via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Jan 9 07:36:21 PST 2015


On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via 
Digitalmars-d-learn wrote:
> On Fri, 09 Jan 2015 13:54:00 +0000
> Robert burner Schadek via Digitalmars-d-learn
> <digitalmars-d-learn at puremagic.com> wrote:
>
>> On Friday, 9 January 2015 at 13:25:17 UTC, ketmar via 
>> Digitalmars-d-learn wrote:
>> > if you are *really* concerned with speed here, you'd better consider
>> > using regular expressions, as a regular expression can be precompiled
>> > and then search for multiple words with only one pass over the source
>> > string. i believe that std.regex will use a variation of the Thompson
>> > algorithm for regular expressions when it is able to do so.
>> 
>> IMO that is not sound advice. Creating the state machine and 
>> running it will be more costly than using canFind or indexOf, which 
>> basically only compare char by char.
>> 
>> If speed is really needed, use strstr and check whether it uses SSE to 
>> compare multiple chars at a time. Anyway, benchmark and then 
>> benchmark some more.
> std.regex can use CTFE to compile regular expressions (yet it is
> sometimes slower than the non-CTFE variant), and i mean that we compile
> the regexp before doing a lot of searches, not before each single
> search. if you have a lot of words to match or a lot of strings to
> check, regexp can give a huge boost.
>
> sure, it all depends on code patterns.
import std.regex;
auto ctr = ctRegex!(`(home|office|sea|plane)`);
auto c2 = !matchFirst("He is in the sea.", ctr).empty;
----------------------------------------------------------
Tested with: auto r = benchmark!(f0, f1, f2, f3, f4, f5)(10_0000);

Results:
filter is          42ms 85us
findAmong is       37ms 268us
foreach indexOf is 37ms 841us
canFind is         13ms
canFind indexOf is 39ms 455us
ctRegex is         138ms
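
The bodies of f0..f5 are not shown above, so here is only a rough sketch of how two of the candidates (a canFind loop and the ctRegex from the snippet above) could be timed against each other. The names hasWordCanFind, hasWordRegex, timeCanFind, timeRegex, sample and sink are made up for illustration, and the import assumes a current Phobos, where benchmark lives in std.datetime.stopwatch (in 2015 it was in std.datetime). Timings will differ from the numbers above.

```d
import std.algorithm.searching : canFind;
import std.datetime.stopwatch : benchmark;
import std.regex : ctRegex, matchFirst;
import std.stdio : writefln;

// Hypothetical test data, modelled on the snippet above.
immutable string[] words = ["home", "office", "sea", "plane"];
enum sample = "He is in the sea.";

// Plain substring search: one canFind pass per word.
bool hasWordCanFind(string s)
{
    foreach (string w; words)
        if (s.canFind(w))
            return true;
    return false;
}

// Regex search: the pattern is compiled at compile time via ctRegex.
bool hasWordRegex(string s)
{
    auto ctr = ctRegex!(`(home|office|sea|plane)`);
    return !matchFirst(s, ctr).empty;
}

// benchmark expects callables that take no arguments; the result is
// stored in a module-level variable so the call is not optimised away.
bool sink;
void timeCanFind() { sink = hasWordCanFind(sample); }
void timeRegex()   { sink = hasWordRegex(sample); }

void main()
{
    auto r = benchmark!(timeCanFind, timeRegex)(100_000);
    writefln("canFind is %s ms", r[0].total!"msecs");
    writefln("ctRegex is %s ms", r[1].total!"msecs");
}
```

With only four short words and one short string, the per-call overhead of the regex engine is likely to dominate, which would be consistent with canFind coming out ahead; a longer word list or longer input may shift the balance, so measure with your real data.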

