Which option is faster...

Mon Aug 5 08:18:41 PDT 2013

On Monday, 5 August 2013 at 13:59:24 UTC, jicman wrote:
>
> Greetings!
>
> I have this code,
>
> foreach (...)
> {
>
>   if (std.string.tolower(fext[0]) == "doc" ||
>     std.string.tolower(fext[0]) == "docx" ||
>     std.string.tolower(fext[0]) == "xls" ||
>     std.string.tolower(fext[0]) == "xlsx" ||
>     std.string.tolower(fext[0]) == "ppt" ||
>     std.string.tolower(fext[0]) == "pptx")
>    continue;
> }
>
> foreach (...)
> {
>   if (std.string.tolower(fext[0]) == "doc")
>     continue;
>   if (std.string.tolower(fext[0]) == "docx")
>     continue;
>   if (std.string.tolower(fext[0]) == "xls")
>     continue;
>   if (std.string.tolower(fext[0]) == "xlsx")
>     continue;
>   if (std.string.tolower(fext[0]) == "ppt")
>     continue;
>   if (std.string.tolower(fext[0]) == "pptx")
>    continue;
>   ...
>   ...
> }
>
> thanks.
>
> josé

Better: call tolower once per iteration and reuse the result:

foreach (...)
{
     auto tmp = std.string.tolower(fext[0]);
     if(tmp == "doc" || tmp == "docx"
        || tmp == "xls" || tmp == "xlsx"
        || tmp == "ppt" || tmp == "pptx")
     {
         continue;
     }
}

This is still not super-fast: unless the compiler is very clever, 
each comparison makes its own pass over tmp. It also converts the 
whole string to lower case even when that's not necessary.
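
One way to avoid the up-front conversion is to compare character by 
character and bail out at the first mismatch. This is only a rough 
sketch: it assumes the extensions are plain ASCII, and the helper 
names equalsIgnoreAsciiCase and isOfficeExt are made up for 
illustration.

```d
import std.ascii : toLower;

// Hypothetical helper: compares s against an already lower-case word,
// ignoring ASCII case, without allocating a lower-cased copy of s.
bool equalsIgnoreAsciiCase(const(char)[] s, string lowerCaseWord)
{
    // Different lengths can never match, so skip the scan entirely.
    if (s.length != lowerCaseWord.length)
        return false;
    foreach (i, c; s)
    {
        // Stop at the first character that differs.
        if (toLower(c) != lowerCaseWord[i])
            return false;
    }
    return true;
}

// Hypothetical helper: true if ext is one of the six extensions
// from the question, in any mix of upper and lower case.
bool isOfficeExt(const(char)[] ext)
{
    static immutable string[] exts =
        ["doc", "docx", "xls", "xlsx", "ppt", "pptx"];
    foreach (string e; exts)
    {
        if (equalsIgnoreAsciiCase(ext, e))
            return true;
    }
    return false;
}
```

Inside the loop the test then becomes if (isOfficeExt(fext[0])) 
continue;. It still checks the candidates one after another, so it 
only removes the allocation and the full conversion, not the 
repeated passes.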

If you have large numbers of possible matches you will probably 
want to be clever with your data structures / algorithms. E.g.

You could create a tree-like structure (a trie) to quickly 
eliminate possibilities as you read successive letters. You read 
one character and look for the matching branch; if there is none, 
there is no match and you can break. Otherwise you follow the 
branch, read the next character, and so on. With one branch slot 
per possible character in every node this is infeasible for large 
(or even medium-sized) character sets without hashing, but it 
might be pretty fast for a-z and a large number of short strings.
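
A minimal sketch of that idea, assuming the alphabet is restricted 
to a-z so that each node can hold a fixed array of 26 branches. 
TrieNode, insert and contains are names invented for this example, 
not anything from Phobos.

```d
import std.ascii : toLower;

// Hypothetical trie node for the letters a-z only.
struct TrieNode
{
    TrieNode*[26] next;   // one branch per letter, null if absent
    bool isWord;          // true if a stored word ends at this node
}

// Adds a lower-case a-z word to the trie rooted at root.
void insert(TrieNode* root, string word)
{
    auto node = root;
    foreach (c; word)
    {
        assert(c >= 'a' && c <= 'z', "sketch only handles a-z");
        auto idx = c - 'a';
        if (node.next[idx] is null)
            node.next[idx] = new TrieNode;
        node = node.next[idx];
    }
    node.isWord = true;
}

// Case-insensitive lookup that reads each character of s at most once.
bool contains(const(TrieNode)* root, const(char)[] s)
{
    auto node = root;
    foreach (c; s)
    {
        auto lc = toLower(c);
        if (lc < 'a' || lc > 'z')
            return false;            // outside the alphabet: no match
        node = node.next[lc - 'a'];
        if (node is null)
            return false;            // no branch for this letter: no match
    }
    // Every character was consumed; match only if a word ends here.
    return node.isWord;
}
```

You would build the trie once before the loop by calling insert for 
each of "doc", "docx", "xls", "xlsx", "ppt" and "pptx", and then 
test contains(root, fext[0]) on each iteration. For only six 
strings the simple comparisons may well be just as fast, so measure 
before committing to this.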