Which option is faster...
monarch_dodra
monarchdodra at gmail.com
Mon Aug 5 10:04:34 PDT 2013
On Monday, 5 August 2013 at 15:18:42 UTC, John Colvin wrote:
>
> better:
>
> foreach (...)
> {
> auto tmp = std.string.tolower(fext[0]);
> if(tmp == "doc" || tmp == "docx"
> || tmp == "xls" || tmp == "xlsx"
> || tmp == "ppt" || tmp == "pptx")
> {
> continue;
> }
> }
>
> but still not super-fast as (unless the compiler is very
> clever) it still means multiple passes over tmp. Also, it
> converts the whole string to lower case even when it's not
> necessary.
>
> If you have large numbers of possible matches you will probably
> want to be clever with your data structures / algorithms. E.g.
>
> You could create a tree-like structure to quickly eliminate
> possibilities as you read successive letters. You read one
> character, follow the appropriate branch, check if there are
> any further branches, if not then no match and break. Else,
> read the next character and follow the appropriate branch and
> so on.... Infeasible for large (or even medium-sized)
> character-sets without hashing, but might be pretty fast for
> a-z and a large number of short strings.
Arguably, you'd do even better with:
foreach (...)
{
auto tmp = std.string.tolower(fext[0]);
switch(tmp)
{
case "doc", "docx":
case "xls", "xlsx":
case "ppt", "pptx":
continue;
default:
}
}
Since it gives the compiler more wiggle room to optimize things
as it wishes. For example, it *could* (who knows :D !) implement
the switch as a hash table, or a tree.
BTW, a very convenient "tree-like" structure that could be used
is a "heap": it is a basic binary tree, but stored inside an
array. You could build it during compilation, and then simply
search it.
A possible optimization is to first switch on string length:
foreach (...)
{
auto tmp = std.string.tolower(fext[0]);
switch(tmp.length)
{
case 3:
switch(tmp)
{
case "doc", "xls", "ppt":
continue;
default:
}
break;
case 4:
switch(tmp)
{
case "docx", "xlsx", "pptx":
continue;
default:
}
break;
default:
}
}
That said, I'm not even sure this would be faster, so a bench
would be called for. Further more, I'd really be tempted to say
that at this point, we are in the realm of premature optimization.
More information about the Digitalmars-d-learn
mailing list