<div dir="ltr">On Tue, Sep 2, 2008 at 11:18 PM, <span dir="ltr"><<a href="mailto:d-bugmail@puremagic.com">d-bugmail@puremagic.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<a href="http://d.puremagic.com/issues/show_bug.cgi?id=2331" target="_blank">http://d.puremagic.com/issues/show_bug.cgi?id=2331</a><br>
<br>
Summary: Enum hashes many times slower than normal hashes<br>
Product: D<br>
Version: unspecified<br>
Platform: PC<br>
OS/Version: Linux<br>
Status: NEW<br>
Severity: normal<br>
Priority: P2<br>
Component: DMD<br>
AssignedTo: <a href="mailto:bugzilla@digitalmars.com">bugzilla@digitalmars.com</a><br>
ReportedBy: <a href="mailto:andrei@metalanguage.com">andrei@metalanguage.com</a><br>
<br>
<br>
This is quite incredible. Consider this map of stopwords for the English<br>
language:<br>
<br>
bool[string] stopWords;<br>
<br>
static this() {<br>
stopWords = [<br>
"a"[]:1, "b":1, "c":1, "d":1, "e":1, "f":1, "g":1, "h":1, "i":1, "j":1,<br>
"k":1, "l":1, "m":1, "n":1, "o":1, "p":1, "q":1, "r":1, "s":1, "t":1, "u":1,<br>
"v":1, "w":1, "x":1, "y":1, "z":1,<br>
"the":1, "a":1,<br>
"about":1, "above":1, "across":1, "afterwards":1, "after":1, "again":1,<br>
"against":1, "ago":1, "almost":1, "along":1,<br>
"already":1, "always":1, "among":1, "anywhere":1, "around":1, "as":1,<br>
"at":1, "away":1,<br>
"back":1, "before":1, "behind":1, "beside":1, "between":1, "beyond":1,<br>
"by":1,<br>
"down":1, "downstairs":1, "during":1,<br>
"else":1, "enough":1, "every":1, "everywhere":1,<br>
"far":1, "for":1, "from":1,<br>
"here":1,<br>
"in":1, "inside":1, "into":1,<br>
"just":1,<br>
"last":1, "lot":1, "lots":1,<br>
"many":1, "middle":1, "much":1,<br>
"near":1, "next":1, "never":1, "not":1, "now":1, "nowhere":1,<br>
"of":1, "off":1, "often":1, "on":1, "once":1, "over":1, "outside":1,<br>
"over":1,<br>
"quite":1,<br>
"rather":1, "recently":1, "rarely":1, "round":1,<br>
"seldom":1, "sometimes":1, "somewhere":1,<br>
"there":1, "through":1, "till":1, "today":1, "to":1, "tomorrow":1,<br>
"tonight":1, "too":1, "towards":1,<br>
"until":1, "up":1, "upstairs":1, "usually":1, "under":1,<br>
"very":1,<br>
"while":1, "with":1, "within":1, "without":1,<br>
"yes":1, "yesterday":1, "yet":1,<br>
"you":1, "he":1, "she":1, "it":1, "we":1, "they":1,<br>
"me":1, "him":1, "her":1, "it":1, "us":1, "them":1,<br>
"myself":1, "yourself":1, "himself":1, "herself":1, "itself":1,<br>
"ourselves":1, "yourselves":1,<br>
"themselves":1, "oneself":1,<br>
"my":1, "your":1, "its":1, "our":1, "their":1,<br>
"mine":1, "yours":1, "hers":1, "ours":1, "theirs":1,<br>
"this":1, "these":1, "those":1,<br>
"some":1, "any":1, "no":1, "none":1,<br>
"other":1, "another":1, "every":1, "all":1, "others":1, "each":1,<br>
"whole":1, "both":1,<br>
"neither":1, "none":1,<br>
"someone":1, "somebody":1, "something":1,<br>
"anyone":1, "anybody":1, "anything":1,<br>
"nobody":1, "nothing":1,<br>
"everyone":1, "everybody":1, "everything":1,<br>
"and":1, "or":1, "but":1, "because":1, "if":1,<br>
"as":1, "like":1, "such":1,<br>
"how":1, "who":1, "why":1, "what":1, "where":1, "whose":1, "when":1,<br>
"whom":1, "which":1,<br>
"bye":1, "hello":1,<br>
"be":1, "was":1, "been":1, "am":1, "is":1, "are":1, "were":1,<br>
"can":1, "could":1,<br>
"come":1, "came":1, "comes":1,<br>
"do":1, "did":1, "done":1, "does":1,<br>
"get":1, "got":1, "gets":1,<br>
"have":1, "had":1, "has":1, "having":1,<br>
"may":1, "might":1,<br>
"must":1,<br>
"shall":1, "should":1,<br>
"ought":1,<br>
"take":1, "took":1, "taken":1, "takes":1, "taking":1,<br>
"use":1, "uses":1, "used":1,<br>
"will":1, "would":1,<br>
"aren't":1,<br>
"cannot":1,<br>
"can't":1,<br>
"couldn't":1,<br>
"didn't":1,<br>
"doesn't":1,<br>
"don't":1,<br>
"wasn't":1,<br>
"wouldn't":1,<br>
"hadn't":1,<br>
"isn't":1,<br>
"one":1, "two":1, "three":1, "four":1, "five":1, "six":1, "seven":1,<br>
"eight":1, "nine":1, "ten":1, "nought":1, "zero":1,<br>
"first":1, "second":1, "third":1, "fourth":1, "fifth":1, "sixth":1,<br>
"seventh":1,<br>
"eighth":1, "ninth":1, "tenth":1,<br>
"ii":1, "iii":1, "iv":1, "vi":1, "vii":1, "viii":1, "ix":1,<br>
"sunday":1, "monday":1, "tuesday":1, "wednesday":1, "thursday":1,<br>
"friday":1, "saturday":1,<br>
"january":1, "february":1, "march":1, "april":1, "may":1, "june":1,<br>
"july":1, "august":1, "september":1, "october":1, "november":1, "december":1,<br>
"date":1, "false":1, "e.g":1, "eg":1, "i.e":1, "ie":1, "etc":1, "example":1,<br>
"examples":1,<br>
"jr":1, "miss":1, "thing":1, "things":1, "true":1, "year":1,<br>
"bad":1, "big":1, "close":1, "difficult":1, "easy":1, "empty":1, "full":1,<br>
"good":1,<br>
"little":1, "long":1, "ready":1, "open":1, "short":1, "tall":1<br>
];<br>
}<br>
<br>
The map can be used to help parse English text. Now say we make this<br>
change:<br>
<br>
enum bool[string] stopWords = [ ... same contents ... ];<br>
<br>
Amazingly enough, looking up this new map (with "word in stopWords") is many<br>
many times slower than looking up the old map. What's happening?<br>
<font color="#888888"><br>
<br>
--<br>
<br>
</font></blockquote></div><br>It may (and this is a big "may"; you'd have to look at the disassembly to know) be related to <a href="http://d.puremagic.com/issues/show_bug.cgi?id=2237">http://d.puremagic.com/issues/show_bug.cgi?id=2237</a>. In short, I had a constant array (in D1, the equivalent of an enum array in D2) which, for some reason, DMD was reconstructing from scratch upon _every access to it_ instead of placing it in the static data segment. It's possible that the compiler is making the same stupid mistake here.<br>
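<br>If that guess is right (and it is only a guess until someone checks the disassembly), one possible workaround is to keep the enum for the literal but copy it once into an ordinary module-level variable and do the lookups there. A minimal sketch in D2, using a shortened word list; the names stopWordsLiteral and isStopWord are made up for illustration and are not from the bug report:<br>

```d
// Manifest constant: the literal is substituted at each use site, so a
// lookup written directly against it may rebuild the table every time
// (assumption: this is the cause of the slowdown; unverified).
enum bool[string] stopWordsLiteral = ["the": true, "a": true, "of": true];

// Ordinary module-level variable, filled in exactly once per thread.
bool[string] stopWords;

static this()
{
    stopWords = stopWordsLiteral; // the table is constructed here, once
}

// Hypothetical helper: looks a word up in the table built above.
bool isStopWord(string word)
{
    return (word in stopWords) !is null;
}

void main()
{
    assert(isStopWord("the"));
    assert(!isStopWord("compiler"));
}
```

This is effectively the "static this()" version from the report with the literal factored out, so it only sidesteps the problem rather than explaining it.<br>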
</div>