d word counting approach performs well but has higher mem usage
dwdv
dwdv at posteo.de
Sun Nov 4 18:51:07 UTC 2018
>> Assoc array allocations?
>
> Yup. AAs do keep their memory around (supposedly for reuse). [...]
> Why it consumes so much is a question to the implementation.
> [...] I guess built-in AAs just love to hoard.
What a darn shame. This way I'm missing out on all those slick internet
benchmark points. :)
Guess I have to find a workaround then, since it's swapping memory like
crazy on larger real-world inputs on my fairly low-end machine.
>> What did I do wrong?
>
> Well, you didn't actually put the keys into the AA ;) I'm guessing you
> didn't look closely at the output, otherwise you would've noticed that
> something was wrong.
> [...]
> ```
> Error: associative arrays can only be assigned values with immutable
> keys, not char[]
> ```
> [...]
> But when you iterate later, pretty much every key is in fact a reference
> to some older memory, which is still somewhere on the GC heap; you don't
> get a segfault, but neither do you get correct "words".
>
You are absolutely right. I dismissed the aforementioned error without a
second thought: I threw a const declaration at it, and as soon as the
compiler stopped complaining I moved on. Oh well, I should have read the
assoc array spec page more thoroughly, since it contains this very snippet:
```
auto word = line[wordStart .. wordEnd];
++dictionary[word.idup]; // increment count for word
```
And yes, using more irregular log files as input instead of the
concatenated uniform dict reveals quite a bit of nonsense that is being
printed to stdout.
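For anyone finding this thread later, here is a minimal sketch of why the
`.idup` matters. It is not my original program: `countWords` is a made-up
helper, the reused `buffer` is only a stand-in for a line buffer that gets
overwritten on every iteration, and splitting on whitespace via
`std.array.split` is an assumption about what counts as a word.

```d
import std.array : split;

// Hypothetical helper: counts whitespace-separated words.
// Each input line is copied into one shared buffer first, to imitate a
// reader that overwrites its line buffer on every iteration.
size_t[string] countWords(const string[] lines)
{
    size_t[string] counts;
    char[] buffer; // reused for every line (assumption, see above)

    foreach (line; lines)
    {
        buffer.length = line.length;
        buffer[] = line[]; // overwrite the previous line's contents

        // Each word is a char[] slice into buffer, so it has to be
        // copied into immutable memory before it can serve as a key.
        foreach (word; buffer.split)
            ++counts[word.idup];
    }
    return counts;
}

void main()
{
    // A slice is only a view: once the buffer changes, so does the "word".
    char[] buf = "alpha".dup;
    const(char)[] view = buf;
    string copy = view.idup; // independent immutable copy
    buf[] = "omega";
    assert(view == "omega"); // the view follows the buffer
    assert(copy == "alpha"); // the copy keeps the original word

    auto counts = countWords(["the cat", "the dog"]);
    assert(counts["the"] == 2);
    assert(counts["cat"] == 1);
    assert(counts["dog"] == 1);
    assert(counts.length == 3);
}
```

The price is one allocation per word occurrence, which presumably does not
help with the memory usage discussed above, but at least the keys are real
words.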
Thank you, Stanislav, for taking the time to explain all this.