d word counting approach performs well but has higher mem usage

dwdv dwdv at posteo.de
Sun Nov 4 18:51:07 UTC 2018


>> Assoc array allocations?
> 
> Yup. AAs do keep their memory around (supposedly for reuse). [...] 
> Why it consumes so much is a question to the implementation.
> [...] I guess built-in AAs just love to hoard.

What a darn shame. This way I'm missing out on all those slick internet 
benchmark points. :)

Guess I have to find a workaround then, since it's swapping memory like 
crazy on larger real-world inputs on my fairly low-end machine.

>> What did I do wrong?
> 
> Well, you didn't actually put the keys into the AA ;) I'm guessing you 
> didn't look closely at the output, otherwise you would've noticed that 
> something was wrong.
> [...]
> ```
> Error: associative arrays can only be assigned values with immutable 
> keys, not char[]
> ```
> [...]
> But when you iterate later, pretty much every key is in fact a reference 
> to some older memory, which is still somewhere on the GC heap; you don't 
> get a segfault, but neither do you get correct "words".
> 

You are absolutely right. I dismissed the aforementioned error without a 
second thought as soon as throwing a const declaration at it made the 
compiler stop complaining. Oh well, I should have read the assoc array spec 
page more thoroughly, since it contains this very snippet:

```
auto word = line[wordStart .. wordEnd];
++dictionary[word.idup];   // increment count for word
```

And yes, using more irregular log files as input instead of the 
concatenated uniform dict reveals quite a bit of nonsense that is being 
printed to stdout.
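
For reference, here is a minimal sketch of the corrected counting loop. It is 
not my actual program: `countWords` and `buffer` are made-up names, and the 
reused `char[]` buffer only imitates a line buffer that gets overwritten on 
every iteration, which I assume is what happens with `File.byLine`.

```d
import std.algorithm.iteration : splitter;
import std.stdio : writeln;

/// Hypothetical helper: counts space-separated words in `lines`.
size_t[string] countWords(string[] lines)
{
    size_t[string] dictionary;
    char[] buffer;                    // one mutable buffer, reused for every line

    foreach (line; lines)
    {
        buffer.length = line.length;
        buffer[] = line[];            // overwrite the contents of the previous line

        foreach (word; buffer.splitter(' '))
        {
            if (word.length == 0)
                continue;             // skip empty pieces from repeated spaces

            // `word` is a char[] slice into `buffer`. Using it directly as a
            // key is rejected by the compiler; .idup gives the AA its own
            // immutable copy that later overwrites cannot touch.
            ++dictionary[word.idup];
        }
    }
    return dictionary;
}

void main()
{
    auto counts = countWords(["foo bar foo", "baz bar qux"]);
    assert(counts["foo"] == 2);
    assert(counts["bar"] == 2);
    writeln(counts);                  // AA iteration order is unspecified
}
```

The `.idup` call allocates a new string for every word, including words that 
are already in the dictionary, so this trades some extra allocations for 
correct keys.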

Thank you, Stanislav, for taking the time to explain all this.

