Trying to reduce memory usage
Jon Degenhardt
jond at noreply.com
Fri Feb 19 00:13:19 UTC 2021
On Wednesday, 17 February 2021 at 04:10:24 UTC, tsbockman wrote:
> I spent some time experimenting with this problem, and here is
> the best solution I found, assuming that perfect de-duplication
> is required. (I'll put the code up on GitHub / dub if anyone
> wants to have a look.)
It would be interesting to see how the performance compares to
tsv-uniq
(https://github.com/eBay/tsv-utils/tree/master/tsv-uniq). The
prebuilt binaries turn on all the optimizations
(https://github.com/eBay/tsv-utils/releases).
tsv-uniq wasn't included in the different comparative benchmarks
I published, but I did run my own benchmarks and it holds up
well. That said, it should not be hard to beat. The more
interesting question is how large the delta is.
tsv-uniq is using the most straightforward approach of popping
things into an associative array. No custom data structures. Enough
memory is required to hold all the unique keys in memory, so it
won't handle arbitrarily large data sets. It would be interesting
to see how the straightforward approach compares with the more
highly tuned approach.
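
For reference, a minimal sketch of the associative-array approach,
assuming the whole line is the key. This is not the actual tsv-uniq
source, and `isFirstOccurrence` is a made-up helper name used only
for illustration:

```d
import std.stdio;

// Hypothetical helper (not from tsv-uniq): returns true the first time
// a line is seen and records it in the associative array.
bool isFirstOccurrence(ref bool[string] seen, const(char)[] line)
{
    if (line in seen)
        return false;
    // byLine reuses its buffer, so store an immutable copy as the key.
    seen[line.idup] = true;
    return true;
}

void main()
{
    // Associative array used as a set. Memory grows with the number
    // of unique lines, which is the limitation described above.
    bool[string] seen;
    foreach (line; stdin.byLine)
    {
        if (isFirstOccurrence(seen, line))
            writeln(line);
    }
}
```

The lookup uses the reused byLine buffer directly, and a copy is only
made when a line is new, so duplicate lines cost no extra allocation.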
--Jon