Help optimize D solution to phone encoding problem: extremely slow performace.

Tue Jan 16 21:15:19 UTC 2024

On Tuesday, 16 January 2024 at 20:34:48 UTC, H. S. Teoh wrote:
> On Tue, Jan 16, 2024 at 12:28:49PM -0800, H. S. Teoh via 
> Digitalmars-d-learn wrote: [...]
>> Anyway, I've fixed the problem, now my program produces the 
>> exact same output as Renato's repo. Code is posted below.
> [...]
>

Great, I ran the benchmarks for you :)

I had to change how you accept arguments, even though you did 
"the right thing" using `getopt`, the solutions should just take 
a `count` or `print` argument first...

Anyway, here's your result:

```
===> ./rust
./rust,24133632,25
./rust,24739840,130
./rust,24477696,536
./rust,25247744,1064
./rust,8175616,6148
./rust,8306688,8315
===> src/d/dencoder
src/d/dencoder,46055424,43
src/d/dencoder,96337920,146
src/d/dencoder,102350848,542
src/d/dencoder,102268928,1032
src/d/dencoder,40206336,99936
^C
```

It took too long with the `count` option, so I had to abort 
before the last run ended... there's probably some bug there, 
otherwise the Trie runs very fast, as I had expected.

For the record (I already posted this on GitHub)... here's [my 
current fastest 
solution](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/dlang-key-hash-incremental/src/d/src/dencoder.d) time using the same algorithm as Rust (after I fixed my solution that was using `int128`, which was not valid, as @ssvb pointed out):

```
Proc,Run,Memory(bytes),Time(ms)
===> ./rust
./rust,23986176,24
./rust,24346624,118
./rust,24543232,525
./rust,24346624,1027
./rust,8011776,3830
./rust,8486912,18403
===> src/d/dencoder
src/d/dencoder,13615104,30
src/d/dencoder,24723456,374
src/d/dencoder,24592384,1750
src/d/dencoder,24788992,3472
src/d/dencoder,11517952,10235
src/d/dencoder,11517952,51211
```

Not great :(

But @ssvb came up with a really fast implementation, though 
totally different algorithm (so dont' take this as evidence of D 
being faster than Rust or anything like that):

```
Proc,Run,Memory(bytes),Time(ms)
===> ./rust
./rust,23986176,24
./rust,24461312,135
./rust,24494080,514
./rust,24526848,981
./rust,8175616,3647
./rust,8011776,15404
===> src/d/dencoder
src/d/dencoder,16433152,30
src/d/dencoder,16613376,125
src/d/dencoder,16613376,468
src/d/dencoder,16613376,941
src/d/dencoder,5570560,12
src/d/dencoder,6701056,18
```

I can't explain why it's so incredibly fast, specially for the 
`count` case. I tried using the same hashing function on my 
solution, but that didn't really help me!

Pretty cool to see different solutions to the problem, but I'm 
still curious to know where the solution I wrote is being slow 
compared to Rust which is using identical, to my understanding, 
code! I'm sure people could write incredibly fast code to solve 
this problem, but what I am really curious about is what the code 
I wrote is doing wrong that causes it to run 4x slower than Rust 
despite doing "the same thing"...