Help optimize D solution to phone encoding problem: extremely slow performance.

Daniel Kozak kozzi11 at gmail.com
Sun Jan 21 17:11:00 UTC 2024


On Sat, 20 Jan 2024 at 21:21, Renato via Digitalmars-d-learn <
digitalmars-d-learn at puremagic.com> wrote:

> On Saturday, 20 January 2024 at 19:45:19 UTC, Daniel Kozak wrote:
> > On Sat, Jan 20, 2024 at 2:11 PM Renato via Digitalmars-d-learn
> > < digitalmars-d-learn at puremagic.com> wrote:
> >
> >> Wow, fantastic feedback from lots of people, thanks so much!
> >> ...
> >>
> >> > evilrat:
> >> > There is another important difference: I quickly looked up the D
> >> > associative array implementation, and the search looks like
> >> > nlog(n) complexity with plain loop iteration over hashes, whereas
> >> > Rust's implementation uses the "swisstable" algorithm, which has
> >> > packed SIMD-optimized lookups. This is likely where the 3x speed
> >> > difference comes from.
> >>
> >> I am not using the default hash implementation in Rust (which
> >> is hashbrown, as you've found out). That's too slow (because
> >> it's DDoS-secure - though Java's hash is too, and it's still
> >> faster). I had to replace it with a library called `ahash`.
> >>
> >> If you're interested in knowing more about that, please [read my blog
> >> post](https://renato.athaydes.com/posts/how-to-write-fast-rust-code)
> >> about optimising the Rust code.
> >>
> >> So, no, that's not where the difference is coming from... in
> >> fact, I am using a faster hashing function in D.
> >>
> >> You can [see my custom hashing function
> >> here](https://github.com/renatoathaydes/prechelt-phone-number-encoding/commit/d1fa1b77ad928f7d536728fba11f2d69b4afe7e3).
> >> This function is not good for the general purpose case, but it's
> >> probably as good as it gets for this problem (see my int128 trick in
> >> a previous post on this thread to understand why). I did try a few
> >> variations of hashing before settling on this one... Rust, if
> >> anything, is at a disadvantage here.
> >>
> >
> > And here you are wrong: it is the hashing where the slowdown
> > comes from. It is not only about the speed of the hashing
> > function. It is about the quality of the hashing function for
> > this particular problem, and about how the hash table
> > (associative array) is implemented.
>
> I've explained why this function is almost a perfect hash
> function for this problem (there will be almost no collisions
> except for very large inputs where the shift-left will
> "overflow", and even then the probability of collisions should be
> very low). If you're going to claim I'm wrong, you must show
> exactly where I'm wrong, and preferably you should provide a
> faster implementation. I will even accept a theoretical
> explanation if you can give one. What I will not accept is for
> you to call me "wrong" just because you say so. That's childish
> behaviour and uncalled for in a technical discussion.
>


If you provided info that the hash is perfect, then I am sorry; I have
probably missed that in this thread, my fault. Then I will need to look
into this more carefully and do much more than just make assumptions.

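For readers following along, here is a minimal, hypothetical sketch in D of
the idea under discussion: pack the phone number's digits into an integer by
shifting a few bits per digit, and let that packed value serve as the hash
itself. This is not the code from the linked commit (which uses the int128
trick mentioned above); the `DigitKey` struct, the 4-bit-per-digit packing,
and the plain `ulong` are assumptions made purely for illustration.

```d
// Hypothetical illustration only - NOT the implementation from the linked
// commit. It shows the general shape of a digit-packing key whose packed
// value doubles as the hash, so collisions only become possible once the
// shift-left "overflows" (here: after 16 digits in a 64-bit ulong).
import std.stdio : writeln;

struct DigitKey
{
    ulong packed;

    this(const(ubyte)[] digits) @safe pure nothrow @nogc
    {
        foreach (d; digits)
            packed = (packed << 4) | (d & 0xF); // 4 bits per decimal digit
    }

    // The packed value is already (almost) collision-free for this
    // problem, so it is used directly as the hash.
    size_t toHash() const @safe pure nothrow @nogc
    {
        return cast(size_t) packed;
    }

    bool opEquals(const DigitKey other) const @safe pure nothrow @nogc
    {
        return packed == other.packed;
    }
}

void main()
{
    // Dictionary from a digit sequence to the words that encode it,
    // as in the Prechelt phone-number encoding problem.
    string[][DigitKey] dict;
    dict[DigitKey([5, 6, 2, 4, 8, 2])] ~= "mir Tor";
    dict[DigitKey([5, 6, 2, 4, 8, 2])] ~= "Mix Tor";
    writeln(dict[DigitKey([5, 6, 2, 4, 8, 2])]); // ["mir Tor", "Mix Tor"]
}
```

Whether such a key is faster than hashing the digit string directly depends
on input length and on how the associative array handles the resulting
hash values, which is exactly the point being debated above.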

