"Faster than Rust and C++: the PERFECT hash table"

Basile B. b2.temp at gmx.com
Fri Jan 5 17:51:29 UTC 2024


On Monday, 11 December 2023 at 12:37:56 UTC, Siarhei Siamashka 
wrote:
> On Monday, 11 December 2023 at 11:36:10 UTC, Basile.B wrote:
>> To fork on that subject... I was very 
>> "gperf-like-perfecthash-friendly" for years but nowadays I use 
>> [lookuptables](https://gitlab.com/styx-lang/styx/-/blob/master/src/styx/string_switch.sx?ref_type=heads). Very fast too. You have Trie or suffix arrays that are very fast too but they tend not to be memory friendly.
>
> If the whole set of the string identifiers to be parsed is 
> known at compile time, then another approach for mapping them 
> to numbers or whatever else is to use something like 
> https://re2c.org

Sorry for the inaccuracy. This is part of what a compiler uses to 
generate the code to switch over a string. This is not used at 
run-time.

Generated code is more like: https://godbolt.org/z/Y54bKeWYe

What is used at run-time is: 
https://gitlab.com/styx-lang/styx/-/blob/master/library/rtl.sx?ref_type=heads#L426

In fine you have decimated a string into a unique number and you 
just take the path of a normal switch. The decimation is not very 
different of hashing. Just it's guaranteed to give unique indexes 
for a set of unique strings.

I believe D uses a different strategy for that. Binary search 
IIRC. not quite sure.

Anyway just to clarify.


More information about the Digitalmars-d mailing list