Build in associated array (DIctionary) needs some enhancements

An Pham home at home.com
Sun Mar 30 16:30:34 UTC 2025


1. No function to create with desired capability
The initial size is default to 8 and increased of power of 2. 
There are two problems with this scheme. To populate to size of 
few thousands, there are too many memory relocations and rebuild 
the list. The other problem is wasted of unused slots; example a 
list of 2_100, the list will have 4_096 of slots

2. Cascade of collision
If hash causes a collision, it will use the subsequent slot as an 
overflow. However, if a next item hash points to that subsequent 
slot and it is already occupied by that overflow item, hence the 
cascade collision

3. Duplicate hash calculation
For a key that fits into register, size_t, there is no hash 
calculation but when it is added into the list, it will apply 
final MurmurHash2; for a key size bigger than size_t, it will do 
hash calculation with applied final MurmurHah2 and later added to 
the list, it will apply MurmurHash2 again. There are two problems 
with this scheme, the first one is obvious un-needed calculation; 
the other is if an integer key value fitted into list 
length/capability, applied MurmurHash2 potentially causes hash 
collision

Some suggestions
1. Use prime number length/capability to avoid wasted of unused 
slots
2. Add a way to initialize list with length/capability
3. A way to avoid final MurmurHash2 or get rid of it completely 
and add function/delegate to customized hash calculation
4. Use different scheme to handle collision such as plain array 
to hold key-value pair and hash list with integer index pointed 
to key-value pair. This scheme has two advantages; key-value pair 
will have integer index for collision and associated list will 
have order of items added when do iteration (possible use in 
json...); The dis-advantage is indirect access and it will take 
more work when doing deletion. However, deletion is rarely used 
and not a big deal for this use case

Happy codings



More information about the Digitalmars-d mailing list