How to use sets in D?

Siarhei Siamashka siarhei.siamashka at gmail.com
Thu Feb 10 00:53:08 UTC 2022


On Wednesday, 9 February 2022 at 21:05:47 UTC, Siarhei Siamashka 
wrote:
> Is the current implementation of associative arrays in D 
> language resistant to Denial of Service hash collision attacks?

Answering to myself. No, it isn't. Here's a simple example:

```D
import std, core.time;

const ulong n = 100000;

// Count the number of unique elements using associative array
ulong count_unique_values_assoc(const ulong[] a)
{
     bool[ulong] set;
     foreach (x ; a)
         set[x] = true;
     return set.length;
}

// Count the number of unique elements using redBlackTree
ulong count_unique_values_rbtree(const ulong[] a)
{
     auto set = redBlackTree!("a<b", false, ulong)();
     foreach (x ; a)
         set.insert(x);
     return set.length;
}

// Generate array, which triggers hash collisions in the
// current D language associative array implementation
ulong[] generate_bad_array(ulong n)
{
     return iota(n).map!(x => (x << 46)).array;
}

// Benchmark associative array vs. rbtree
void run_benchmark(ulong[] a)
{
     auto t1 = MonoTime.currTime;
     auto result_assoc = a.count_unique_values_assoc;
     auto t2 = MonoTime.currTime;
     auto result_rbtree = a.count_unique_values_rbtree;
     auto t3 = MonoTime.currTime;

     assert(result_assoc == result_rbtree);
     writefln("n=%d, unique_elements=%d, assoc time: %d ms, rbtree 
time: %d ms",
              a.length, result_assoc,
              (t2 - t1).total!"msecs",
              (t3 - t2).total!"msecs");
}

void main()
{
     assert([1UL, 1UL, 2UL].count_unique_values_assoc == 2);
     assert([1UL, 1UL, 2UL].count_unique_values_rbtree == 2);

     writefln("== consecutive numbers ==");
     n.iota.array.run_benchmark;

     writefln("\n== random numbers between 0 and 99 ==");
     n.iota.map!(_ => uniform(0UL, 100UL)).array.run_benchmark;

     writefln("\n== adversarially constructed array ==");
     n.generate_bad_array.run_benchmark;
}
```

Benchmark results using a 64-bit LDC compiler:

```
== consecutive numbers ==
n=100000, unique_elements=100000, assoc time: 11 ms, rbtree time: 
20 ms

== random numbers between 0 and 99 ==
n=100000, unique_elements=100, assoc time: 2 ms, rbtree time: 4 ms

== adversarially constructed array ==
n=100000, unique_elements=100000, assoc time: 22626 ms, rbtree 
time: 15 ms
```


More information about the Digitalmars-d-learn mailing list