Surprised by hashes of arrays of arrays

Jens Mueller jens.k.mueller at gmx.de
Thu Feb 27 14:20:05 PST 2014


Shammah Chancellor wrote:
> On 2014-02-27 10:11:53 +0000, Jens Mueller said:
> 
> >Dear list,
> >
> >I stumbled over odd behavior which took quite some time of debugging.
> >Sharing my results may help find a solution or just make others aware
> >and reduce their debugging time.
> >
> >To illustrate consider the code:
> >
> >auto array1 = [ [1, 2], [3, 4] ];
> >auto array2 = array1.dup;
> >array2[0] = array2[0].dup;
> >
> >Does either hash(&array1) == hash(&array2) hold or hash(&array1) !=
> >hash(&array2) where hash is defined as
> >auto hash = &(typeid(array1).getHash);
> >?
> >
> >So I may be totally off here (please tell me), but it turns out that
> >that both arrays have different hashes due to the way hashing is
> >implemented. This, e.g., implies that equal arrays may be hashed to
> >different values. Hence when you are using them as keys in an
> >associative array results may be surprising.
> >Note that I can make the problem more difficult to spot by using a
> >struct that uses pointers or an array internally which is hidden by
> >encapsulation.
> >
> >I find the behavior non-obvious but maybe there is reason. The current
> >implementation considers only the direct contents of the struct's memory
> >(the memory starting at array.ptr to array.length * size(array[0]) in
> >case of dynamic arrays) and does *not* follow indirections (e.g. via
> >pointers).
> >
> >This implies that a hash can be computed without considering all memory
> >occupied by a value. In my current mental model of hashing I assumed
> >that all bytes should be read by default.
> >
> >As you may conclude from this post I find this behavior odd. I expect a
> >hash implementation to follow indirections by calling the hashing
> >functions recursively. It looks to me as a case where the default is
> >badly designed/implemented.
> >
> >Jens
> 
> You should get the same answer if in both cases they were static
> arrays, but I believe array1.dup returns a slice to a heap allocated
> array instead of the statically-sized array you originally had.

The original array is a dynamic array. But interesting that it is
obvious that the hashes are not the same. This was really surprising to
me, probably still is.

Jens


More information about the Digitalmars-d mailing list