An interesting data structure with search time O(sqrt n)

Dmitry Olshansky via Digitalmars-d digitalmars-d at puremagic.com
Mon Nov 30 12:29:43 PST 2015


On 30-Nov-2015 23:13, Andrei Alexandrescu wrote:
> Okasaki's book is a continued inspiration of data structures and
> algorithms.
>
> I was thinking along the following lines: typical collections are
> searchable in linear time. Then we have highly structured collections
> that feature logarithmic search time. But there seems to be nothing in
> between. So I was thinking, what would be a data structure that allows
> O(sqrt n) search times?
>
> After a little tinkering, here's what I came up with.
>
> Start with a simple array of data. Then mentally decompose that array
> into a concatenation of smaller arrays: first has size 1, second has
> size 4, third has size 9, then 16, 25, 36, ... Generally the size of
> these imaginary subarrays grows quadratically. And they're adjacent to
> each other. The last array may be incomplete.
>
> Example: we decompose an array of size 35 into: an array of size 1
> followed by arrays of sizes 4, 9, 16, 5 (last being an incomplete
> fragment of an array of size 25).
>
> Now each of these arrays we organize as a max heap. Moreover, we arrange
> data such that the maximums of these heaps are in INCREASING order. That
> means the smallest element of the entire (initial) array is at the first
> position, then followed by the next 4 smallest organized in a max heap,
> followed by the next 9 smallest organized in a max heap, ... etc. There
> are a total of O(sqrt n) heaps, and each has O(sqrt n) elements.
>
> That's the layout! Now, to search we look at one heap at a time.
> Whenever the maximum element (first element in the subarray) is smaller
> than the searched value, we skip over that entire heap and go to the
> next one. In the worst case we'll skip O(sqrt n) heaps. When the max
> value in a heap is less than the searched element, we found the heap and
> we run a linear search among O(sqrt n) elements.

Reminds me of Van Emde Boas layout which is however fractal in nature: 
sqrt(N) pieces each having sqrt(N) element are subdivided again into 
sqrt(sqrt(N)) pieces and so on.

Not sure if you have seen, but see also cache-oblivious data-structures:
http://erikdemaine.org/papers/BRICS2002/paper.pdf


-- 
Dmitry Olshansky


More information about the Digitalmars-d mailing list