# [OT] n-way union

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue May 26 08:15:57 PDT 2009

```Jason House wrote:
> Andrei Alexandrescu Wrote:
>
>> Don wrote:
>>> Andrei Alexandrescu wrote:
>>>> I think you can't do better. Complexity is O(n) at a minimum
>>>> because you need to go through each element. The heap-based
>>>> algorithm has O(n log m). What you could improve on is the log
>>>> m factor (m = the number of sets).
>>> Depending on what "going through each element" means. Certainly,
>>> if you need to move them, you can't beat O(n) moves, but you can
>>> definitely get O(m*m) < < O(n) comparisons in the best case I
>>> described. And that would also be the number of splicing
>>> operations.
>> Oh, I see. So yes, comparisons can be reduced. So how exactly do
>> you think that can be done? By binary searching the upper bound of
>> the head to find a long run of identical elements?
>>
>>
>> Andrei
>
> This conversation made me realize a simple optimization (forgive me
> if it's what Don was implying)
>
> Instead of storing m ranges in a heap, store only m-1 ranges. The
> range that would be the min if the heap is kept separate. Each pop
> operation will compare the new front with the min heap element. This
> allows avoiding the re-heaping when the next element comes from the
> same range. If s is the number of times the source range must switch,
> then the performance is O( s*log(m) + n - s). The initial heaping is
> O(m*log(m)), but that's never worse than O(s*log(m)).  Technically,
> the re-heaping steps are log(m-1), but that isn't important with big
> O.

Cool! That can actually be implemented as a micro-optimization with the
classic heap: right after you extract from the top (arr[0]), you look at
its children (which are arr[1] and arr[2]). If both front()'s are still
no less than yours, everything stays as it was and you're home free.

I'll implement this, thanks.

Andrei

```