[OT] n-way union

Mon May 25 14:41:47 PDT 2009

Andrei Alexandrescu wrote:
> Georg Wrede wrote:
>> Andrei Alexandrescu wrote:
>>> This is somewhat OT but I think it's an interesting problem. Consider 
>>> the following data:
>>>
>>> double[][] a =
>>> [
>>>     [ 1, 4, 7, 8 ],
>>>     [ 1, 7 ],
>>>     [ 1, 7, 8],
>>>     [ 4 ],
>>>     [ 7 ],
>>> ];
>>>
>>> We want to compute an n-way union, i.e., efficiently span all 
>>> elements in all arrays in a, in sorted order. You can assume that 
>>> each individual array in a is sorted. The output of n-way union 
>>> should be:
>>>
>>> auto witness = [
>>>     1, 1, 1, 4, 4, 7, 7, 7, 7, 8, 8
>>> ];
>>> assert(equal(nWayUnion(a), witness[]));
>>>
>>> The STL and std.algorithm have set_union that does that for two sets: 
>>> for example, set_union(a[0], a[1]) outputs [ 1, 1, 4, 7, 7, 8 ]. But 
>>> n-way unions poses additional challenges. What would be a fast 
>>> algorithm? (Imagine a could be a very large range of ranges).
>>>
>>> Needless to say, nWayUnion is a range :o).
>>
>> If we'd know anything about the data, such as, the max value is always 
>> smaller than the total number of elements in the subarrays, then we'd 
>> probably more easily invent a decent algorithm.
>>
>> But the totally general algorithm has to be more inefficient. And 
>> constructing (not worst-case, but) tough-case data is trivial. For 
>> example, take a thousand subarrays, each a thousand elements long, 
>> containing random uints from the inclusive range 0..uint.max.
> 
> You can assume that each array is sorted.

Err, you didn't comment on my algorithm, at the end. So, either it is 
worthless to an extent not deserving even a dismissal, or you didn't 
read the rest of the post.