[OT] n-way union

Mon May 25 18:52:27 PDT 2009

Andrei Alexandrescu wrote:
> Georg Wrede wrote:
>> Andrei Alexandrescu wrote:
>>> You can assume that each array is sorted.
>>
>> Err, you didn't comment on my algorithm, at the end. So, either it is 
>> worthless to an extent not deserving even a dismissal, or you didn't 
>> read the rest of the post.
> 
> I am sorry, at the first two reads I did not understand your algorithm. 
> I noticed that it is more complicated than the canonical one

Canonical one. Righhhhhtttttt......

 From the outset I thought that you did have a solution in mind. And I 
thought that the pause of a couple of days before any seroious attemnts 
only meant that everybody else understood it, too. So, nobody wanted to 
spend 5 days inventing something which would only be dismissed "because 
Knuth did it better 50 years ago".

> and also it necessitates storage to hold the entire output.

I never said whether it'd be on stack or not. An array (or a linked 
list) of structs can be placed anywhere. Especially if you know (i.e. 
can easily count) the total number of items, you can allocate up front, 
be it on the heap or the stack.

> At the third read I did understand it, or at least I understood how
> you mean it to work. It has at least one bug here:
> 
> ========
> Sort the array of structs, and beginning with the first struct, pop from 
> the pointed-to subarray as long as it contains elements equal to the 
> first one. Then pop from the subarray that belongs to following struct, 
> and so on, until a subbarray's first element already is larger.
> ========
> 
> It would fail on the data:
> 
> double[][] a =
> [
>     [ 1, 1, 4, 7, 8 ],
>     [ 7 ],
> ];
> 
> You will pop all 1s and then you'll pop 7. You should be popping 4.

Sigh.

So you didn't read that part carefully. You also didn't read the 
following part, at all.

I admit, I relied too much on your (perceived, superior) ability, and 
thus described the algorithm top-level only.

My first thought was to actually write the whole thing in D2, but then I 
remembered that (as an old man), I code things at most at 10% the speed 
of some of the young and strong here. I didn't want to spend 2 days on 
something I would demand of myself of doing in 2 hours. Hence the 
pseudocode.

My algorithm would pop both the 1's. Then it would see that the 4 is 
bigger and give up. It would then go to a[0+1], and find nothing there 
(i.e. 7 is larger than 1), so it'd consider this round done.

Now, where I didn't "follow description" was, my algorithm was 
insensitive to there being several instances of the same value in a 
subarray. (My mistake, the "1, 1, 1, 4, 4, 7, 7, 7, 7, 8, 8" thing led 
me astray.) BUT IF YOU LOOK CAREFULLY  at my code, you'll see that it 
actually doens't make a difference. Why????

   Because, having popped a number of items from
   a number of sub-arrays, the sub-arrays aren't
   in any order at all. So, there arises a need
   to do some sorting.

INCIDENTALLY, which means, "You will pop all 1s and then you'll pop 7" 
*definitely* means you didn't read my post carefully.

No biggie, my university Pascal teacher did the same thing. Since I'm a 
slow coder, I used to both present my algorithms in "regular peoples' 
language", and my math proofs in writing, instead of a lot of hairy 
equations. Both guys gave me 0's for a number of things, and then I had 
to set up an appointment to see both to explain and clarify. (The one 
thing I remember as a killer was about my thesis (STL), which I 
submitted, then went to Spain for 3 weeks, returned on an overnight 
flight with no sleep at all, and at 8am went to see my score, and found 
myself disqualified. I went straight to the professor and confronted 
him. He explained that this assignment meant one had to prove the case 
for a small value, prove it for the next value, then proove that if it 
was valid for both, then it would have to be valid for the one following 
them. When I said that "this is exactly what I've done here", he went 
silet for 10 minutes, then blushed, and gave me a "barely accepted score"...

Sure, my first idea was to invent and quote real D2 code. But I thought 
that being a "regular" programmer, /everything/ in the code would be 
*rewritten* anyway, so why spend 4 days writing something I can describe 
while I'm f**ing reading the assignment.

Then it turns out I get omitted the first time, the second time I'm not 
understood, the 3rd time I'm not understood either, and the 4th time I'm 
misunderstood.

Were it not for D2, I'd suggest the reader ease up a bit. Save his and 
the audience's time. Getting it (err, reading it) right the first time 
(even if reading at half the speed) is much more efficient (for 
"himself", not to speak of the entire community and their combined 
wasted effort) than spawning a whole tree of response-subtrees, which 
(in the worst case) he has to answer individually. (So everybody loses.)

===================================

Which all means, if somebody else actually did read my description, and 
did implement it, Andrei'd have a hard time beating their code.

(As an additional hint, to save extra speed, one wouldn't wait till the 
turn is over, and then sort hen poppedItems, but instead, insert-sort 
them while iterating over the structs list.)

At the same time, one could keep a tab on the next "least-value" while 
"popping", so as not to have to compare to integers between the current 
and the next actual poppable value.