improving the join function

Daniel Gibson metalcaedes at gmail.com
Mon Oct 11 19:08:47 PDT 2010


Philippe Sigaud wrote:
> On Tue, Oct 12, 2010 at 03:28, Simen kjaeraas <simen.kjaras at gmail.com> wrote:
>> Daniel Gibson <metalcaedes at gmail.com> wrote:
>>
>>> (*) Something like
>>> Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,
>>> BinaryPredicate!(T1, T2) joinPred)
>>> just pseudo-code, I'm not really familiar with D2 and std.algorithm.
>>> The idea is you have a Range r1 with elements of type T1, a Range r2 with
>>> elements of type T2 and a predicate that gets a T1 value and a T2 value and
>>> returns bool if they match and in that case a Tuple with those two values is
>>> part of the Range that is returned.
>> Once again I see the combinatorial range in the background. Man, why does
>> this have to be so hard?
>>
>> That is, your join could be implemented as follows, given the
>> combinatorial product range combine:
>>
>>
>> auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( isForwardRange, R
>> ) ) {
>>    return filter!fun( combine( ranges ) );
>> }
> 
> And IIRC, there is a difference between outer join, inner join and
> some other versions.
> So
> 
> filter!fun(zip(ranges))
> 
> (that is, filtering in parallel) is also a possibility. I should read
> up again on DB joins.
> There is also the need for creating a range of ranges on this one
> (aka, tensor product, but that scares people when I say that)
> Anyway, that's derailing the thread, so I'll stop now.

zip doesn't work here because it doesn't create the combinatorial (Cartesian) product[1] that 
is (logically) the foundation of a join[2]. It only pairs the first element of range one with the 
first element of range two, ..., the i-th element of range one with the i-th element of range two, 
and so on. Elements at different positions are never compared, so filtering a zip can miss matches.
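
A minimal D sketch of that difference, assuming a recent Phobos in which std.algorithm provides 
cartesianProduct (used here in place of the "combine" range from the quoted code). The name 
innerJoin is hypothetical, not an existing Phobos function:

```d
import std.algorithm : canFind, cartesianProduct, filter;
import std.array : array;
import std.range : isForwardRange, zip;
import std.typecons : tuple;

// Hypothetical helper (not part of Phobos): inner join of two ranges.
// Builds the Cartesian product lazily and keeps the pairs accepted by pred.
auto innerJoin(alias pred, R1, R2)(R1 r1, R2 r2)
    if (isForwardRange!R1 && isForwardRange!R2)
{
    return cartesianProduct(r1, r2).filter!(t => pred(t[0], t[1]));
}

void main()
{
    auto ids   = [1, 2, 3];
    auto codes = [3, 1, 4];

    // zip only sees the positional pairs (1,3), (2,1), (3,4): no match.
    auto viaZip = zip(ids, codes).filter!(t => t[0] == t[1]).array;
    assert(viaZip.length == 0);

    // The Cartesian product sees all 9 pairs, so (1,1) and (3,3) are found.
    auto joined = innerJoin!((a, b) => a == b)(ids, codes).array;
    assert(joined.length == 2);
    assert(joined.canFind(tuple(1, 1)));
    assert(joined.canFind(tuple(3, 3)));
}
```

The asserts check membership rather than position, so the sketch does not rely on the order in 
which the product is enumerated.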

An inner join is the "normal" join: only pairs that satisfy the predicate are returned. An outer 
join additionally keeps a to-be-joined element that has no "partner" in the other set (range): it is 
included in the output anyway, with the partner being a NULL value. (This can be done for the first 
range, the second, or both, i.e. a left, right or full outer join.)
A natural join is like an inner join but has no explicit predicate; the implicit predicate is that 
(in database tables) columns with equal names have to contain equal values. So natural joins are 
rather uninteresting for ranges, I guess.
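
For illustration, a left outer join could be sketched in D roughly as follows. This is an eager, 
array-based sketch under assumptions: leftOuterJoin is a hypothetical name (nothing like it is 
claimed to exist in Phobos), and std.typecons.Nullable stands in for the database NULL:

```d
import std.typecons : Nullable, Tuple;

// Hypothetical helper: left outer join over arrays.
// Every element of `left` appears at least once in the result; if no
// element of `right` matches it, its partner is a null Nullable.
auto leftOuterJoin(alias pred, T1, T2)(T1[] left, T2[] right)
{
    alias Row = Tuple!(T1, Nullable!T2);
    Row[] rows;
    foreach (l; left)
    {
        bool matched = false;
        foreach (r; right)
        {
            if (pred(l, r))
            {
                rows ~= Row(l, Nullable!T2(r));
                matched = true;
            }
        }
        // No partner found: keep the left element with a NULL partner.
        if (!matched)
            rows ~= Row(l, Nullable!T2.init);
    }
    return rows;
}

void main()
{
    auto rows = leftOuterJoin!((a, b) => a == b)([1, 2, 3], [3, 1, 4]);
    assert(rows.length == 3);
    assert(rows[0][0] == 1 && rows[0][1].get == 1);
    assert(rows[1][0] == 2 && rows[1][1].isNull);   // 2 has no partner
    assert(rows[2][0] == 3 && rows[2][1].get == 3);
}
```

A right outer join is the same idea with the roles of the two inputs swapped; a full outer join 
would also have to emit the unmatched elements of the second range.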


[1] http://en.wikipedia.org/wiki/Cartesian_product // I called this cross product before, but "cross 
product" seems to be normally used for something else
[2] http://en.wikipedia.org/wiki/Join_%28relational_algebra%29#Joins_and_join-like_operators


More information about the Digitalmars-d mailing list