cashew.utils.array 0.1a.2
Mikola Lysenko
mclysenk at mtu.edu
Thu Sep 14 08:44:04 PDT 2006
Chris Nicholson-Sauls wrote:
> /***********************************************************************************
> * Return an array composed of common elements of two arrays.
> */
> T[] intersect (T) (inout T[] alpha, inout T[] beta)
> body {
> T[] result;
>
> foreach (x; alpha) {
> if (beta.contains(x)) {
> result ~= x;
> }
> }
> return result;
> }
> unittest {
> writef("\t.intersect(T[]) ----> ");
> auto foo = array!(int)(1, 2, 3, 4, 5 );
> auto bar = array!(int)( 3, 4, 5, 6, 7);
> assert(foo.intersect(bar) == array!(int)(3, 4, 5));
> writefln("Pass");
> }
>
> /***********************************************************************************
> * Remove duplicate values from an array.
> */
> void unique (T) (inout T[] haystack)
> body {
> size_t ridx ;
>
> for (int i = 0; i < haystack.length; i++) {
> ridx = size_t.max;
> while ((ridx = haystack.rindexOf(haystack[i], ridx)) != i) {
> haystack.removeIndex(ridx);
> }
> }
> }
> unittest {
> writef("\t.unique() ----------> ");
> auto foo = array!(int)(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5);
> foo.unique();
> assert(foo == array!(int)(1, 2, 3, 4, 5));
> writefln("Pass");
> }
>
These algorithms can be implemented more efficiently. The running time
for the intersect is O(n * m) (where n,m are the length of the two
arrays). However, what about the case where alpha and beta are ordered?
Here is a simple algorithm for getting the intersect:
T[] intersect (T) (inout T[] alpha, inout T[] beta)
body
{
/* Assume alpha, beta are ordered such that alpha[i] < alpha[i+1]
* and beta[i] < beta[i+1] for all i.
*/
T[] result;
size_t a_pos = 0;
size_t b_pos = 0;
while(a_pos < alpha.length && b_pos < beta.length)
{
if(alpha[a_pos] == beta[b_pos])
{
result ~= alpha[a_pos];
++a_pos;
++b_pos;
}
else if(alpha[a_pos] < beta[b_pos])
++a_pos;
else if(beta[b_pos] < alpha[a_pos])
++b_pos;
}
return result;
}
The cost for this algorithm is only O(n + m). However, ordering the two
arrays is effectively an O(nlogn + mlogm) operation. Therefore the
running time for this modified algorithm is overall O(nlogn + mlogm). A
similar argument can be made for the unique function.
More information about the Digitalmars-d-announce
mailing list