Help with an algorithm!

Ivan Kazmenko via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Jun 15 14:00:32 PDT 2017


On Thursday, 15 June 2017 at 13:41:07 UTC, MGW wrote:
> On Thursday, 15 June 2017 at 13:16:24 UTC, CRAIG DILLABAUGH 
> wrote:
>
> The purpose - search of changes in file system.
> Sorting is a slow operation as well as hashing. Creation of a 
> tree, is equally in sorting.
> So far the best result:
>
> foreach(str; m2) {
> 	bool fFind;	int j;
> 	foreach(int i, s; m1) {
> 		if(str == s) { fFind = true; j = i; break; }
> 	}
> 	if(!fFind) { rez ~= str; }
> 	else       { 	m1[j] = m1[$-1]; m1.length = m1.length - 1; 	}
> }

Ugh.  This can work as slow as length-of-m1 *multiplied* by 
length-of-m2.  For 5,000,000 strings, it is 5,000,000 * 5,000,000 
= 25,000,000,000,000.  Granted, if you run it very often, the 
arrays are almost equal, and it's closer to linear.  But once 
there are substantial changes between two consecutive runs, this 
approach is seriously screwed.

Sorting would work in length-of-array * log(length-of-array).  
For 5,000,000 strings, it is 5,000,000 * 23 = 115,000,000.  This 
is ~217,391 times better than your method above.  May be a bit 
slower because of long common prefixes.  Anyway, a couple of 
seconds at most.  How fast you need it to be?  Did you actually 
try it?

Ivan Kazmenko.



More information about the Digitalmars-d-learn mailing list