Probabilistic search - psearch.tar.gz ~~ possible proof of wrongness

Wed Apr 5 08:42:13 PDT 2006

Hasan Aljudy wrote:
> 
> and here's the D code, according to what I understood of the algorithm.
> if something is wrong in my understanding, please point me to it.
> 
> ---------
> import std.stdio;
> import std.random;
> 
> template psearch(T)
> {
>     bool psearch( T[] arr, T element )
>     {
>         if( arr.length == 1 )
>         {
>             if( arr[0] == element ) return true;
>             else return false;
>         }
> 
>         int choice = rand() % 2;
> 
>         T[] first, second;
>         int middle = arr.length / 2;
> 
>         if( choice == 1 )
>         {
>             first = arr[0..middle];
>             second = arr[middle..$];
>         }
>         else
>         {
>             first = arr[middle..$];
>             second = arr[0..middle];
>         }
> 
>         if( !psearch( first, element ) )
>         {
>             return psearch( second, element );
>         }
>         return false;
>     }
> }

Thanks for posting this.  I hadn't taken the time to look at the 
algorithm yet, and this got me interested :-)  The algorithm is 
obviously O(N) for single-element searches, and equivalent in 
theoretical complexity to the naive approach for substring searches.  In 
practical terms it's even worse because of the N recursive function 
calls, but it could always be rewritten as a loop with some bookkeeping.

It's obvious that the algorithm favors certain pattern positions in the 
search string, ie. (1/4)N, (3/4)N, and so on, while traditional searches 
typically perform best if the pattern is near the beginning of the 
string.  Though this raises an interesting point--the probabilistic 
search above is not a "find first" or "find last" algorithm.  Probably 
not a big deal for unordered sets, but it might be otherwise.

Sean