Probabilistic search - psearch.tar.gz

Andrew Fedoniouk news at terrainformatica.com
Tue Apr 4 13:29:00 PDT 2006


"Rioshin an'Harthen" <rharth75 at hotmail.com> wrote in message 
news:e0ugi0$229d$1 at digitaldaemon.com...
> "Sean Kelly" <sean at f4.ca> wrote in message 
> news:e0ud2e$1tvn$1 at digitaldaemon.com...
>> John Demme wrote:
>>> Andrew Fedoniouk wrote:
>>>
>>>> "Sean Kelly" <sean at f4.ca> wrote in message
>>>> news:e0u9gl$1ppd$1 at digitaldaemon.com...
>>>>> Oskar Linde wrote:
>>>>>> I'm not sure if this is intended as a joke or not, but I am sorry to
>>>>>> tell that your code is not logarithmic. It is easy to prove that 
>>>>>> there
>>>>>> are no better algorithm than O(n) for searching an unordered array on 
>>>>>> a
>>>>>> deterministic single thread.
>>>>> Perhaps, but some algorithms have a better average case than others
>>>>> assuming certain (typical) conditions.  If there were simply no point,
>>>>> why all the bother back in the 60s?
>>>>>
>>>>> By the way, as I just woke up, I haven't had a chance to look at the
>>>>> algorithm yet.
>>>>>
>>>> Take your time, no rush, it is impossible anyway.
>>>> To find something in unordered set requires N comparison operations at
>>>> least. Always.
>>>>
>>>> As a morning tea reading I suggest to read US Patent 5,533,051 on
>>>> compression of random data.
>>>> It claims that it will compress two arbitrary bits into one bit.
>>>>
>>>> http://gailly.net/05533051.html
>>>>
>>>>
>>>> Andrew Fedoniouk.
>>>> http://terrainformatica.com
>>>
>>> Come now... I'm sure he's talking about the average case, not the worst
>>> case.
>>
>> I don't think it matters.  No matter the algorithm, you must examine 
>> every element of the search string.  It's just that typical searches are 
>> actually worse than O(N), as they often examine each element more than 
>> once.
>
> You have to examine every element of the search string? How come I haven't 
> heard that? :)
>
> Try (approximately - my C(++) experience shows too much):
>
> char *needleInHaystack(char *haystack, char needle)
> {
>    if( !haystack )
>        return null;
>
>    for( char *loc = haystack; *loc != '\0'; loc++ )
>        if( *loc == needle )
>            return loc;
>
>    return null;
> }
>
> This definitely does not examine every element of the string, except in 
> the worst-case scenario - just enough to find the first occurence, no 
> matter what the string is or what character the needle is.
>

Lets assume that here unordered set (string) has flat distribution of 
element(chars) occurences.

You can cry, pray, read C++ books, whatever you like, but it will take you 
N/2 comparisons average
and in the worst case N comparisons to find first occurence of something 
there.

The only thing which affect N/2 average is the quality of your random 
generator (flatness of distribution).
And this quality is usual reason of "wonders" in such discoveries.

Andrew Fedoniouk.
http://terrainformatica.com















More information about the Digitalmars-d mailing list