Probabilistic search - psearch.tar.gz
Andrew Fedoniouk
news at terrainformatica.com
Tue Apr 4 13:29:00 PDT 2006
"Rioshin an'Harthen" <rharth75 at hotmail.com> wrote in message
news:e0ugi0$229d$1 at digitaldaemon.com...
> "Sean Kelly" <sean at f4.ca> wrote in message
> news:e0ud2e$1tvn$1 at digitaldaemon.com...
>> John Demme wrote:
>>> Andrew Fedoniouk wrote:
>>>
>>>> "Sean Kelly" <sean at f4.ca> wrote in message
>>>> news:e0u9gl$1ppd$1 at digitaldaemon.com...
>>>>> Oskar Linde wrote:
>>>>>> I'm not sure if this is intended as a joke or not, but I am sorry to
>>>>>> tell that your code is not logarithmic. It is easy to prove that
>>>>>> there
>>>>>> are no better algorithm than O(n) for searching an unordered array on
>>>>>> a
>>>>>> deterministic single thread.
>>>>> Perhaps, but some algorithms have a better average case than others
>>>>> assuming certain (typical) conditions. If there were simply no point,
>>>>> why all the bother back in the 60s?
>>>>>
>>>>> By the way, as I just woke up, I haven't had a chance to look at the
>>>>> algorithm yet.
>>>>>
>>>> Take your time, no rush, it is impossible anyway.
>>>> To find something in unordered set requires N comparison operations at
>>>> least. Always.
>>>>
>>>> As a morning tea reading I suggest to read US Patent 5,533,051 on
>>>> compression of random data.
>>>> It claims that it will compress two arbitrary bits into one bit.
>>>>
>>>> http://gailly.net/05533051.html
>>>>
>>>>
>>>> Andrew Fedoniouk.
>>>> http://terrainformatica.com
>>>
>>> Come now... I'm sure he's talking about the average case, not the worst
>>> case.
>>
>> I don't think it matters. No matter the algorithm, you must examine
>> every element of the search string. It's just that typical searches are
>> actually worse than O(N), as they often examine each element more than
>> once.
>
> You have to examine every element of the search string? How come I haven't
> heard that? :)
>
> Try (approximately - my C(++) experience shows too much):
>
> char *needleInHaystack(char *haystack, char needle)
> {
> if( !haystack )
> return null;
>
> for( char *loc = haystack; *loc != '\0'; loc++ )
> if( *loc == needle )
> return loc;
>
> return null;
> }
>
> This definitely does not examine every element of the string, except in
> the worst-case scenario - just enough to find the first occurence, no
> matter what the string is or what character the needle is.
>
Lets assume that here unordered set (string) has flat distribution of
element(chars) occurences.
You can cry, pray, read C++ books, whatever you like, but it will take you
N/2 comparisons average
and in the worst case N comparisons to find first occurence of something
there.
The only thing which affect N/2 average is the quality of your random
generator (flatness of distribution).
And this quality is usual reason of "wonders" in such discoveries.
Andrew Fedoniouk.
http://terrainformatica.com
More information about the Digitalmars-d
mailing list