Probabilistic search - psearch.tar.gz

Georg Wrede georg.wrede at nospam.org
Tue Apr 4 15:47:01 PDT 2006


Point 1: we do have to allow for the different expression of content and 
intent that arises from writers, who are from a fundamentally different 
cultural background. (Possibly this is made even harder if such a person 
uses some automatic translation between languages.)

So, I vote for understanding, meditation and deep thought, before 
dismissing the ideas presented.

Point 2: there are several "test cases" for algorithms. One is "worst 
case", another is "truly random data", another is "best case", yet 
another is "average case", and finally we have "plausible data". (There 
are others, I know.)

Since _no_ interesting data is truely random (because by definition, 
such data does not contain Information!), we can dismiss the "truly 
random data" case.

Now, depending on the application domain, it may or may not be 
significant what the "worst case" performance is. Similarly, "best case" 
usually doesn't count, but it's weight has to be evaluated on a case by 
case basis.

At this point we are between the "average case" and "plausible data". 
Here we actually can find surprising results, if we look hard enough. 
For all practical purposes, the algorithm which performs best _given_ 
"actual like" data, would be the one to choose. (Assuming we are even 
able to present such "actual like" data in the first place.)

---

In summary, instead of quibbling over theoretical merits of this 
particular method, we might do better by trying to evaluate which kind 
of data this algorithm may perform best with, and thereafter compare it 
to those generally suggested for that kind of data.


sundaresh wrote:
> FYI, there is absolutely nothing wrong with that statement. It is 
> certainly not a bug, if there is one, and I am not ruling out that 
> possibility.comparisons is never decremented or decreased, so how can
> it be reduced to the value of 9, if it increases above 9, as you
> claim/assume that it supposedly does. The value of comparisons is set
> prior to any/all recursive calls to psearch. If on the other hand,
> you claim that it is possibly the characteristic of the random number
> generator, since it makes the decision of which half to search for
> first, but then again, that decision is binary and I think that any
> random number generator should generate as many even numbers as odd
> numbers, at least for a reasonably sized sample.
> 
> I am not one to contest theory, and if this algorithm is correct,
> then there probably is a very valid and sound theoretical explanation
> as to why it works, but I am not one to reject evidence either, in
> the face of it, especially of such an astounding nature. It is well
> acknowledged that empirical testing of methods lead to a convincing
> degree of assurance in them and is a reasonable way to assess their
> reliability.
> 
> My apologies if I have irked you.
> 
> To have an open mind, while looking at it is all I asked for.
> 
> The reason I posted it here was because I was not sure myself.
> 
> 
> 
> In article <e0u7jg$1n3i$1 at digitaldaemon.com>, Oskar Linde says...
> 
>> sundaresh skrev:
>> 
>>> I was told to post here by walter. I did think of posting it in
>>> the C newgroup but felt otherwise, since D is apparently upward
>>> compatible with C, or the specs in the site told me, so I
>>> presumed that every C program was a D program. As far as proof
>>> goes, I have included a test program. The results clearly 
>>> indicate that for an array of size 512, of random numbers, with
>>> 2048 trials of different distibutions, and for each trial upto
>>> 2048 random selections of elements in the array to lookup, the
>>> number of comparisons is a constant at 9 or lg(512). Please run
>>> the test program. If it is a bug and you can point it out, I
>>> would be indebted. But that was why I posted it here. If it is'nt
>>> a bug, I was hoping it would be my contribution.
>>> 
>>> I was hoping for an enlightened response.
>>> 
>>> You can lie all you want, but that does'nt change the truth. The
>>> sun does rise in the east. There are only 4 seasons in a year,
>>> and the value of pie is a fixed.
>>> 
>>> I would encourage users to please try it out and respond. The
>>> proof of the pudding is in the eating.
>>> 
>>> Not many experts here walter.I am dissapointed.
>> 
>> I'm not sure if this is intended as a joke or not, but I am sorry
>> to tell that your code is not logarithmic. It is easy to prove that
>> there are no better algorithm than O(n) for searching an unordered
>> array on a deterministic single thread. I guess the error in your
>> code is the computation of the number of comparisons:
>> 
>> comparisons = ++depth;
>> 
>> Cheers,
>> 
>> /Oskar
> 
> 
> 



More information about the Digitalmars-d mailing list