Publicity (was: Re: Why is 2.0 in the works already?)

David B. Held dheld at codelogicconsulting.com
Fri Jun 29 00:00:13 PDT 2007


Lars Ivar Igesund wrote:
> Anders Bergh wrote:
> 
>> On 6/19/07, Lars Ivar Igesund <larsivar at igesund.net> wrote:
>>> The only reason to worry about TIOBE, is that high rankings may boost
>>> knowledge about D. As is rather obvious by looking at the list, D's
>>> numbers are most likely highly inflated, and a reason for this is
>>> suggested at
>>>
>>> http://cdsmith.wordpress.com/2007/06/18/is-tiobe-fatally-flawed/
>> I just read that post, and scrolling down to the comments makes the
>> post less notable. Google apparently cuts results at 1000 results, and
>> removes duplicates, shrinking the numbers which caused his search
>> results to be even more flawed than TIOBE's.
> 
> Yes, his ranking was definitely wrong too, but that doesn't necessarily
> make TIOBE's more correct :) If TIOBE uses Google, the same argument would
> affect them too in some form.

TIOBE's rankings are certainly suspect, but all the hoopla about Google 
is just wrong.  Google does not remove "duplicate" hits, because it does 
not index duplicate pages in the first place; a search engine that did 
would be a stupid one. 
Instead, it removes pages that look like they came from the same site 
and possibly the same area of a site, and thus, may not present 
interesting new information to the user.  A simple example is searching 
for a term that happens to be on the footer of a bunch of pages on a 
site.  The hits are not "dupes", but they aren't interesting, either.

Clearly, assuming that every result set has fewer than 1000 hits is just 
silly, and the blogger should have known better.  The estimated Google 
hit counts are 
probably accurate within an order of magnitude, based on various 
searches I've done where I compared the initial hit count to what Google 
says after I've forced it to do an exact count (by visiting all the 
pages).  So the TIOBE page counts are probably fairly reasonable.  What 
is not reasonable is any interpretation of those results that mentions 
"popularity", "buzz", "community", or "zeitgeist".  Even less reasonable 
is any assumption that languages near the top of the list are "better" 
than those not near the top for anything but a narrow and specific 
definition of "better".

Dave



More information about the Digitalmars-d mailing list