The Computer Languages Shootout Game
sybrandy
sybrandy at gmail.com
Tue Nov 2 11:07:44 PDT 2010
> It doesn't matter whether they try to confirm the measurements for themselves or not - what matters is that they are provided with all the information required to do so.
>
>
> I only have 5 years experience publishing the measurements for the benchmarks game - and I've come across a handful of people who did try to confirm the measurements for themselves.
>
> (The most interesting example compared a couple of language implementations on one particular task but measured at 2 dozen different input values. That nicely demonstrated that the same language implementation wasn't always faster across all the input values. The 3 different input values shown on the benchmarks game usually aren't enough to demonstrate that kind of thing.)
>
That's an interesting observation. I didn't even think of that before,
but it does make sense.
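
To make that point concrete, here is a rough sketch of timing two
implementations of the same task across two dozen input values instead
of three. Everything in it is my own assumption for illustration (the
toy task, the function names, the input sizes); it is not how the
benchmarks game measures anything.

```python
import time


def sum_loop(n):
    """Toy task, variant 1: explicit loop (hypothetical example)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_builtin(n):
    """Toy task, variant 2: generator fed to the built-in sum()."""
    return sum(i * i for i in range(n))


def best_time(func, n, repeats=3):
    """Return the best wall-clock time of `repeats` runs of func(n)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(n)
        best = min(best, time.perf_counter() - start)
    return best


def compare(impls, sizes, repeats=3):
    """Map each input size to the name of the fastest implementation."""
    winners = {}
    for n in sizes:
        timings = {name: best_time(f, n, repeats) for name, f in impls.items()}
        winners[n] = min(timings, key=timings.get)
    return winners


if __name__ == "__main__":
    impls = {"loop": sum_loop, "builtin": sum_builtin}
    sizes = list(range(1_000, 25_000, 1_000))  # two dozen input values
    for n, name in compare(impls, sizes).items():
        print(f"n={n:6d}  fastest: {name}")
```

If the winner changes from one size to the next, three sample points
could easily miss it.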

I debated posting this, but I figured it couldn't hurt: the biggest
problem I have with the benchmarks they use is that, at least from my
perspective, they're not all very common algorithms. Some things I'd
love to see are B-Trees (which are common in databases), encryption,
compression, etc. These are widely used and would therefore provide
more useful comparisons. Even MapReduce would be good since that's
becoming very popular.

Taking it a step further, there need to be well-defined standard
implementations and alternative implementations. The standard
implementations would be straightforward designs that don't use any
trickery, so that we can actually compare language implementations.
The alternative ones would then show how you can make the
implementations faster. I mention this because a buddy of mine
submitted a C version of one benchmark but implemented his own thread
pooling code. It was rejected even though the C++ version used Boost,
which also, from what I'm told, uses thread pooling. A standard
implementation could be used to define whether things like thread
pooling can/should be used. I'd argue not in this case, as not every
language supports and/or requires it (Erlang, for example).
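
As a rough illustration of the standard/alternative split, here is a
sketch of the same task written both ways. The names and the toy
workload are made up for this example, and this is not a rule the
benchmarks game actually applies: the "standard" version starts one
thread per task with no pooling, and the "alternative" version is
allowed to use a thread pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def work(n):
    """Toy workload standing in for one benchmark task (hypothetical)."""
    return sum(range(n))


def standard_version(inputs):
    """Straightforward version: one thread per task, no pooling."""
    results = [None] * len(inputs)

    def run(i, n):
        results[i] = work(n)

    threads = [
        threading.Thread(target=run, args=(i, n))
        for i, n in enumerate(inputs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def alternative_version(inputs, workers=4):
    """Tuned version: a fixed-size thread pool is reused across tasks."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, inputs))
```

Both versions have to produce the same results; the idea would be to
report their timings separately, so the tuned one never gets compared
against another language's straightforward one.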

Of course, these are all just ideas that I'm not going to try to
implement, as it would be too much work and I don't have the resources
to do it right. Even then, how do we make it truly fair and accurate?
Based on what I've seen in this thread, it's a pretty hard problem if
even the input data can affect a language's performance.

Casey