Top 5
Benji Smith
dlanguage at benjismith.net
Sat Oct 11 11:46:55 PDT 2008
Sascha Katzner wrote:
> Sergey Gromov wrote:
>> This is the whole point. The benchmark is valid because it performs
>> the same *task*, and the task is somewhat close to real world. It
>> measures *time*, which is universal. The compared languages use
>> different approaches and techniques to achieve the goal, that's why
>> the benchmark is useful. It allows one to justify the usefulness of
>> these languages for a particular class of tasks.
>
> My point was that the two programs do *not* perform the same task. The
> D version has to do a lot more because it accounts for multi-byte
> codepoints in UTF8, but the Java version doesn't account for surrogate
> pairs. I bet if you simply scan byte-wise through the D UTF8 array for
> whitespace without converting it to UTF32 it would perform even
> better, but that wouldn't be a fair comparison either. ;-)
>
> It's as if you removed all runtime security checks and exception
> code from a program and benchmarked it against the original version... it
> simply doesn't make much sense. ;-)
And my whole point was that Java's design decision to always use
two-byte characters is a superior choice, since performance is not an
issue, and since having a single character type makes the programmer's
life a helluva lot simpler.
The D design makes things pointlessly complex, and now you want brownie
points for dealing with that pointless complexity?
And, btw, scanning bytewise through a D string to find space characters
only works because the value '32' is ASCII: UTF-8 never reuses a byte
below 128 inside a multi-byte character. Search for anything outside
ASCII (a no-break space, say) and code that iterates bytewise through a
char[] array is fundamentally broken.
But D's strings *look* like they can be iterated byte-by-byte, because
they're arrays. And all other kinds of arrays in D can be iterated that
way. You can't retrieve a long value from an int array, because it
doesn't make sense. And it doesn't make sense to foreach through a
collection of dchars in a char[] array.
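To make the mismatch concrete, here is a minimal sketch in Python (not D, and
not part of the benchmark). It treats a string's UTF-8 bytes as a stand-in for
the code units a D char[] holds, and its code points as a stand-in for dchars.
The helper names utf8_units and code_points are made up for this illustration.

```python
def utf8_units(text):
    """Return the UTF-8 code units (bytes) of text, one int per array slot.

    This models what a D char[] stores; it is an analogy, not D itself.
    """
    return list(text.encode("utf-8"))


def code_points(text):
    """Return the Unicode code points of text, one int per character.

    This models what iterating the same text by dchar would yield.
    """
    return [ord(ch) for ch in text]


# 'a', 'é' and '€' need 1, 2 and 3 bytes respectively in UTF-8.
sample = "a\u00e9\u20ac"

units = utf8_units(sample)
points = code_points(sample)

print(len(units), units)    # 6 [97, 195, 169, 226, 130, 172]
print(len(points), points)  # 3 [97, 233, 8364]
```

The three-character sample occupies six array slots, so slot i of the array is
not character i, which is the sense in which the array view and the character
view disagree.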
The purpose of this benchmark is not to show Java's speed advantage
(because my primary concern with string processing is not speed). The
purpose was to show that the speed justifications for D's wonky design
are not valid.
D strings are a trainwreck not because of a few milliseconds of
execution time. They're a trainwreck because they break the rules of the
language.
--benji