Top 5

Benji Smith dlanguage at benjismith.net
Sat Oct 11 11:46:55 PDT 2008


Sascha Katzner wrote:
> Sergey Gromov wrote:
>> This is the whole point.  The benchmark is valid because it performs
>> the same *task*, and the task is somewhat close to real world.  It
>> measures *time*, which is universal.  The compared languages use
>> different approaches and techniques to achieve the goal, that's why
>> the benchmark is useful.  It lets you assess the usefulness of these
>> languages for a particular class of tasks.
> 
> My point was that the two programs do *not* perform the same task. The
> D version has to do a lot more because it accounts for multi-byte
> codepoints in UTF8, but the Java version doesn't account for surrogate
> pairs. I bet if you simply scanned byte-wise through the D UTF8 array for
> whitespace without converting to UTF32 it would perform even
> better, but that wouldn't be a fair comparison either. ;-)
> 
> It's as if you removed all runtime security checks and exception
> code from a program and benchmarked it against the original version... it
> simply doesn't make much sense. ;-)

And my whole point was that Java's design decision to always use 
two-byte characters is a superior choice, since performance is not an 
issue, and since having a single character type makes the programmer's 
life a helluva lot simpler.

The D design makes things pointlessly complex, and now you want brownie 
points for dealing with that pointless complexity?

And, btw, you *can't* scan bytewise through a D string to find whitespace
in general. The ASCII space (32) is safe, since every byte of a multi-byte
UTF-8 sequence is 128 or above, but non-ASCII whitespace like U+00A0 or
U+2028 takes several bytes. Any code that iterates bytewise through a
char[] array and treats each byte as a character is fundamentally broken.

But D's strings *look* like they can be iterated byte-by-byte, because 
they're arrays. And all other kinds of arrays in D can be iterated that 
way. You can't retrieve a long value from an int array, because it 
doesn't make sense. And it doesn't make sense to foreach through a 
collection of dchars in a char[] array.
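
To make the code-unit vs. character mismatch concrete, here is a minimal sketch in Java rather than D. The class and method names are made up for illustration and do not come from the benchmark. It counts the same three-character string in UTF-8 bytes (roughly what a D char[] holds), in UTF-16 code units (what Java's String.length() reports), and in code points (roughly what iterating by dchar yields).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class CodeUnitsVsCodePoints {
    // Number of UTF-8 code units (bytes), comparable to the length of a D char[].
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of UTF-16 code units, which is what String.length() reports.
    static int utf16Length(String s) {
        return s.length();
    }

    // Number of Unicode code points, comparable to iterating by dchar in D.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a', e-acute (U+00E9), and the musical G clef (U+1D11E, a surrogate pair).
        String s = "a\u00E9\uD834\uDD1E";
        System.out.println(utf8Length(s));     // 1 + 2 + 4 bytes = 7
        System.out.println(utf16Length(s));    // 1 + 1 + 2 code units = 4
        System.out.println(codePointCount(s)); // 3
    }
}
```

The three counts only agree for pure ASCII text, so any loop that treats one array element as one character has to pick which of these units it actually means.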

The purpose of this benchmark is not to show Java's speed advantage 
(because my primary concern with string processing is not speed). The 
purpose was to show that the speed justifications for D's wonky design 
are not valid.

D strings are a trainwreck not because of a few milliseconds of 
execution time. They're a trainwreck because they break the rules of the 
language.

--benji



More information about the Digitalmars-d mailing list