Top 5
Benji Smith
dlanguage at benjismith.net
Sat Oct 11 11:29:51 PDT 2008
Sascha Katzner wrote:
> Benji Smith wrote:
>> Actually, when it comes to string processing, D is decidedly *not* a
>> "performance language".
>>
>> Compared to...say...Java (which gets a bum rap around here for being
>> slow), D is nothing special when it comes to string processing speed.
>>
>> I've attached a couple of benchmarks, implemented in both Java and D
>> (the "shakespeare.txt" file I'm benchmarking against is from the
>> Gutenberg project. It's about 5 MB, and you can grab it from here:
>> http://www.gutenberg.org/dirs/etext94/shaks12.txt )
>>
>> In some of those benchmarks, D is slightly faster. In some of them,
>> Java is a lot faster. Overall, on my machine, the D code runs in about
>> 12.5 seconds, and the Java code runs in about 2.5 seconds.
>>
>> Keep in mind, all Java characters are two bytes wide. And you can't
>> access a character directly. You have to retrieve it from the String
>> object, using the charAt() method. And splitting a string creates a
>> new object for every fragment.
>>
>> I admire the goal in D to be a performance language, but it drives me
>> crazy when people use performance as justification for an inferior
>> design, when other languages that use the superior design also
>> accomplish superior performance.
>
> I think your benchmark is not very meaningful. Without going into
> implementation details of Tango (because I don't use Tango) here are
> some notes:
>
> - The D version uses UTF8 strings whereas the Java version uses
> "wanna-be" UTF16 (Java has a lot of problems with surrogates). This
> means you are comparing apples with pears (D has to *parse* a UTF8
> string and Java simply uses a wchar array without proper surrogate
> handling in *many* cases).
>
> - At least in runCharIterateTest() you additionally convert the D UTF8
> string into a UTF32 string; in the Java version you did not
> do this.
>
> - The StringBuilder in the Java version is *much* faster because it
> doesn't have to allocate a new memory block in each step. You can use a
> similar class in D too, without the need of a special string class/object.
>
> ...
>
> LLAP,
> Sascha
Nonsense!
The benchmark is valid because I use the best string processing tools
that each language provides. If D had anything like a StringBuilder, I
would use it. If D had any way of iterating over the characters in a
string without converting them to UTF-32, I'd use that too.
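
For illustration, the two Java idioms mentioned above look roughly like the sketch below. This is not the attached benchmark code; the class name StringIdioms and the methods countLetters and joinWords are hypothetical, chosen only for this example. Note that charAt() returns UTF-16 code units, so characters outside the Basic Multilingual Plane would be seen as two surrogate halves.

```java
public class StringIdioms {
    // Hypothetical helper: walk the string one UTF-16 code unit at a time
    // with charAt() and count the units that are letters.
    static int countLetters(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isLetter(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // Hypothetical helper: rebuild a string from fragments with a
    // StringBuilder, which appends into one growable buffer instead of
    // creating a new String for every concatenation.
    static String joinWords(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(word);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = "To be, or not to be";
        // split() creates a new String object for every fragment.
        String[] words = line.split("[ ,]+");
        System.out.println(countLetters(line));
        System.out.println(joinWords(words));
    }
}
```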
People argue that D string processing uses these funky idioms for
performance reasons, and that using a more elegant design, with objects
and polymorphism, would be hopelessly slow. I'm just showing that those
idioms don't actually provide the performance that people claim.
--benji
More information about the Digitalmars-d mailing list