What are the worst parts of D?
H. S. Teoh via Digitalmars-d
digitalmars-d at puremagic.com
Thu Sep 25 21:35:13 PDT 2014
On Fri, Sep 26, 2014 at 04:05:18AM +0000, Vladimir Panteleev via Digitalmars-d wrote:
> On Thursday, 25 September 2014 at 21:49:43 UTC, H. S. Teoh via Digitalmars-d
> wrote:
> >It's not just about performance.
>
> Something I recently realized: because of auto-decoding,
> std.algorithm.find("foo", 'o') cannot be implemented using memchr. I
> think this points to a huge design fail, performance-wise.
Well, if you really want to talk performance, we've already failed. Any
string operation that starts from a narrow string and ends with a narrow
string (of the same width) will incur the overhead of decoding /
reencoding *every single character*, even if it's mostly redundant.
What bugs me even more is the fact that every single Phobos algorithm
that might conceivably deal with characters has to be special-cased for
narrow strings in order to be performant. That's a mighty high price to
pay for what's a relatively small benefit -- note that autodecoding does
*not* guarantee Unicode correctness, even if, according to the argument
of some, it helps. So we're paying a high price in terms of performance
and code maintainability in Phobos, for the dubious benefit of only
partial Unicode conformance.
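
A minimal sketch of the decoding point, assuming a current Phobos: the range element type of a narrow string is dchar, so generic algorithms decode as they go, whereas the same bytes viewed through std.string.representation are searched code unit by code unit. The variable names are only illustrative, and this is not how Phobos special-cases strings internally.

```d
import std.algorithm : find;
import std.range : ElementType;
import std.string : representation;

void main()
{
    // As a range, a narrow string yields decoded code points, not chars.
    static assert(is(ElementType!string == dchar));

    string s = "foo";

    // Viewing the same memory as raw UTF-8 code units skips decoding.
    auto bytes = s.representation; // immutable(ubyte)[]
    static assert(is(ElementType!(typeof(bytes)) == immutable(ubyte)));

    // A plain element search over code units, which is the kind of
    // search a memchr-style implementation could serve.
    auto hit = bytes.find(cast(ubyte) 'o');
    assert(cast(string) hit == "oo");
}
```

Searching code units this way is only safe for ASCII needles or for whole, valid UTF-8 subsequences; that restriction is an assumption of the sketch.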
> There are also subtle correctness problems:
> haystack[0..haystack.countUntil(needle)] is wrong, even if it works
> right on ASCII input.
>
> For once I agree with Walter Bright - regarding the chair throwing :)
Not to mention that autodecoding *still* doesn't fix the following
problem:

	assert("á".canFind("á")); // fails

(Note: you may need to save this message verbatim and edit it into a D
source file to see this effect; cut-n-paste on some systems may erase
the effect.)

And the only way to fix this would be so prohibitively expensive, I
don't think even Andrei would agree to it. :-P
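
The same effect can be reproduced without relying on how the message was copied, by spelling out the two encodings with escapes. This is a sketch under the assumption that one operand is the precomposed U+00E1 and the other is 'a' followed by the combining acute accent U+0301; which side was which in the line above cannot be recovered from the text. One way to make the search succeed is to normalize both operands with std.uni.normalize, at the cost of an extra pass and allocation per string.

```d
import std.algorithm : canFind;
import std.uni;

void main()
{
    string precomposed = "\u00E1";  // single code point U+00E1
    string decomposed  = "a\u0301"; // 'a' plus combining acute U+0301

    // Autodecoding compares code points, so the two spellings of the
    // same visible character do not match.
    assert(!precomposed.canFind(decomposed));

    // After bringing both sides to the same normalization form (NFC
    // here), the search succeeds.
    assert(normalize!NFC(precomposed).canFind(normalize!NFC(decomposed)));
}
```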
So basically, we're paying (1) lower performance, (2) non-random access
for strings, (3) subtle distinction between index and count and other
such gotchas, and (4) tons of special-cased Phobos code with the
associated maintenance costs, all for incomplete Unicode correctness.
Doesn't seem like the benefit measures up to the cost. :-(
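
Point (3) can be sketched as follows, assuming a UTF-8 string containing non-ASCII characters: countUntil returns a count of code points, while slicing a string takes code unit indices, so mixing the two silently cuts the string in the wrong place. The sample string is only illustrative.

```d
import std.algorithm : countUntil, find;

void main()
{
    string haystack = "\u00E9t\u00E9!"; // "été!", 6 bytes in UTF-8

    // countUntil counts decoded code points before the needle.
    auto count = haystack.countUntil('!');
    assert(count == 3);
    assert(haystack.length == 6);

    // Using that count as a slice index takes 3 bytes, not 3 characters.
    assert(haystack[0 .. count] == "\u00E9t");

    // Deriving the index from the remainder returned by find works in
    // code units and gives the intended prefix.
    auto rest = haystack.find('!');
    assert(haystack[0 .. $ - rest.length] == "\u00E9t\u00E9");
}
```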
T
--
We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true. -- Robert Wilensky