char array weirdness

H. S. Teoh via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Tue Mar 29 16:42:07 PDT 2016


On Tue, Mar 29, 2016 at 11:15:26PM +0000, Basile B. via Digitalmars-d-learn wrote:
> On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote:
> >void main () {
> >    import std.range.primitives;
> >    char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's'];
> >    pragma(msg, ElementEncodingType!(typeof(val)));
> >    pragma(msg, typeof(val.front));
> >}
> >
> >prints
> >
> >    char
> >    dchar
> >
> >Why?
> 
> I've seen you so many times as a reviewer on dlang that I believe this Q
> is a joke.
> Even if obviously nobody can know everything...
> 
> https://www.youtube.com/watch?v=l97MxTx0nzs
> 
> seriously you didn't know that auto decoding is on and that it gives
> you a dchar...

Believe it or not, it was only last year (IIRC, maybe the year before)
that Walter "discovered" that Phobos does autodecoding, and got pretty
upset over it.  If even Walter wasn't aware of this for that long...

I used to be in favor of autodecoding, but more and more, I'm seeing
that it was a bad choice.  It's a special case in how ranges normally
work, and this special case has caused a ripple of exceptional corner
cases to percolate throughout all Phobos code, leaving behind a
string(!) of bugs over the years that, certainly, eventually got
addressed, but nevertheless it shows that something didn't quite fit in.
It also left behind a trail of additional complexity to deal with these
special cases that made Phobos harder to understand and maintain.
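
To make the special case concrete, here is a small sketch (assuming a Phobos
version with autodecoding still in place) that compares an ordinary array with
a narrow string through the std.range.primitives traits. The variable name
val is taken from the original question; nothing here goes beyond what the
traits themselves report.

```d
import std.range.primitives : ElementEncodingType, ElementType,
                              hasLength, isRandomAccessRange;

void main()
{
    // An ordinary array is a random-access range over its own element type.
    static assert(is(ElementType!(int[]) == int));
    static assert(hasLength!(int[]));
    static assert(isRandomAccessRange!(int[]));

    // A narrow string is the exception: it is stored as char, but the range
    // API presents it as a range of dchar without length or random access.
    static assert(is(ElementEncodingType!(char[]) == char));
    static assert(is(ElementType!(char[]) == dchar));
    static assert(!hasLength!(char[]));
    static assert(!isRandomAccessRange!(char[]));

    // The array itself still has .length and indexing, and both work on
    // code units, so array view and range view disagree.
    char[] val = ['1', '0', 'h'];
    assert(val.length == 3);
    static assert(is(typeof(val[0]) == char));
}
```

Generic code that wants to treat char[] like any other array has to check for
this case explicitly, which is where the extra complexity comes from.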

It's a performance bottleneck for string-processing code, which is a
pity because D could have stood a chance to beat C/C++ at string
processing (C code has to call strlen and strdup extensively, while D
slices carry their length and can be taken without copying). But in
spite of this heavy price we *still* don't guarantee correctness. On the
spectrum of speed (don't decode at all) vs. correctness (segment by
graphemes, not by code units or code points) autodecoding lands in the
anemic middle where you get neither speed nor full correctness.
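
As a rough illustration of the three levels on that spectrum, the sketch below
counts the same string by code units, by code points (what autodecoding
gives you), and by graphemes. The helper names codeUnits, codePoints and
graphemes are made up for this example; byCodeUnit, byGrapheme and walkLength
are the Phobos functions.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

// Hypothetical helpers, named only for this example.

// No decoding at all: iterate the raw UTF-8 code units.
size_t codeUnits(string s) { return s.byCodeUnit.walkLength; }

// Default range behaviour for strings: autodecoded to dchar code points.
size_t codePoints(string s) { return s.walkLength; }

// Segmentation by grapheme clusters, i.e. user-perceived characters.
size_t graphemes(string s) { return s.byGrapheme.walkLength; }

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: displayed as one character.
    string s = "e\u0301";

    assert(codeUnits(s) == 3);  // 1 byte for 'e' + 2 bytes for U+0301
    assert(codePoints(s) == 2); // autodecoding sees two code points
    assert(graphemes(s) == 1);  // only this matches what the user sees
}
```

The middle answer costs a decode on every element and is still not the one a
user would call the "length" of the text.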

The saddest part of it all is that this is unlikely to change because
people have gotten so uptight about the specter of breaking existing
code, in spite of the repeated experiences of newbies (and
not-so-newbies like Walter himself!) wondering why strings have
ElementType == dchar instead of char, usually followed by concerns over
the performance overhead.


T

-- 
Designer clothes: how to cover less by paying more.

