Dicebot on leaving D: It is anarchy driven development in all its glory.

Chris wendlec at tcd.ie
Thu Sep 6 13:09:57 UTC 2018


On Thursday, 6 September 2018 at 11:43:31 UTC, ag0aep6g wrote:
>
> You say that D users shouldn't need a '"Unicode license" before 
> they do anything with strings'. And you say that Python 3 gets 
> it right (or maybe less wrong than D).
>
> But here we see that Python requires a similar amount of 
> Unicode knowledge. Without your Unicode license, you couldn't 
> make sense of `len` giving different results for two strings 
> that look the same.
>
> So both D and Python require a Unicode license. But on top of 
> that, D also requires an auto-decoding license. You need to 
> know that `string` is both a range of code points and an array 
> of code units. And you need to know that `.length` belongs to 
> the array side, not the range side. Once you know that (and 
> more), things start making sense in D.

You'll need some basic knowledge of Unicode if you deal with 
strings, that's for sure. But you don't need a "license", and it 
certainly shouldn't be used as an excuse for D's confusing nature 
when it comes to strings. Unicode is confusing enough, so you 
don't need to add another layer of complexity to confuse users 
further. And you most certainly shouldn't blame the user for 
being confused. AFAIK, there's no warning label with an 
accompanying user manual for string handling.
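
For reference, the Python 3 behaviour mentioned in the quote (`len` 
giving different results for two strings that look the same) can be 
reproduced with a minimal sketch like the one below. The variable 
names are only illustrative; the two strings are the precomposed and 
the decomposed spelling of "á".

```python
import unicodedata

# Two strings that render identically but differ in code points.
composed = "\u00e1"      # 'á' as a single precomposed code point
decomposed = "a\u0301"   # 'a' followed by COMBINING ACUTE ACCENT

# Python 3 `len` counts code points, so the results differ.
print(len(composed))     # 1
print(len(decomposed))   # 2

# The strings compare unequal until they are normalized to one form.
print(composed == decomposed)                                 # False
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
```

So the user still has to know about normalization forms, but there 
is only one notion of length to keep in mind.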

> My point is: D doesn't require more Unicode knowledge than 
> Python. But D's auto-decoding gives `string` a dual nature, and 
> that can certainly be confusing. It's part of why everybody 
> dislikes auto-decoding.

D should be clear about it. I think it's too late for `string` to 
change its behavior (i.e. to make "á".length == 1). If you want to 
change `string`'s behavior now, maybe a compiler switch would be 
an option for the transition period, e.g. -autodecode=off.
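
To make the two competing counts concrete without relying on D 
itself, here is a rough Python sketch. It assumes, as described in 
the quote, that D's `.length` on a `string` counts UTF-8 code units 
while walking the string as a range counts code points. The helper 
names are hypothetical and exist only for this illustration.

```python
def count_code_units(s: str) -> int:
    """Number of UTF-8 code units (bytes): the array-side length."""
    return len(s.encode("utf-8"))


def count_code_points(s: str) -> int:
    """Number of Unicode code points: the range-side length."""
    return len(s)


precomposed = "\u00e1"   # 'á' as one code point, two UTF-8 bytes
combining = "a\u0301"    # 'a' plus combining accent, three UTF-8 bytes

print(count_code_units(precomposed), count_code_points(precomposed))  # 2 1
print(count_code_units(combining), count_code_points(combining))      # 3 2
```

Neither count is 1 for the combining spelling, so a string type that 
"behaves like one would expect" would have to count user-perceived 
characters, which is a third notion again.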

Maybe a new type of string could be introduced that behaves like 
one would expect, say `ustring` for correct Unicode handling. Or 
`string` does that and you introduce a new type for high 
performance tasks (`rawstring` would unfortunately be confusing).

The thing is that even basic things like string handling are so 
complicated and flawed that I don't want to use D for any future 
projects, and I don't have the time to wait until it gets fixed 
one day, if it ever gets fixed, that is. Nor does it seem to be a 
priority compared to other things that are maybe less important 
for production. But at least I'm wiser after this thread, since 
it has been made clear that things are not gonna change soon, at 
least not soon enough for me.

This is why I'll file for D-vorce :) Will it be difficult? Maybe 
at the beginning, but it will make things easier in the long run. 
And at the end of the day, if you have to fix and rewrite parts 
of your code again and again due to frequent language changes, 
you might as well port it to a different PL altogether. But I 
have no hard feelings, it's a practical decision I had to make 
based on pros and cons.

[snip]



