Dicebot on leaving D: It is anarchy driven development in all its glory.
aliak
something at something.com
Thu Sep 6 21:15:59 UTC 2018
On Thursday, 6 September 2018 at 20:15:22 UTC, Jonathan M Davis
wrote:
> On Thursday, September 6, 2018 1:04:45 PM MDT aliak via
> Digitalmars-d wrote:
>> D makes the code-point case default and hence that becomes the
>> simplest to use. But unfortunately, the only thing I can think
>> of that requires code point representations is when dealing
>> specifically with unicode algorithms (normalization, etc).
>> Here's a good read on code points:
>> https://manishearth.github.io/blog/2017/01/14/stop-ascribing-meaning-to-unicode-code-points/
>>
>> tl;dr: application logic does not need or want to deal with
>> code points. For speed units work, and for correctness,
>> graphemes work.
>
> I think that it's pretty clear that code points are objectively
> the worst level to be the default. Unfortunately, changing it
> to _anything_ else is not going to be an easy feat at this
> point. But if we can first ensure that Phobos in general
> doesn't rely on it (i.e. in general, it can deal with ranges of
> char, wchar, dchar, or graphemes correctly rather than assuming
> that all ranges of characters are ranges of dchar), then maybe
> we can figure something out. Unfortunately, while some work has
> been done towards that, what's mostly happened is that folks
> have complained about auto-decoding without doing much to
> improve the current situation. There's a lot more to this than
> simply ripping out auto-decoding even if every D user on the
> planet agreed that outright breaking almost every existing D
> program to get rid of auto-decoding was worth it. But as with
> too many things around here, there's a lot more talking than
> working. And actually, as such, I should probably stop
> discussing this and go do something useful.
>
> - Jonathan M Davis
Is there a unittest somewhere in Phobos that you could point me
to that shows how the four variations you mention (ranges of
char, wchar, dchar, and graphemes) should be handled? Or maybe a
PR that did some of this work that I could look into?
I ask so I can see in code what it means to make something not
rely on auto-decoding and deal correctly with ranges of char,
wchar, dchar or graphemes.
Or is there maybe a current "easy" Bugzilla issue that I could
try my hand at?
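
For reference, here is a minimal sketch of the three levels being discussed (code units, code points, graphemes), as far as I understand them. The helper names countCodeUnits, countCodePoints and countGraphemes are made up for illustration and are not Phobos functions; byCodeUnit (std.utf), byGrapheme (std.uni) and walkLength (std.range) are the existing Phobos pieces. It is not taken from any Phobos unittest or PR.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

// Hypothetical helpers (not part of Phobos): each one makes the
// iteration level explicit instead of leaving it to the default.

// Code units: iterates the raw UTF-8 chars, no decoding at all.
size_t countCodeUnits(string s)
{
    return s.byCodeUnit.walkLength;
}

// Code points: relies on auto-decoding, i.e. the range primitives
// present a string as a range of dchar.
size_t countCodePoints(string s)
{
    return s.walkLength;
}

// Graphemes: user-perceived characters.
size_t countGraphemes(string s)
{
    return s.byGrapheme.walkLength;
}

unittest
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT:
    // one visible character, two code points, three UTF-8 code units.
    string s = "e\u0301";
    assert(countCodeUnits(s) == 3);
    assert(countCodePoints(s) == 2);
    assert(countGraphemes(s) == 1);
}
```

Compiling with -unittest -main should run the assertions. The same string gives three different answers depending on the level, which, if I read the quoted post right, is why a generic algorithm would ideally accept whatever range the caller hands it rather than assume dchar.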