The D Programming Language Vision Document
Dukc
ajieskola at gmail.com
Wed Jul 6 21:30:44 UTC 2022
On Sunday, 3 July 2022 at 20:16:35 UTC, Ola Fosheim Grøstad wrote:
> On Sunday, 3 July 2022 at 19:32:56 UTC, rikki cattermole wrote:
>> It is required for string equivalence comparisons (which is
>> what you should be doing in a LOT more cases!). Anything
>> user-provided should be normalized before it is compared.
>
> Well, I think it is reasonable for a protocol to require that
> the input is NFC, and just check it and reject it or call out
> to an external library to convert it into NFC.
>
> Anyway, UTF-8 is the only format that isn't affected by network
> byte order… So if you support more than UTF-8 then you have to
> support UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE…
It is pretty easy to convert those to native endian and back with
the functions in `std.bitmanip`. I recently did so to make a
program recognise files in all five of those encodings.
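As a minimal sketch of that kind of conversion (not the program mentioned above), the following decodes raw UTF-16 bytes of a known byte order into native-endian code units. `bigEndianToNative` and `littleEndianToNative` are existing `std.bitmanip` functions and `Endian` comes from `std.system`; the helper name `utf16ToNative` is hypothetical, and the sketch assumes the byte order is already known, for example from a byte order mark.

```d
import std.bitmanip : bigEndianToNative, littleEndianToNative;
import std.system : Endian;

// Hypothetical helper, not part of Phobos: turns raw UTF-16 bytes stored
// in the given byte order into native-endian UTF-16 code units.
// It only swaps bytes; it does not validate surrogate pairs.
wchar[] utf16ToNative(const(ubyte)[] bytes, Endian order)
{
    assert(bytes.length % 2 == 0, "UTF-16 data needs an even byte count");

    auto units = new wchar[](bytes.length / 2);
    foreach (i, ref unit; units)
    {
        // std.bitmanip works on fixed-size arrays, so copy one code unit.
        ubyte[2] pair = [bytes[2 * i], bytes[2 * i + 1]];
        unit = cast(wchar)(order == Endian.bigEndian
            ? bigEndianToNative!ushort(pair)
            : littleEndianToNative!ushort(pair));
    }
    return units;
}

unittest
{
    // "A" followed by the euro sign (U+20AC), in both byte orders.
    ubyte[] be = [0x00, 0x41, 0x20, 0xAC];
    ubyte[] le = [0x41, 0x00, 0xAC, 0x20];
    assert(utf16ToNative(be, Endian.bigEndian) == "A\u20AC"w);
    assert(utf16ToNative(le, Endian.littleEndian) == "A\u20AC"w);
}
```

The same pattern should carry over to UTF-32 with `uint` and four-byte arrays.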
Also, the Phobos functions are of high quality. They work
extremely well with the range API (other than having to live with
autodecoding), they are well documented and they are
comprehensive enough for almost any task. I don't recall ever
having considered another library for handling Unicode.
And I think there is still considerable value in handling UTF-16
strings because that's what many other languages use. With the
current vision, Phobos V2 won't handle UTF-16 in place. We'll have
to convert it to UTF-8 before manipulation, which is probably not
optimal. And if the string functions have to deal with two
formats anyway, also supporting UTF-32 on top of them probably
does not make much difference.
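To illustrate the difference, here is a small sketch using functions that exist in current Phobos (`std.uni.toUpper`, `std.utf.toUTF8`, `std.utf.toUTF16`). The two wrapper functions are hypothetical names made up for this example, and nothing here is meant to describe what the Phobos V2 API will actually look like.

```d
import std.uni : toUpper;
import std.utf : toUTF8, toUTF16;

// Hypothetical wrapper: today's templated string functions accept a
// UTF-16 string directly and return UTF-16.
wstring upperDirect(wstring s)
{
    return s.toUpper;
}

// Hypothetical wrapper: the same operation if only UTF-8 were supported.
// It costs two extra transcoding passes and allocations.
wstring upperViaUtf8(wstring s)
{
    return s.toUTF8.toUpper.toUTF16;
}

unittest
{
    assert(upperDirect("h\u00E9llo"w) == "H\u00C9LLO"w);
    assert(upperViaUtf8("h\u00E9llo"w) == upperDirect("h\u00E9llo"w));
}
```

Both routes give the same result; the second just has to transcode on the way in and on the way out.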
That said, I don't feel strongly about this, because if we kick
the UTF-16 and UTF-32 functions out of Phobos, they will
presumably still be available in Undead.