The D Programming Language Vision Document
Ola Fosheim Grøstad
ola.fosheim.grostad at gmail.com
Sun Jul 3 21:01:49 UTC 2022
On Sunday, 3 July 2022 at 20:28:18 UTC, rikki cattermole wrote:
> We only support UTF-16/UTF-32 for the target endian.
>
> Text input comes from many sources, stdin, files and say the
> windowing system are three common sources that do not make any
> such guarantees.
Well, then the application author will use an external Unicode
library anyway. If you support UTF-16 or UTF-32 there might not
be a BOM, so you might need to use heuristics to figure out
whether the data is little-endian (LE) or big-endian (BE).
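To illustrate the kind of heuristic meant here, a minimal sketch in Python follows. It is only an illustration under stated assumptions: the function name guess_utf16_byte_order is hypothetical, the input is assumed to be UTF-16 (not UTF-32), and the zero-byte counting only works for text that is mostly ASCII/Latin. A real Unicode library would do considerably more.

```python
import codecs


def guess_utf16_byte_order(data: bytes) -> str:
    """Return 'le', 'be' or 'unknown' for bytes assumed to be UTF-16.

    Hypothetical helper for illustration only; not taken from any library.
    """
    # A BOM, when present, settles the question.
    if data.startswith(codecs.BOM_UTF16_LE):
        return "le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "be"

    # No BOM: fall back to a heuristic. In mostly-ASCII text the high byte
    # of each 16-bit code unit is zero, so count zero bytes per position.
    even_zeros = data[0::2].count(0)  # zeros at offsets 0, 2, 4, ...
    odd_zeros = data[1::2].count(0)   # zeros at offsets 1, 3, 5, ...

    if odd_zeros > even_zeros:
        return "le"  # low byte first, zero high byte second
    if even_zeros > odd_zeros:
        return "be"  # zero high byte first
    return "unknown"
```

Text that is mostly CJK, for example, has few zero bytes on either side, so the guess can easily come out as "unknown" or wrong, which is one reason to lean on a well-tested external library for this.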
For things like gzip, PNG, crypto and Unicode there are most
likely faster and better tested open source alternatives than a
small community can come up with. Maybe just use whatever
Chromium or Clang uses?
What I never liked about C++ is the string mess: char, signed
char, unsigned char, char8_t, char16_t, char32_t, wchar_t,
string, wstring, u8string, u16string, u32string, pmr::string,
pmr::wstring, pmr::u8string, pmr::u16string, pmr::u32string… And
this doesn't even account for endianness! This is what happens
over time as new needs pop up. One of the best things about
Python3 and JavaScript is that there is one commonly used string
type that is well supported.
Having one common string representation is a good thing for API
authors.
(But make sure to have a maintained binding to a versatile C
Unicode library.)