String implementations

Sun Jan 20 12:03:07 PST 2008

On 1/20/08, James Dennett <jdennett at acm.org> wrote:
> Looks very different to me.

I thought it looked very similar indeed to D, but there you go. Funny
how two different people can read the same document and interpret it
in two different ways.

> There's no conflation of char with a
> code unit of UTF8

C has no ubyte type. Since time immemorial, C programmers have used
the char type to store every kind of 8-bit data under the sun, simply
because there was no alternative (until recently, when int8_t and
uint8_t showed up, typically as typedefs for signed char and unsigned
char). That's not a big deal.

> (and indeed C++ deliberately supports use of
> varied encodings for multi-byte characters).

I must have misread the heading that says "Require UTF", and whose
text reads "The C TR makes the encoding of char16_t and char32_t
implementation-defined. It also provides macros to indicate whether or
not the encoding is UTF. In contrast, this proposal requires UTF
encoding."

Oh, I see what you're saying - C++ would require UTF for char16_t and
char32_t (the counterparts of D's wchar and dchar), but not for char.
Well, that's historical legacy for you.
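
To make that split concrete, here is a minimal sketch. It assumes the
proposal in the form that later landed in C++11, where u"" literals
are UTF-16 and U"" literals are UTF-32. The code_units helper and the
two constant names are made up for illustration. The narrow literal is
left out on purpose, because its encoding is exactly the part that
stays implementation-defined.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// not counting the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N])
{
    return N - 1;
}

// U+1F600 lies outside the Basic Multilingual Plane, so UTF-16 needs
// a surrogate pair for it while UTF-32 needs a single code unit.
constexpr std::size_t utf16_units = code_units(u"\U0001F600");
constexpr std::size_t utf32_units = code_units(U"\U0001F600");

static_assert(utf16_units == 2, "UTF-16: one surrogate pair");
static_assert(utf32_units == 1, "UTF-32: one code unit");
```

The same count taken over "\U0001F600" would depend on the
implementation's execution character set, so nothing portable can be
asserted about it.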

> Yes, C++ is adding
> 16- and 32-bit character types which are more akin to D's, but that
> has little bearing on how differently it handles multi-byte (as
> opposed to wide-character) strings.

So it has a bunch of procedural functions instead of foreach. Apart
from that, the approach seems the same as D. Where's the difference?
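
For comparison, this is roughly what the procedural route looks like
when the UTF-8 sits in plain chars. It is only a sketch: decode_utf8
is a hypothetical helper of my own, not a function from the proposal
or from any standard library, and it does by hand what D's
foreach (dchar c; s) does implicitly.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decode UTF-8 held in plain chars into code
// points. A malformed or truncated sequence yields U+FFFD and decoding
// resumes at the next byte. Overlong forms and encoded surrogates are
// not rejected, to keep the sketch short.
std::vector<char32_t> decode_utf8(const std::string& s)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        // Go through unsigned char so bytes >= 0x80 are not negative.
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes expected
        bool ok = true;

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else                            { ok = false; }

        // The whole sequence must fit inside the string.
        if (ok && i + extra >= s.size()) {
            ok = false;
        }

        // Fold in six payload bits from each continuation byte.
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) {
                ok = false;
            } else {
                cp = (cp << 6) | (c & 0x3F);
            }
        }

        if (ok) {
            out.push_back(cp);
            i += extra + 1;
        } else {
            out.push_back(0xFFFD);
            ++i;
        }
    }
    return out;
}
```

Calling decode_utf8("a\xC3\xA9\xE2\x82\xAC") gives the three code
points U+0061, U+00E9 and U+20AC, which is the sequence a D foreach
over the same bytes with a dchar loop variable would visit.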