Safer casts

Janice Caron caron800 at googlemail.com
Sat May 10 04:40:18 PDT 2008


On 10/05/2008, terranium <spam at here.lot> wrote:
> Janice Caron Wrote:
>
>  > Consider this:
>  >
>  >     string s = "\u20AC"; /* s contains exactly one Unicode character */
>  >     string t = s[1..2];
>
> this makes string array of bytes rather than chars.

You are incorrect. Indisputably, typeof(t) == invariant(char)[]. It is
an array of invariant chars - that is, an array of invariant UTF-8
code units. The code unit it holds is a legal char value, but the
string as a whole is a malformed code unit sequence: s[1..2] selects
a single code unit, and that is an isolated continuation byte.
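
For anyone who wants to check this, here is a minimal sketch, assuming
a current D2 compiler and Phobos. Two things in it are not from the
original post: it spells the type immutable(char)[], the present-day
name for invariant(char)[], and it uses std.utf.validate as one
possible way to test well-formedness.

```d
import std.stdio : writefln;
import std.utf : validate, UTFException;

void main()
{
    string s = "\u20AC";    // U+20AC is three UTF-8 code units: E2 82 AC
    string t = s[1 .. 2];   // the slice keeps only the middle code unit

    // The slice is still an array of chars, not an array of bytes.
    static assert(is(typeof(t) == immutable(char)[]));

    assert(s.length == 3);
    assert(t.length == 1);
    assert(t[0] == 0x82);   // a UTF-8 continuation byte

    validate(s);            // the full string is well-formed, so no throw

    // The slice is not well-formed UTF-8, so validate() throws.
    bool malformed = false;
    try
    {
        validate(t);
    }
    catch (UTFException e)
    {
        malformed = true;
    }
    assert(malformed);

    // Print the raw code units of t in hex.
    writefln("t = [%(%02X %)]", cast(immutable(ubyte)[]) t);
}
```

The slice compiles and runs without complaint; the malformed sequence
only shows up when something tries to decode or validate it.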

You are also missing the point. This thread is about casting, not
Unicode. If you want to talk Unicode, I'm happy to do so, but please
let's take that to another thread. I only brought up slicing as an
example of why low level stuff must be permitted, and in specific
response to a point made by Yigal.


