byte to char safe?

Sat Aug 1 16:58:20 PDT 2009

Sergey Gromov Wrote:

> Thu, 30 Jul 2009 19:14:56 -0400, Harry wrote:
> 
> > Ary Borenszweig Wrote:
> > 
> >> Harry wrote:
> >>> Again hello,
> >>> 
> >>> char[6] t = r"again" ~ cast(char)7 ~ r"hello";
> >> 
> >> If you want the result to be "again7hello", then no. You must do:
> >> 
> >> char[6] t = r"again" ~ '7' ~ r"hello";
> >> 
> >> or:
> >> 
> >> char[6] t = r"again" ~ (cast(char)('0' + 7)) ~ r"hello";
> > 
> > Hello Ary,
> > 
> > The 7 is data, not text.
> > I am writing my own write function
> > and need to put style data into the char[].
> > I'm not sure whether that is safe?
> 
> If you use only your own write function then you can put just anything
> into char[].  But if you pass that char[] to any standard function, or
> even foreach, and there are non-UTF-8 sequences in there, the standard
> function will fail.
> 
> Also note that values from 0 to 0x7F are valid UTF-8 codes and can be
> safely inserted into char[].
> 
> If you want to safely put a larger constant into char[] you can use
> Unicode escape sequences: '\uXXXX' or '\UXXXXXXXX', where XXXX and
> XXXXXXXX are 4 or 8 hexadecimal digits respectively:
> 
>     char[] foo = "hello " ~ "\u017e" ~ "\U00105614";
>     foreach (dchar ch; foo)
>         writefln("%x", cast(uint) ch);
> 
> Finally, if you want to encode a variable into char[], you can use
> the std.utf.encode function:
> 
>     char[] foo;
>     uint value = 0x00100534;
>     std.utf.encode(foo, value);
> 
> Unfortunately all std.utf functions accept only valid UTF characters.
> Currently they're everything from 0 to 0xD7FF and from 0xE000 to
> 0x10FFFF.  Any other character values will throw a run-time exception if
> passed to standard functions.
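
The range check described above can be sketched roughly as follows. This is only an illustration written against current D2 Phobos (std.utf.isValidDchar, std.utf.encode, std.utf.validate), not the code from this thread; the helper names appendCodePoint and isWellFormed are hypothetical.

```d
import std.utf : encode, isValidDchar, validate, UTFException;

// Hypothetical helper: append `value` to `buf` as UTF-8, but only if it is
// a valid code point (0 .. 0xD7FF or 0xE000 .. 0x10FFFF).
bool appendCodePoint(ref char[] buf, uint value)
{
    if (!isValidDchar(cast(dchar) value))
        return false;               // surrogate or out of range: refuse
    encode(buf, cast(dchar) value); // appends 1 to 4 UTF-8 code units
    return true;
}

// Hypothetical helper: true if `s` is well-formed UTF-8.
bool isWellFormed(const(char)[] s)
{
    try
    {
        validate(s);                // throws UTFException on bad input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    char[] foo;
    assert(appendCodePoint(foo, 0x7));       // control code, one byte
    assert(appendCodePoint(foo, 0x017E));    // two bytes
    assert(appendCodePoint(foo, 0x100534));  // four bytes
    assert(!appendCodePoint(foo, 0xD800));   // surrogate is rejected
    assert(!appendCodePoint(foo, 0x110000)); // beyond 0x10FFFF is rejected
    assert(foo.length == 1 + 2 + 4);
    assert(isWellFormed(foo));

    // A raw byte above 0x7F placed directly into char[] is not valid UTF-8.
    char[] raw = ['a', cast(char) 0xFF];
    assert(!isWellFormed(raw));
}
```

So a value such as 7 goes in unchanged as a single byte, while anything above 0x7F should go through encode rather than a plain cast(char).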

Thank you!

Non-printable characters are printed by writef:
start of text \x02 prints as a smiley
end of text \x03 prints as a heart
newline \x0a prints as a newline!
Is there a difference between utf.encode(foo, value) and foo ~ "\U00100534"?
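
On that last question, a small sketch (assuming current D2 Phobos; the helper names viaEncode and viaLiteral are hypothetical) suggests the two forms produce the same bytes. The practical difference would be that the escape sequence is encoded by the compiler from a constant, while encode works at run time on a value held in a variable.

```d
import std.utf : encode;

// Hypothetical helper: build the text by encoding a run-time value.
char[] viaEncode(dchar value)
{
    char[] foo = "hello ".dup;
    encode(foo, value);        // appends the UTF-8 form of `value`
    return foo;
}

// Hypothetical helper: the compiler encodes the escape sequence itself.
string viaLiteral()
{
    return "hello " ~ "\U00100534";
}

void main()
{
    // Both routes end with the same four UTF-8 code units.
    assert(viaEncode(cast(dchar) 0x100534) == viaLiteral());
    assert(viaLiteral().length == 6 + 4);
}
```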