byte to char safe?

Sergey Gromov snake.scaly at gmail.com
Fri Jul 31 07:21:44 PDT 2009


Thu, 30 Jul 2009 19:14:56 -0400, Harry wrote:

> Ary Borenszweig Wrote:
> 
>> Harry wrote:
>>> Again hello,
>>> 
>>> char[6] t = r"again" ~ cast(char)7 ~ r"hello";
>> 
>> If you want the result to be "again7hello", then no. You must do:
>> 
>> char[6] t = r"again" ~ '7' ~ r"hello";
>> 
>> or:
>> 
>> char[6] t = r"again" ~ (cast(char)('0' + 7)) ~ r"hello";
> 
> Hello Ary,
> 
> 7 is data not string.
> It makes own write function
> need style data in char[]
> Not sure if safe ?

If you use only your own write function, then you can put just anything
into char[].  But if you pass that char[] to any standard function, or
even to a foreach that decodes it into dchar, and it contains invalid
UTF-8 sequences, the standard code will fail.
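
To make the failure concrete, here is a minimal sketch, assuming a
current D2 compiler and Phobos (the names in the 2009 library may
differ).  isWellFormed is a hypothetical helper, not part of Phobos; it
only wraps std.utf.validate, which throws on malformed input:

```d
import std.utf : UTFException, validate;

// Hypothetical helper (not in Phobos): true if s is well-formed UTF-8.
bool isWellFormed(const(char)[] s)
{
    try
    {
        validate(s);          // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // 0xFF can never occur in well-formed UTF-8.
    char[] raw = ['o', 'k', cast(char) 0xFF];

    assert(!isWellFormed(raw));        // the stray byte is rejected
    assert(isWellFormed(raw[0 .. 2])); // the ASCII prefix is fine
}
```

Code that only treats the array as bytes never notices the 0xFF; the
problem appears once something tries to decode it.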

Also note that values from 0 to 0x7F are valid UTF-8 codes and can be
safely inserted into char[].
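
Applied to the original question, that could look like the sketch below
(again assuming current D2 and Phobos).  joinWithByte is a hypothetical
helper made up for this illustration; note the result is 11 chars long:

```d
import std.utf : validate;

// Hypothetical helper: builds left ~ value ~ right, where value is a raw
// data byte.  Only 0 .. 0x7F is accepted, because only those values are
// complete one-byte UTF-8 sequences.
char[] joinWithByte(const(char)[] left, ubyte value, const(char)[] right)
{
    assert(value <= 0x7F, "only ASCII values are single-byte UTF-8");
    char[] result = left.dup;
    result ~= cast(char) value;
    result ~= right;
    return result;
}

void main()
{
    char[] t = joinWithByte("again", 7, "hello");

    assert(t.length == 11); // 5 + 1 + 5
    assert(t[5] == 7);      // the data byte, not the character '7'
    validate(t);            // does not throw: still well-formed UTF-8
}
```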

If you want to safely put a larger constant into char[], you can use
Unicode escape sequences: '\uXXXX' or '\UXXXXXXXX', where XXXX and
XXXXXXXX are 4 or 8 hexadecimal digits respectively:

    char[] foo = "hello " ~ "\u017e" ~ "\U00105614";
    foreach (dchar ch; foo)
        writefln("%x", cast(uint) ch);
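
The escapes are stored as multi-byte UTF-8 sequences, so the array
length counts bytes, not characters.  A small sketch of the difference,
assuming current D2 (countCodePoints is a hypothetical helper written
for this example):

```d
// Hypothetical helper: counts code points by letting foreach decode
// the UTF-8 bytes into dchar values.
size_t countCodePoints(const(char)[] s)
{
    size_t n = 0;
    foreach (dchar ch; s)
        ++n;
    return n;
}

void main()
{
    string foo = "hello " ~ "\u017e" ~ "\U00105614";

    // 6 ASCII bytes + 2 bytes for U+017E + 4 bytes for U+105614
    assert(foo.length == 12);
    // but only 6 + 1 + 1 code points
    assert(countCodePoints(foo) == 8);
}
```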

Finally, if you want to encode a variable into char[], you can use the
std.utf.encode function:

    import std.utf;

    char[] foo;
    uint value = 0x00100534;
    // Appends the UTF-8 encoding of value to foo.
    std.utf.encode(foo, cast(dchar) value);

Unfortunately, all std.utf functions accept only valid Unicode code
points.  Currently those are everything from 0 to 0xD7FF and from 0xE000
to 0x10FFFF.  Passing any other value to the standard functions throws a
run-time exception.
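
If the value comes from data you do not control, you can check it before
encoding instead of catching the exception.  A minimal sketch, assuming
current D2 Phobos, where std.utf provides isValidDchar and encode throws
UTFException; tryAppend is a hypothetical helper, not a library function:

```d
import std.exception : assertThrown;
import std.utf : UTFException, encode, isValidDchar;

// Hypothetical helper: appends value to buf as UTF-8 and returns true,
// or leaves buf untouched and returns false if value is not a valid
// code point.
bool tryAppend(ref char[] buf, uint value)
{
    auto c = cast(dchar) value;
    if (!isValidDchar(c))
        return false;
    encode(buf, c);
    return true;
}

void main()
{
    char[] foo;

    assert(tryAppend(foo, 0x00100534)); // valid, encoded as 4 bytes
    assert(foo.length == 4);

    assert(!tryAppend(foo, 0xD800));    // surrogate: rejected
    assert(!tryAppend(foo, 0x110000));  // above 0x10FFFF: rejected
    assert(foo.length == 4);            // buffer unchanged

    // Without the check, encode itself throws.
    assertThrown!UTFException(encode(foo, cast(dchar) 0xD800));
}
```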

