UTF-8 strings and endianness

denizzzka 4denizzz at gmail.com
Mon Oct 29 10:08:57 PDT 2012


On Monday, 29 October 2012 at 15:46:43 UTC, Jordi Sayol wrote:
> Al 29/10/12 16:17, En/na denizzzka ha escrit:
>> Hi!
>> 
>> How to convert D's string to big endian?
>> How to convert to D's string from big endian?
>> 
>> 
>
> UTF-8 is always big endian.

oops, what?

Q: Is the UTF-8 encoding scheme the same irrespective of whether 
the underlying processor is little endian or big endian?

A: Yes. Since UTF-8 is interpreted as a sequence of bytes, there 
is no endian problem as there is for encoding forms that use 
16-bit or 32-bit code units. Where a BOM is used with UTF-8, it 
is only used as an encoding signature to distinguish UTF-8 from 
other encodings — it has nothing to do with byte order.
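
So for D's string (which is UTF-8) there is nothing to convert. Byte order only starts to matter with the wider code units of wstring or dstring. Here is a minimal sketch of the difference, assuming std.string.representation and std.bitmanip.nativeToBigEndian from Phobos. The helper names utf8Bytes and utf16BigEndian are made up for this example and are not part of any library:

```d
import std.bitmanip : nativeToBigEndian;
import std.string : representation;

// Hypothetical helper: a UTF-8 code unit is one byte wide, so the
// bytes are the same on every platform and nothing needs swapping.
immutable(ubyte)[] utf8Bytes(string s)
{
    return s.representation;
}

// Hypothetical helper: a UTF-16 code unit is two bytes wide, so a
// byte order has to be chosen when serializing it.
ubyte[] utf16BigEndian(wstring ws)
{
    ubyte[] result;
    foreach (wchar unit; ws)
    {
        // Split one code unit into its two bytes, high byte first.
        ubyte[2] be = nativeToBigEndian(unit);
        result ~= be[];
    }
    return result;
}

void main()
{
    // U+00E9 is encoded in UTF-8 as the byte sequence C3 A9.
    assert(utf8Bytes("\u00E9") == [0xC3, 0xA9]);

    // The same code point as one big-endian UTF-16 code unit.
    assert(utf16BigEndian("\u00E9"w) == [0x00, 0xE9]);
}
```

If the data really has to be UTF-16 or UTF-32 on the wire, converting the string with std.conv.to!wstring first and then fixing the byte order per code unit, as above, is one way to do it.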

