D1: UTF8 char[] casting to wchar[] array cast misalignment ERROR

Jesse Phillips via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Tue Jun 17 19:25:33 PDT 2014


On Tuesday, 17 June 2014 at 02:27:43 UTC, jicman wrote:
>
> Greetings!
>
> I have a bunch of files plain ASCII, UTF8 and UTF16 with and 
> without BOM (Byte Order Mark).  I had, "I thought", a nice way 
> of figuring out what type of encoding the file was (ASCII, UTF8 
> or UTF16) when the BOM was missing, by reading the content and 
> applying the std.utf.validate function to the char[] or, 
> wchar[] string.  The problem is that lately, I am hitting into 
> a wall with the "array cast misalignment" when casting wchar[].

If the BOM is missing and it is not UTF-8, it isn't a valid UTF 
encoding.

Otherwise you have your answer. Don't cast a char[] to wchar[]; 
if you have a valid char[], it must be converted (use 
std.conv.to).
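
To illustrate the difference between converting and casting, here is a 
minimal sketch. It assumes D2 Phobos (the D1 library names differ), and 
toUtf16Text is a hypothetical helper name, not something from the 
original code.

```d
import std.conv : to;

// Hypothetical helper: transcode UTF-8 text to UTF-16.
// A cast would only reinterpret the same bytes as 2-byte units;
// to!wstring decodes each code point and re-encodes it.
wstring toUtf16Text(const(char)[] utf8)
{
    return to!wstring(utf8);
}

unittest
{
    const(char)[] utf8 = "h\u00E9llo";   // U+00E9 takes two bytes in UTF-8
    assert(utf8.length == 6);            // 6 UTF-8 code units
    assert(toUtf16Text(utf8).length == 5); // 5 UTF-16 code units
}
```

Note that the lengths differ after conversion, which a plain cast could 
never produce.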

After some testing: the requirement mentioned earlier, that UTF-16 
data be an even number of bytes, is exactly what triggered the 
"array cast misalignment" error (the array wasn't an even number 
of bytes long).
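
A sketch of guarding the cast with a length check, again assuming D2 
Phobos. The function name looksLikeUtf16 is made up for this example. 
Passing validation only means the bytes could be UTF-16, not that they 
are, so treat it as a heuristic.

```d
import std.utf : validate, UTFException;

// Hypothetical helper: true if `data` can be viewed as well-formed UTF-16
// in the machine's native byte order.
bool looksLikeUtf16(const(ubyte)[] data)
{
    // An odd byte count cannot be reinterpreted as wchar[]; the cast
    // below would fail at run time with "array cast misalignment".
    if (data.length % wchar.sizeof != 0)
        return false;

    auto units = cast(const(wchar)[]) data; // reinterpret, no transcoding
    try
    {
        validate(units); // throws UTFException on malformed UTF-16
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}
```

With this in place, a 3-byte buffer is rejected before the cast is ever 
attempted, while a 4-byte buffer goes on to std.utf.validate.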

