D1: UTF8 char[] casting to wchar[] array cast misalignment ERROR

jicman via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Tue Jun 17 06:35:50 PDT 2014


On Tuesday, 17 June 2014 at 12:54:39 UTC, Marc Schütz wrote:
> On Tuesday, 17 June 2014 at 02:27:43 UTC, jicman wrote:
>>
>> Greetings!
>>
>> I have a bunch of files: plain ASCII, UTF8 and UTF16, with and 
>> without a BOM (Byte Order Mark).  I thought I had a nice way 
>> of figuring out which encoding a file used (ASCII, 
>> UTF8 or UTF16) when the BOM was missing: read the 
>> content and apply the std.utf.validate function to the 
>> char[] or wchar[] string.  The problem is that lately I am 
>> hitting a wall with the "array cast misalignment" error when 
>> casting to wchar[],
>> i.e.
>>
>> auto text = cast(string) file.read();
>> wchar[] temp = cast(wchar[]) text;
>
> If the length of the data is odd, it cannot be (valid) UTF16. 
> You can check for that, and skip the test for UTF16 in this 
> case.
>
> Another thing: it is better not to cast the data to `string` 
> before you know that it's actually UTF8. Better make it 
> `ubyte[]`; this way you don't need all the casts inside the 
> if-blocks.

Indeed.  Thanks.
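
For anyone finding this thread later, here is a minimal sketch of the two suggestions above: keep the data as `ubyte[]`, and skip the UTF16 test when the length is odd. The helper names isValidUtf8 and isValidUtf16 are made up for illustration; only std.utf.validate comes from the thread. It catches the general Exception class because the exact exception type thrown by validate differs between Phobos versions.

```d
import std.utf;

// Hypothetical helper: true if the raw bytes form valid UTF8.
// Plain ASCII also passes, since ASCII is a subset of UTF8.
bool isValidUtf8(ubyte[] data)
{
    try
    {
        validate(cast(char[]) data);
        return true;
    }
    catch (Exception e)
    {
        return false;
    }
}

// Hypothetical helper: true if the raw bytes form valid UTF16.
// The length check comes first, so the cast to wchar[] can never
// fail with "array cast misalignment".
bool isValidUtf16(ubyte[] data)
{
    if (data.length % 2 != 0)
        return false; // odd length cannot be valid UTF16

    try
    {
        // The cast reinterprets the bytes in the machine's native byte order.
        validate(cast(wchar[]) data);
        return true;
    }
    catch (Exception e)
    {
        return false;
    }
}
```

Reading the file would then be something like `auto data = cast(ubyte[]) std.file.read(fileName);` (fileName is a placeholder), with both helpers called on `data` directly. Note that some byte sequences pass both tests, so validation alone does not always settle which encoding was intended.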

