Reading binary streams with decoding to Unicode

Nicholas Wilson iamthewilsonator at hotmail.com
Mon Oct 15 19:56:22 UTC 2018


On Monday, 15 October 2018 at 18:57:19 UTC, Vinay Sajip wrote:
> On Monday, 15 October 2018 at 17:55:34 UTC, Dukc wrote:
>> This is done automatically for character arrays, which 
>> includes strings. wchar arrays will iterate by UTF-16, and 
>> dchar arrays by UTF-32. If you have a byte/ubyte array you 
>> know to be unicode-encoded, convert it to char[] to iterate by 
>> code points.
>
> Thanks for the response. I was looking for something where I 
> don't have to manage buffers myself (e.g. when handling 
> buffered file or socket I/O). It's really easy to find this 
> functionality in e.g. Python, C#, Go, Kotlin, Java etc. but I'm 
> surprised there doesn't seem to be a ready-to-go equivalent in 
> D. For example, I can find D examples of opening files and 
> reading a line at a time, but no examples of opening a file and 
> reading Unicode chars one at a time. Perhaps I've just missed 
> them?

import std.file : readText;
import std.uni : byCodePoint, byGrapheme;
// Alternatively, from std.utf: byCodeUnit, byChar /*UTF-8*/, byWchar /*UTF-16*/,
// byDchar /*UTF-32*/, and byUTF!C (generic over the target character type C).

void main()
{
    // readText loads the whole file into memory and validates it as UTF-8
    string a = readText("foo");

    foreach (cp; a.byCodePoint)
    {
        // do stuff with code point 'cp'
    }

    foreach (g; a.byGrapheme)
    {
        // do stuff with grapheme 'g'
    }
}
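
The snippet above reads the whole file before iterating. If the file is too large for that, or you want to process it as it arrives, one possible sketch is to let File.byLine do the buffering and decode each line with a foreach over dchar. The helper name eachCodePoint and its delegate parameter are made up for illustration and are not part of Phobos. This assumes the input is UTF-8; invalid input will throw a UTFException during decoding.

```d
import std.stdio : File, KeepTerminator;

// Hypothetical helper (not a Phobos function): calls `sink` once for every
// code point in the UTF-8 file at `path`, reading one line at a time.
void eachCodePoint(string path, scope void delegate(dchar) sink)
{
    // byLine manages the read buffer; KeepTerminator.yes keeps the '\n'
    // so that line breaks are passed to `sink` as well.
    foreach (line; File(path).byLine(KeepTerminator.yes))
    {
        // Declaring the loop variable as dchar makes foreach decode the
        // char[] line from UTF-8 into code points.
        foreach (dchar cp; line)
            sink(cp);
    }
}
```

It could be called as eachCodePoint("foo", (dchar cp) { /* use cp */ });. Because lines are split on '\n', which never occurs inside a multi-byte UTF-8 sequence, a code point is not cut in half at a line boundary. Graphemes that span a line break are not handled by this sketch.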


