Reading binary streams with decoding to Unicode
Nicholas Wilson
iamthewilsonator at hotmail.com
Mon Oct 15 19:56:22 UTC 2018
On Monday, 15 October 2018 at 18:57:19 UTC, Vinay Sajip wrote:
> On Monday, 15 October 2018 at 17:55:34 UTC, Dukc wrote:
>> This is done automatically for character arrays, which
>> includes strings. wchar arrays will iterate by UTF-16, and
>> dchar arrays by UTF-32. If you have a byte/ubyte array you
>> know to be unicode-encoded, convert it to char[] to iterate by
>> code points.
>
> Thanks for the response. I was looking for something where I
> don't have to manage buffers myself (e.g. when handling
> buffered file or socket I/O). It's really easy to find this
> functionality in e.g. Python, C#, Go, Kotlin, Java etc. but I'm
> surprised there doesn't seem to be a ready-to-go equivalent in
> D. For example, I can find D examples of opening files and
> reading a line at a time, but no examples of opening a file and
> reading Unicode chars one at a time. Perhaps I've just missed
> them?
import std.file : readText;
import std.uni : byCodePoint, byGrapheme;
// Alternatively, from std.utf: byCodeUnit, byChar (UTF-8),
// byWchar (UTF-16), byDchar (UTF-32), or byUTF!C for a chosen
// character type C.

void main()
{
    // readText loads the whole file and validates it as UTF-8.
    string a = readText("foo");

    foreach (cp; a.byCodePoint)
    {
        // do stuff with code point 'cp'
    }

    foreach (g; a.byGrapheme)
    {
        // do stuff with grapheme 'g'
    }
}
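
Since readText pulls the whole file into memory, here is a rough sketch of a streaming variant for the buffered case you describe. It assumes the input is UTF-8; byDcharFromFile is a made-up helper name, not something in Phobos, and the 4096-byte chunk size is arbitrary. File.byChunk owns the buffer, joiner flattens the chunks into a single range of bytes, and byDchar decodes lazily, so a multi-byte sequence that straddles a chunk boundary should still decode as one code point.

```d
import std.algorithm : joiner, map;
import std.stdio : File, writeln;
import std.utf : byDchar;

/// Hypothetical helper (not part of Phobos): lazily decodes a UTF-8
/// file into dchar code points without loading the whole file.
auto byDcharFromFile(string path)
{
    return File(path)
        .byChunk(4096)              // range of ubyte[] buffers, reused per read
        .joiner                     // flatten into one range of ubyte
        .map!(b => cast(char) b)    // treat each byte as a UTF-8 code unit
        .byDchar;                   // decode code units into dchar
}

void main()
{
    size_t count;
    foreach (dchar c; byDcharFromFile("foo"))
    {
        // do stuff with code point 'c'
        ++count;
    }
    writeln(count, " code points read");
}
```

As far as I know, byDchar substitutes the replacement character U+FFFD for invalid sequences instead of throwing, so validate separately if you need strict error handling.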