Reading binary streams with decoding to Unicode

Dukc ajieskola at gmail.com
Mon Oct 15 17:55:34 UTC 2018


On Monday, 15 October 2018 at 10:49:49 UTC, Vinay Sajip wrote:
> Is there a standardised way of reading over buffered binary 
> streams (at least strings, files, and sockets) where you can 
> layer a decoder on top, so you get a character stream you can 
> read one Unicode char at a time? Initially UTF-8, but later 
> also other encodings. I see that std.stream was deprecated, but 
> can't see what other options there are. Can anyone point me in 
> the right direction?

Character arrays, which include strings, are decoded 
automatically: range functions, and foreach with a dchar loop 
variable, iterate them by code point. char arrays are decoded as 
UTF-8, wchar arrays as UTF-16, and dchar arrays as UTF-32. If you 
have a byte/ubyte array you know to be UTF-8 encoded, cast it to 
char[] to iterate by code points.
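
A minimal sketch of that cast, assuming the bytes really are 
UTF-8. countCodePoints is a made-up helper name for illustration, 
not something from Phobos; std.utf.validate is only there because 
the cast itself checks nothing.

```d
import std.stdio : writeln;
import std.utf : validate;

/// Hypothetical helper: treat raw bytes as UTF-8 and count code points.
size_t countCodePoints(const(ubyte)[] raw)
{
    // Reinterpreting cast: no copy is made and nothing is verified.
    auto text = cast(const(char)[]) raw;

    // Throws std.utf.UTFException if the bytes are not well-formed UTF-8.
    validate(text);

    size_t n;
    // Asking for a dchar loop variable makes foreach decode UTF-8.
    foreach (dchar c; text)
        ++n;
    return n;
}

void main()
{
    // "h\u00E4st" in UTF-8: U+00E4 takes two bytes (0xC3 0xA4).
    immutable(ubyte)[] raw = [0x68, 0xC3, 0xA4, 0x73, 0x74];
    writeln(countCodePoints(raw)); // prints 4
}
```

The same loop works for data read from a file or socket into a 
ubyte[] buffer, as long as a multi-byte sequence is not split 
across two buffers.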

Conversely, if you want to iterate a character array by code 
unit, convert it to ubyte[]/ushort[] (depending on code unit 
length) or use std.utf.byCodeUnit.
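
Both routes can be sketched roughly like this. Using 
std.string.representation for the conversion is my choice here 
(a plain cast also works), and countCodeUnits is a made-up helper 
name.

```d
import std.range : walkLength;
import std.string : representation;
import std.utf : byCodeUnit;

/// Hypothetical helper: count UTF-8 code units without decoding.
size_t countCodeUnits(string s)
{
    // byCodeUnit wraps the string in a range of char, so range
    // functions see code units instead of decoded code points.
    return s.byCodeUnit.walkLength;
}

void main()
{
    string s = "h\u00E4st";            // 4 code points, 5 UTF-8 code units
    assert(countCodeUnits(s) == 5);
    assert(s.walkLength == 4);         // a plain string range auto-decodes

    // representation gives the same data typed as integers.
    immutable(ubyte)[] bytes = s.representation;
    assert(bytes.length == 5);

    wstring w = "h\u00E4st"w;          // U+00E4 is one UTF-16 code unit
    immutable(ushort)[] units = w.representation;
    assert(units.length == 4);
}
```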


More information about the Digitalmars-d-learn mailing list