Files and UTF
mjsurette at gmail.com
Wed Aug 5 17:39:36 UTC 2020
In my efforts to learn D I am writing some code to read files in
different UTF encodings with the aim of having them end up as
UTF-8 internally. As a start I have the following code:
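    import std.file, std.stdio;
    void main(string[] args)
    {
        if (args.length == 2)
            if (args[1].exists && args[1].isFile)
            {
                auto f = File(args[1]);
                writeln(args[1]); // the file name
                for (auto i = 1; i <= 3; ++i)
                    write(f.readln()); // next line, terminator included
            }
    }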
It works well, outputting the file name and the first three lines
of the file properly, regardless of the encoding of the file. The
exception is UTF-16: with both the LE and BE encodings, two
characters representing the BOM are printed.
I assume that write detects the encoding of the string returned
by readln and prints it correctly, rather than readln converting
its input to one consistent encoding. Is this correct?
Is there a way to remove the BOM from the input buffer and still
know the encoding of the file?
Is there a D idiomatic way to do what I want to do?
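To make the BOM question concrete, here is a minimal sketch of one possible approach, not a definitive answer: read the raw bytes, look at the leading bytes for a BOM, remember which encoding it signals, and convert everything after the BOM to UTF-8. The names Encoding, Sniffed, sniffBOM and toUTF8Text are hypothetical and made up for this illustration; only std.file.read, std.utf.toUTF8 and std.utf.validate come from Phobos. The sketch assumes that a file without a BOM is UTF-8, and it does not handle UTF-32 (whose little-endian BOM also starts with FF FE).

```d
import std.file : read;
import std.stdio : writeln;
import std.utf : toUTF8, validate;

// Hypothetical names for this sketch; none of them come from Phobos.
enum Encoding { utf8, utf16le, utf16be }

/// Detected encoding plus the length of the BOM in bytes (0 if none).
struct Sniffed
{
    Encoding encoding;
    size_t bomLength;
}

/// Inspects only the leading bytes. Assumption: no BOM means UTF-8.
Sniffed sniffBOM(const(ubyte)[] data)
{
    if (data.length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return Sniffed(Encoding.utf8, 3);
    if (data.length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return Sniffed(Encoding.utf16le, 2);
    if (data.length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return Sniffed(Encoding.utf16be, 2);
    return Sniffed(Encoding.utf8, 0);
}

/// Strips the BOM, reports the encoding it signalled, and returns UTF-8 text.
string toUTF8Text(const(ubyte)[] data, out Encoding encoding)
{
    immutable s = sniffBOM(data);
    encoding = s.encoding;
    auto payload = data[s.bomLength .. $];

    if (s.encoding == Encoding.utf8)
    {
        auto text = cast(const(char)[]) payload;
        validate(text); // throws UTFException on malformed UTF-8
        return text.idup;
    }

    // UTF-16: combine byte pairs into code units; a trailing odd byte is dropped.
    auto units = new wchar[payload.length / 2];
    foreach (i, ref u; units)
    {
        immutable int a = payload[2 * i];
        immutable int b = payload[2 * i + 1];
        u = s.encoding == Encoding.utf16le
            ? cast(wchar)(a | (b << 8))
            : cast(wchar)((a << 8) | b);
    }
    return toUTF8(units);
}

void main(string[] args)
{
    if (args.length != 2)
        return;
    Encoding enc;
    auto text = toUTF8Text(cast(const(ubyte)[]) read(args[1]), enc);
    writeln(enc);
    writeln(text);
}
```

Reading the whole file at once keeps the sketch short. For large files the same BOM check could be applied to the first few bytes only, with the rest decoded in chunks.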