Files and UTF

Mike Surette mjsurette at
Wed Aug 5 17:39:36 UTC 2020

In my efforts to learn D I am writing some code to read files in 
different UTF encodings with the aim of having them end up as 
UTF-8 internally. As a start I have the following code:

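import std.stdio;
import std.file;

void main(string[] args)
{
     if (args.length == 2)
     {
         if (args[1].exists && args[1].isFile)
         {
             auto f = File(args[1]);
             writeln(args[1]);   // print the file name

             // print the first three lines exactly as read
             for (auto i = 1; i <= 3; ++i)
                 write(f.readln());
         }
     }
}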
It works well, printing the file name and the first three lines of 
the file properly regardless of the file's encoding. The exception 
is UTF-16: with both the LE and BE encodings, two characters 
representing the BOM are printed.

I assume that write detects the encoding of the string returned 
by readln and prints it correctly, rather than readln converting 
its input to a consistent encoding. Is this correct?

Is there a way to remove the BOM from the input buffer and still 
know the encoding of the file?

Is there an idiomatic D way to do what I want to do?
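
To make the BOM question concrete, here is a minimal sketch of one 
possible approach, not a definitive answer. It reads the whole file as 
bytes, uses getBOM from std.encoding to identify and strip the BOM, and 
converts UTF-16 input to UTF-8. The helper name decodeToUTF8 is 
hypothetical, and treating a file without a BOM as UTF-8 is an 
assumption of the sketch.

```d
import std.encoding : BOM, getBOM;
import std.file : read;
import std.range : take;
import std.stdio : writeln;
import std.string : lineSplitter;
import std.utf : toUTF8;

/// Hypothetical helper: strips the BOM from `raw`, reports which BOM was
/// found in `detected`, and returns the remaining text as UTF-8.
string decodeToUTF8(const(ubyte)[] raw, out BOM detected)
{
    immutable bom = getBOM(raw);
    detected = bom.schema;
    auto payload = raw[bom.sequence.length .. $];   // bytes after the BOM

    switch (bom.schema)
    {
        case BOM.none:   // assumption: no BOM means the bytes are UTF-8
        case BOM.utf8:
            return cast(string) payload.idup;

        case BOM.utf16le:
        case BOM.utf16be:
            // Assemble 16-bit code units by hand so the result does not
            // depend on the byte order of the machine running the code.
            // A trailing odd byte, if any, is ignored.
            auto units = new wchar[payload.length / 2];
            foreach (i, ref u; units)
            {
                immutable uint a = payload[2 * i];
                immutable uint b = payload[2 * i + 1];
                u = bom.schema == BOM.utf16le
                    ? cast(wchar)(a | (b << 8))
                    : cast(wchar)((a << 8) | b);
            }
            return toUTF8(units);

        default:
            throw new Exception("encoding not handled in this sketch");
    }
}

void main(string[] args)
{
    if (args.length != 2)
        return;

    BOM detected;
    auto text = decodeToUTF8(cast(const(ubyte)[]) read(args[1]), detected);

    writeln(args[1], " (BOM: ", detected, ")");
    foreach (line; text.lineSplitter.take(3))
        writeln(line);
}
```

Reading the whole file at once keeps the sketch short; for large files 
a chunked reader would be preferable, and UTF-32 would need its own 
case.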


More information about the Digitalmars-d-learn mailing list