Proper way to fix core.exception.UnicodeException at src\rt\util\utf.d(292): invalid UTF-8 sequence by File.readln()
Dr.No
jckj33 at gmail.com
Fri Apr 6 16:10:56 UTC 2018
I'm reading, line by line, a CSV file provided by the user, which
is assumed to be UTF-8. But a user has provided an ANSI file,
which resulted in the error:
>core.exception.UnicodeException at src\rt\util\utf.d(292): invalid
>UTF-8 sequence
(It happened when the user took the original UTF-8 encoded file
generated by another application, made some edits in an editor
whose name I don't know, and then saved it, unaware that the
editor was changing the encoding to ANSI.)
My question is: what's the proper way to solve that? Using toUTF8
didn't solve it:
> while((line = csvFile.readln().toUTF8) !is null) {
I didn't find a way to explicitly set the encoding with
std.stdio.File so that lines come out as UTF-8 regardless of
whether the file is ANSI or already UTF-8.
I don't want to convert the whole file to UTF-8: the CSV file can
be large and the conversion might take quite a while. I would also
have to do it on a temporary copy of the file to avoid touching
the user's original, which would make things even slower.
I thought of writing my own readLine() with
std.stdio.File.byChunk that takes as many bytes as possible until
a '\n' byte is seen, treats them as UTF-8 and returns the line.
But I'd like to not reinvent the wheel and use something native,
if possible. Any ideas?
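
For reference, a minimal sketch of the byChunk idea above, using only
Phobos. It assumes "ANSI" here means Windows-1252, which may not match
the user's actual code page. The names lineToUtf8 and eachLineUtf8 are
hypothetical helpers, not part of std.stdio. Each line is validated
with std.utf.validate, and only lines that fail validation are
transcoded with std.encoding.transcode.

```d
import std.algorithm.searching : countUntil;
import std.encoding : Windows1252String, transcode;
import std.stdio : File;
import std.utf : UTFException, validate;

/// Hypothetical helper: returns `raw` as a UTF-8 string.
/// Bytes that are not valid UTF-8 are ASSUMED to be Windows-1252.
string lineToUtf8(const(ubyte)[] raw)
{
    auto text = cast(const(char)[]) raw;
    try
    {
        validate(text);      // throws UTFException on invalid UTF-8
        return text.idup;    // already UTF-8, just copy
    }
    catch (UTFException)
    {
        string result;
        transcode(cast(Windows1252String) raw.idup, result);
        return result;
    }
}

/// Hypothetical helper: reads `f` in chunks, splits on '\n', and passes
/// each line (without the line terminator) to `sink` as UTF-8.
void eachLineUtf8(File f, scope void delegate(string) sink)
{
    ubyte[] pending;         // bytes read so far that do not yet form a full line
    foreach (ubyte[] chunk; f.byChunk(64 * 1024))
    {
        pending ~= chunk;    // copies, because byChunk reuses its buffer
        for (;;)
        {
            auto nl = pending.countUntil(cast(ubyte) '\n');
            if (nl < 0)
                break;
            auto line = pending[0 .. nl];
            if (line.length && line[$ - 1] == '\r')
                line = line[0 .. $ - 1];   // drop the CR of a CRLF pair
            sink(lineToUtf8(line));
            pending = pending[nl + 1 .. $];
        }
    }
    if (pending.length)
        sink(lineToUtf8(pending));         // last line without a trailing '\n'
}
```

It could be called as eachLineUtf8(File("data.csv", "rb"), (line) {
... }), where "data.csv" is a placeholder path. Deciding per line
avoids converting the whole file or making a temporary copy, but it is
a heuristic: a Windows-1252 line whose bytes happen to form valid
UTF-8 would be passed through unchanged.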