UTFException when reading a file
Adam D. Ruppe
destructionator at gmail.com
Fri Jan 11 20:04:58 UTC 2019
On Friday, 11 January 2019 at 19:45:05 UTC, Head Scratcher wrote:
> How can I read the file and convert the string into proper
> UTF-8 in memory without an exception?
Use regular read() instead of readText, and then convert it using
another function.
Phobos has std.encoding which offers a transcode function:
http://dpldocs.info/experimental-docs/std.encoding.transcode.html
You would cast the bytes to the string type of the input encoding:
---
import std.encoding;
import std.file;
void main() {
    string s;
    // the read here replaces your readText
    // and the cast tells what encoding it has now
    transcode(cast(Latin1String) read("ooooo.d"), s);
    import std.stdio;
    // and after that, the utf-8 string is in s
    writeln(s);
}
---
Or, since I didn't like the Phobos module for my web scraping
needs, I made my own:
https://github.com/adamdruppe/arsd/blob/master/characterencodings.d
Just drop that file in your build and call this function:
http://dpldocs.info/experimental-docs/arsd.characterencodings.convertToUtf8Lossy.html
---
import arsd.characterencodings;
import std.file;
void main() {
    string s = convertToUtf8Lossy(read("ooooo.d"), "iso_8859-1");
    // you can now use s
}
---
Just change the encoding string to whatever encoding the file
is actually in.
But it is possible neither my module nor the Phobos one has the
encoding you need...
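If you don't know ahead of time whether a given file is already
valid UTF-8, one option is to validate it first and only transcode
when that fails. Here is a minimal sketch of that idea using only
Phobos. It assumes that anything which isn't valid UTF-8 is Latin-1,
which may well be wrong for your files, and readAsUtf8 is just a
name I made up for illustration, not a Phobos function:
---
import std.encoding : Latin1String, transcode;
import std.file : read;
import std.utf : UTFException, validate;

// Hypothetical helper: returns the file contents as UTF-8.
// Assumption: input that is not valid UTF-8 is treated as Latin-1.
string readAsUtf8(string filename) {
    // read() returns a freshly allocated void[], so this cast is safe
    auto bytes = cast(immutable(ubyte)[]) read(filename);
    try {
        auto text = cast(string) bytes;
        validate(text); // throws UTFException on invalid UTF-8
        return text;
    } catch (UTFException) {
        // fall back: reinterpret the same bytes as Latin-1 and convert
        string converted;
        transcode(cast(Latin1String) bytes, converted);
        return converted;
    }
}

void main() {
    import std.stdio;
    writeln(readAsUtf8("ooooo.d"));
}
---
Since every byte sequence is a valid Latin-1 string, the fallback
never throws, but it gives you garbage if the file was really in
some other encoding.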