UTFException when reading a file

Adam D. Ruppe destructionator at gmail.com
Fri Jan 11 20:04:58 UTC 2019


On Friday, 11 January 2019 at 19:45:05 UTC, Head Scratcher wrote:
> How can I read the file and convert the string into proper 
> UTF-8 in memory without an exception?

Use regular read() instead of readText, and then convert it using 
another function.

Phobos has std.encoding which offers a transcode function:

http://dpldocs.info/experimental-docs/std.encoding.transcode.html

You would cast the bytes you read to the input encoding's string type:

---
import std.encoding;
import std.file;

void main() {
    string s;
    // the read here replaces your readText
    // and the cast tells what encoding it has now
    transcode(cast(Latin1String) read("ooooo.d"), s);
    import std.stdio;
    // and after that, the utf-8 string is in s
    writeln(s);
}
---
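
If some of your files may already be valid UTF-8, one option is to validate first and only transcode on failure. The following is a minimal sketch of that idea, not something from Phobos: toUtf8OrLatin1 is a hypothetical helper name, and it assumes that anything which is not valid UTF-8 is Latin-1, which may not hold for your data.

```d
import std.encoding : Latin1String, transcode;
import std.stdio : writeln;
import std.utf : UTFException, validate;

// Hypothetical helper (not part of Phobos): keep the bytes if they are
// already valid UTF-8, otherwise assume Latin-1 and transcode them.
string toUtf8OrLatin1(immutable(ubyte)[] raw) {
    auto text = cast(string) raw;
    try {
        validate(text); // throws UTFException on malformed UTF-8
        return text;
    } catch (UTFException) {
        string converted;
        transcode(cast(Latin1String) raw, converted);
        return converted;
    }
}

void main() {
    // In a real program the bytes would come from std.file, e.g.
    // cast(immutable(ubyte)[]) read("ooooo.d")
    immutable(ubyte)[] latin1 = [0x63, 0x61, 0x66, 0xE9]; // "café" in Latin-1
    writeln(toUtf8OrLatin1(latin1));
}
```

The fallback is a guess, since many byte sequences are valid in more than one encoding. If you know the file's encoding up front, the direct cast shown above is simpler.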


Or, since I didn't like the Phobos module for my web scraping 
needs, I made my own:

https://github.com/adamdruppe/arsd/blob/master/characterencodings.d

Just drop that file in your build and call this function:

http://dpldocs.info/experimental-docs/arsd.characterencodings.convertToUtf8Lossy.html

---
import arsd.characterencodings;
import std.file;

void main() {
    // read returns void[]; the cast hands it over as raw bytes
    string s = convertToUtf8Lossy(cast(immutable(ubyte)[]) read("ooooo.d"), "iso_8859-1");
    // you can now use s
}
---

Just change the encoding string to whatever encoding the file 
actually uses.



But it is possible neither my module nor the Phobos one has the 
encoding you need...
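
In that case, for a single-byte encoding whose lower half matches ASCII, a table lookup is usually enough. Below is a rough sketch under those assumptions. decodeSingleByte is a hypothetical helper, not part of Phobos or arsd, and you would have to fill the 128-entry table from the specification of your encoding. The demo builds the Latin-1 table only because it is trivial to generate.

```d
import std.array : appender;
import std.exception : enforce;
import std.stdio : writeln;

// Hypothetical helper: bytes below 0x80 are taken as ASCII, and bytes
// 0x80 through 0xFF are looked up in a caller-supplied table.
string decodeSingleByte(const(ubyte)[] raw, const(dchar)[] highHalf) {
    enforce(highHalf.length == 128, "table must cover bytes 0x80 through 0xFF");
    auto result = appender!string();
    foreach (b; raw) {
        dchar c = b < 0x80 ? cast(dchar) b : highHalf[b - 0x80];
        result.put(c); // the appender encodes the code point as UTF-8
    }
    return result.data;
}

void main() {
    // Demo table: Latin-1 maps each byte to the code point of the same value.
    dchar[128] latin1Table;
    foreach (i, ref c; latin1Table)
        c = cast(dchar)(0x80 + i);

    writeln(decodeSingleByte([0x63, 0x61, 0x66, 0xE9], latin1Table[]));
}
```

This does not cover multi-byte encodings such as Shift JIS, which need a real decoder.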

