regarding Latin1 to UTF8 encoding

Hugo Florentino hugo at acdam.cu
Sun Dec 8 18:39:57 PST 2013


Hi,

I am having some problems trying to apply regular expressions to a
webpage encoded in Latin1. I have tried, unsuccessfully, to convert it
to UTF8 before applying the regular expression.

Initially I tried to do something like this:

auto input = readText("myfile.htm");
auto output = replace(input, re1, re2);

But I got this error when trying to run the application:
std.utf.UTFException@C:\DMD2\Windows\bin\..\..\src\phobos\std\utf.d(1113): Invalid UTF-8 sequence (at index 1)

I then tried this, but the error remained:

auto input = readText("myfile.htm");
string buffer;
transcode(input, buffer);
auto output = replace(buffer, re1, re2);

Reading the raw bytes and casting them to a string did not work either; it fails with a different error:

auto input = cast(string) read("myfile.htm");
string buffer;
transcode(input, buffer);
auto output = replace(buffer, re1, re2);

core.exception.AssertError@std.encoding(1995): Assertion failure

Please, any help would be appreciated.

Regards, Hugo