How to decode UTF-8 text?

kdevel kdevel at vogtner.de
Wed Mar 27 19:16:21 UTC 2019


On Wednesday, 27 March 2019 at 13:39:07 UTC, Andrey wrote:
> I have got some text with UTF-8. For example this part:
>> <title>Παράλληλη αναζήτηση</title>

This looks like UTF-8 text whose bytes were read as an 8-bit 
character set and then UTF-8 encoded a second time.

> How to decode it to get this result?
>> <title>Παράλληλη αναζήτηση</title>

Undo the second UTF-8 encoding by transcoding the UTF-8 into the 
8-bit character set (latin1, windows-1252 etc.) that was used for 
the misreading, which you have to guess. The bytes you get back 
are the original UTF-8.

> I have tried functions like "decode", "byUTF", "to!wchar"... 
> but no success.
>
> Input string is correct - checked it with 
> "https://www.browserling.com/tools/utf8-decode".

```decode.d
import std.stdio;
import std.encoding;

void main ()
{
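    // Double-encoded input, kept on one line. The invisible characters
    // (NBSP and the C1 controls U+0081, U+0084, U+0083) are escaped.
    string src = "<title>Î\u00a0αÏ\u0081άλληλη αναζήÏ\u0084ηÏ\u0083η</title>";
    Latin1String ls;
    // Map every code point back to the single Latin-1 byte it came from.
    transcode (src, ls);
    // Those bytes are the original UTF-8, so reinterpret them as a string.
    string targ = cast (string) ls;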
    targ.writeln;
}
```
$ ./decode
<title>Παράλληλη αναζήτηση</title>
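
To make the mechanism more explicit, here is a minimal sketch of the 
same reversal done by hand, without std.encoding. It assumes the 8-bit 
character set was latin1 (windows-1252 would need a lookup table for 
0x80..0x9F instead of the plain cast). The function name 
undoDoubleEncoding is made up for this example.

```d
import std.exception : enforce;
import std.stdio;
import std.utf : validate;

// Hypothetical helper: turn every code point back into the one byte
// it was created from, then check that the bytes form valid UTF-8.
string undoDoubleEncoding (string src)
{
    char[] bytes;
    foreach (dchar c; src)      // iterating as dchar decodes the UTF-8
    {
        // Anything above 0xFF cannot have come from a Latin-1 byte.
        enforce (c <= 0xFF, "not Latin-1, try another 8-bit charset");
        bytes ~= cast (char) c; // append the raw byte, no re-encoding
    }
    validate (bytes);           // throws UTFException on invalid UTF-8
    return bytes.idup;
}

void main ()
{
    // "Πα" (bytes CE A0 CE B1) after the accidental second encoding
    string src = "\u00ce\u00a0\u00ce\u00b1";
    assert (undoDoubleEncoding (src) == "Πα");
    undoDoubleEncoding (src).writeln;
}
```

If the guess is wrong, either the enforce or the validate call should 
throw, which is a hint to try a different 8-bit character set.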

