std.utf.decode behaves unexpectedly - Bug?

BBaz via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Nov 6 12:00:42 PST 2015


Sorry, the forum has stripped my answer. Here is the full version:

On Friday, 6 November 2015 at 19:26:50 UTC, HeiHon wrote:
> Am I using std.utf.decode wrongly or is it buggy?

It's obviously used wrongly, try this instead:

---
import std.utf, std.stdio;

dstring do_decode(string txt)
{
     dstring result;
     try
     {
         size_t idx;
         writeln("decode ", txt);
         while (true)
         {
             result ~= std.utf.decode(txt, idx);
             if (idx == txt.length) break;
         }
     }
     catch(Exception e)
     {
         writeln(e.msg, " file=", e.file, " line=", e.line);
     }
     return result;
}

void main()
{
     writeln(do_decode("abc"));
     writeln(do_decode("åbc"));
     writeln(do_decode("aåb"));
}
---
In addition to what's been said in the other answers, there was 
another error:
the `for()` loop was counting code units while there are 
possibly fewer code points in `txt`, because a single call to 
`decode` can consume several code units. So instead you can use an 
infinite loop and break when `txt` is fully decoded.
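
To make the code unit / code point difference concrete, here is a minimal sketch. `offsetsAfterDecode` is a hypothetical helper written only for this illustration (it is not part of Phobos); it records where `idx` lands after each call to `decode`. It assumes the source file is saved as UTF-8.

```d
import std.stdio : writeln;
import std.utf : decode;

// Hypothetical helper, for illustration only: returns the value of
// the index after each call to decode.
size_t[] offsetsAfterDecode(string txt)
{
    size_t[] offsets;
    size_t idx = 0;
    while (idx < txt.length)
    {
        // decode returns one code point and advances idx past
        // every code unit that belongs to it
        immutable dchar c = decode(txt, idx);
        offsets ~= idx;
    }
    return offsets;
}

void main()
{
    string txt = "aåb";
    writeln(txt.length);              // 4: UTF-8 code units
    writeln(offsetsAfterDecode(txt)); // [1, 3, 4]: three code points
}
```

The 'å' takes two code units, so three calls to `decode` consume all four of them. Looping on `idx < txt.length` is a variation on the infinite loop above; it also copes with an empty `txt`, which the `while (true)` version would hand to `decode` with nothing left to decode.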

Alternatively, you could use the std.range primitives to decode, 
which can be considered a more idiomatic way of doing things, 
e.g.:

---
import std.utf, std.stdio, std.range;

dstring do_decode(string txt)
{
     dstring result;
     try
     {
         writeln("decode ", txt);
         // front decodes one code point; popFront advances past it
         while (!txt.empty)
         {
             result ~= txt.front;
             txt.popFront();
         }
     }
     catch(Exception e)
     {
         writeln(e.msg, " file=", e.file, " line=", e.line);
     }
     return result;
}

void main()
{
     writeln(do_decode("abc"));
     writeln(do_decode("åbc"));
     writeln(do_decode("aåb"));
}
---

This works because `front` auto-decodes its argument.
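
A small sketch of what that auto-decoding means in practice, assuming a UTF-8 source file. The use of `std.conv.to` at the end is my own addition rather than something from the thread: when only the complete conversion is wanted, it does the same decoding in one call.

```d
import std.conv : to;
import std.range : front, popFront;
import std.stdio : writeln;

void main()
{
    string txt = "åbc";

    // For a string, front yields a decoded dchar, not a char.
    static assert(is(typeof(txt.front) == dchar));
    assert(txt.front == '\u00E5');

    // popFront skips the whole code point, here two code units.
    txt.popFront();
    assert(txt == "bc");

    // Assumption: a one-call alternative to the manual loop.
    dstring all = "aåb".to!dstring;
    assert(all.length == 3);
    writeln(all);
}
```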

To finish, a hint: you can use the unit tests found in Phobos to 
learn how to use a particular function. Usually there are more 
of them than the single example shown in the ddoc documentation.
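
As an illustration of the kind of thing such tests show, here is a short `unittest` block for `decode`. It is written for this answer and not copied from Phobos, so treat it as a sketch of the style only.

```d
import std.exception : assertThrown;
import std.utf : decode, UTFException;

unittest
{
    // Valid input: decode returns the code point and advances the index.
    string s = "aåb";
    size_t idx = 1;
    assert(decode(s, idx) == '\u00E5');
    assert(idx == 3);

    // Invalid input: a lone lead byte with no continuation byte.
    char[] bad = [cast(char) 0xC3];
    size_t i = 0;
    assertThrown!UTFException(decode(bad, i));
}

void main() {}
```

Compiling with `dmd -unittest` runs the block before `main`.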

