Always std.utf.validate, or rely on exceptions?
SimonN via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Thu Mar 2 08:20:30 PST 2017
Many functions in std.utf throw UTFException when we pass them
malformed UTF, and many functions in std.string throw
StringException. From this, I developed a habit of reading user
files like so, hoping that it traps all malformed UTF:
    try {
        // call D standard lib on string from file
    }
    catch (Exception e) {
        // treat file as bogus
        // log e.msg
    }
But std.string.stripRight!string calls std.utf.codeLength, which
never throws on malformed UTF. Instead, it asserts false on errors:
    ubyte codeLength(C)(dchar c) @safe pure nothrow @nogc
        if (isSomeChar!C)
    {
        static if (C.sizeof == 1)
        {
            if (c <= 0x7F) return 1;
            if (c <= 0x7FF) return 2;
            if (c <= 0xFFFF) return 3;
            if (c <= 0x10FFFF) return 4;
            assert(false);
        }
        // ...
    }
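To make the concern concrete, here is a minimal sketch of why the catch (Exception e) block above cannot trap this failure. It only checks the class hierarchy at compile time. In non-release builds a failed assert raises core.exception.AssertError; in -release builds assert(false) halts the program instead of throwing anything.

```d
import core.exception : AssertError;
import std.utf : UTFException;

// UTFException is an Exception, so catch (Exception e) traps it.
static assert(is(UTFException : Exception));

// A failed assert raises AssertError, which is an Error, not an Exception,
// so catch (Exception e) lets it pass through.
static assert(is(AssertError : Error));
static assert(!is(AssertError : Exception));
```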
Apparently, once my code calls stripRight, I should be sure that
this string contains only well-formed UTF. Right now, my code
doesn't guarantee that.
Should I always validate text from files manually with
std.utf.validate?
Or should I memorize which functions throw, then validate
manually whenever I call the non-throwing UTF functions? What is
the pattern behind what throws and what asserts false?
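For reference, the first option (validate once, up front) could look roughly like the sketch below. tryStrip is a hypothetical helper name, not part of Phobos, and the raw bytes stand in for whatever was read from the user's file. std.utf.validate throws UTFException on malformed input, so stripRight only ever sees well-formed UTF.

```d
import std.string : stripRight;
import std.utf : UTFException, validate;

// Hypothetical helper (not in Phobos): validates raw bytes as UTF-8 first,
// and only then hands the text to a function that assumes valid UTF.
bool tryStrip(const(ubyte)[] raw, out string result)
{
    auto text = cast(string) raw.idup;
    try
    {
        validate(text); // throws UTFException if text is malformed
    }
    catch (UTFException e)
    {
        return false;   // treat input as bogus; e.msg could be logged here
    }
    result = text.stripRight;
    return true;
}
```

With this shape, the decision to reject a file is made in one place, and no later call has to be wrapped in its own try block. Note that std.file.readText is documented to validate its result the same way.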
-- Simon