Error: 4invalid UTF-8 sequence :: How can I catch this?? (or otherwise handle it)

Charles Hixson charleshixsn at earthlink.net
Sun Nov 1 11:50:43 PST 2009


Charles Hixson wrote:
> Daniel Keep wrote:
>> Charles Hixson wrote:
>>> I want to read a bunch of files, and if they aren't UTF, then I want to
>>> list their names for conversion, or other processing. How should this
>>> be handled??
>>>
>>> try..catch..finally blocks just ignore this error.
>>
>>> type stuff.d
>> import std.stdio;
>> import std.utf;
>>
>> void main()
>> {
>>     try
>>     {
>>         writefln("A B \xfe C");
>>     }
>>     catch( UtfException e )
>>     {
>>         writefln("I caught a %s!", e);
>>     }
>> }
>>
>>> dmd stuff && stuff
>> A B I caught a 4invalid UTF-8 sequence!
>>
>> Works for me.
>
> Sorry, the error is on the read. The code I tried to use was:
>
> try { lin = fil.readLine; }
> catch
> { writefln("File <<" ~ filIter [curFilNdx] ~ ">> is not a valid UTF file.");
> fil.close;
> getLine;
> return;
> }
> finally
> { }
> debug (9) writefln ("lin = <<" ~ lin ~ ">>");
> try
> { validate (lin); }
> catch (UtfException ue)
> { writefln ("File <<" ~ filIter [curFilNdx] ~ ">> is not a valid UTF file.");
> fil.close;
> getLine;
> return;
> }
>
> where fil is a File and getLine is one of my routines that automatically
> switches to the next file if the current file has been closed.

For some reason, when I explicitly put (UtfException ue) on the catch
statement that I'd been trying to use to catch everything (i.e., just a
blank catch), it works.

I'm not sure whether I misunderstand how the unlabeled catch works in D, 
or whether something really strange is going on.  The documentation 
seems to say that an unlabeled catch statement catches everything, but 
it doesn't catch the UtfException.  When the UtfException is explicitly
listed, it works.  (Admittedly, I altered the code a lot, trying lots of
different things, before I tried just using an explicit
    catch (UtfException ue)
on that statement.)

What I finally ended up with that worked was:
   while (!curFile.eof)
   {  ...
      try
      {  s  =  curFile.readLine;
         std.utf.validate (s);
      }
      catch  (UtfException ue)
      {  writef ("\n  err at <<" ~ fileName ~ ">>line "
                  ~ std.string.toString (line));
         if (++errs > 3)
         {  writefln ("\ntoo many errs");
            break;
         }
         }
      }
   }
with curFile a std.File.  I don't know whether a BufferedFile would have 
worked.
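
For comparison, here is a minimal sketch of the same "list the files that
aren't valid UTF" idea, assuming a current D2 compiler and Phobos rather
than the D1 library used above.  In D2 the exception is spelled
UTFException (not UtfException), and the helper name isValidUtf8 is
hypothetical, chosen only for this illustration.  It validates each file
in one pass instead of line by line, so it reports only whether a file is
bad, not which line.

```d
import std.file : read;
import std.stdio : writefln;
import std.utf : validate, UTFException;

// Hypothetical helper: true if the whole file decodes as UTF-8.
bool isValidUtf8(string path)
{
    // Read the raw bytes without decoding; nothing is validated yet.
    auto text = cast(const(char)[]) read(path);
    try
    {
        validate(text);   // throws UTFException on a malformed sequence
        return true;
    }
    catch (UTFException e)
    {
        // Naming the exception type explicitly, as in the loop above.
        return false;
    }
}

void main(string[] args)
{
    // Treat every command-line argument as a file to check.
    foreach (path; args[1 .. $])
    {
        if (!isValidUtf8(path))
            writefln("File <<%s>> is not a valid UTF file.", path);
    }
}
```

Run as, for example, "checkutf a.txt b.txt" (the program name is
arbitrary); only the files that fail validation are listed.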

