invalid utf-8 sequence

Jarrett Billingsley jarrett.billingsley at gmail.com
Tue Jan 6 17:11:39 PST 2009


On Tue, Jan 6, 2009 at 8:04 PM, james <Jamesg4 at gmail.com> wrote:
> I'm writing an indexer, but I'm having a problem: on some files, reading gives this error
>
> Error 4: invalid UTF-8 sequence
>
> Is there a way to fix it?
>

You're probably reading a file that's encoded in some non-Unicode
encoding, like Latin-1.  You could read in the file data as byte[]
(or ubyte[]) instead of as char[], which avoids the error, but that
still doesn't deal with the problem that your file contains bytes
outside the ASCII range.  If you know what encoding your file uses,
you could transcode it into valid UTF-8, or you could just ignore
bytes outside the ASCII range :P
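
Here's a rough sketch of both options, assuming the file really is
Latin-1 (ISO-8859-1) and a current D2 compiler.  The function names
latin1ToUtf8 and stripNonAscii are made up for illustration; they are
not Phobos functions.  The conversion relies on the fact that every
Latin-1 byte value is the same as its Unicode code point.

```d
/// Transcode Latin-1 bytes to UTF-8 (hypothetical helper).
/// Bytes below 0x80 are plain ASCII and pass through unchanged;
/// bytes 0x80..0xFF become a two-byte UTF-8 sequence.
string latin1ToUtf8(const(ubyte)[] data)
{
    char[] result;
    result.reserve(data.length);
    foreach (b; data)
    {
        if (b < 0x80)
        {
            result ~= cast(char) b;
        }
        else
        {
            // Lead byte carries the top 2 bits, continuation byte the low 6.
            result ~= cast(char)(0xC0 | (b >> 6));
            result ~= cast(char)(0x80 | (b & 0x3F));
        }
    }
    return result.idup;
}

/// Drop every byte outside the ASCII range (hypothetical helper).
/// This loses data: accented letters and the like simply disappear.
string stripNonAscii(const(ubyte)[] data)
{
    char[] result;
    foreach (b; data)
    {
        if (b < 0x80)
            result ~= cast(char) b;
    }
    return result.idup;
}
```

To use either one, read the file as raw bytes first, for example with
cast(const(ubyte)[]) std.file.read(name), and pass the result in.  If
the file is in some other encoding, the Latin-1 mapping above will
produce valid UTF-8 but the wrong characters.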

