read files ... continued

Jan Hanselaer jan.hanselaer at gmail.com
Sun May 13 03:35:40 PDT 2007


Whoops ... sent before I was done writing ... sorry

Hi

I'm writing an application that reads all kinds of text files.
I'm not really familiar with the different file types (encodings).
For the moment I read them with a BufferedFile.
I read the lines with readLine():

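import std.stream;  // provides Stream and BufferedFile

Stream br = new BufferedFile(fileName);
char[] line = br.readLine();  // the bytes are not checked or converted here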

But that causes a lot of trouble. I managed to figure out how to read a
file's BOM, so I presume it will also be possible to convert the files to one
encoding (UTF-8, for example) that I always use. (I'll check that later.)
But for a lot of files, checking the BOM gives the result -1 (meaning the
type is not known).
http://www.digitalmars.com/d/phobos/std_stream.html
The only known BOM types are listed there (UTF-8, UTF-16 and UTF-32, LE or BE).
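
For reference, the BOM check amounts to comparing the first bytes of a file
against a few fixed signatures. The following is only a minimal sketch,
assuming the file contents are already in a byte array; detectBom is a
hypothetical helper name, not something from std.stream. A file without a BOM
(plain ANSI, or UTF-8 saved without one) falls through to "unknown", which
corresponds to the -1 result described above.

```d
// Hypothetical helper: names the encoding implied by a leading BOM,
// or returns "unknown" when the data does not start with one.
string detectBom(const(ubyte)[] d)
{
    // The 4-byte UTF-32 marks are tested before the 2-byte UTF-16 marks,
    // because the UTF-32 LE mark begins with the UTF-16 LE mark.
    if (d.length >= 4 && d[0] == 0xFF && d[1] == 0xFE && d[2] == 0x00 && d[3] == 0x00)
        return "UTF-32LE";
    if (d.length >= 4 && d[0] == 0x00 && d[1] == 0x00 && d[2] == 0xFE && d[3] == 0xFF)
        return "UTF-32BE";
    if (d.length >= 3 && d[0] == 0xEF && d[1] == 0xBB && d[2] == 0xBF)
        return "UTF-8";
    if (d.length >= 2 && d[0] == 0xFF && d[1] == 0xFE)
        return "UTF-16LE";
    if (d.length >= 2 && d[0] == 0xFE && d[1] == 0xFF)
        return "UTF-16BE";
    return "unknown";
}
```

With current Phobos the bytes could come from something like
cast(const(ubyte)[]) std.file.read(fileName), but that is a different module
from the std.stream classes used above.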

For a lot of text files on my system (Windows) the type is ANSI, and there's
no problem reading them with BufferedFile as long as they contain no special
characters.
But if there's an accent or something (for example 'é'), then it's an
invalid UTF sequence. I cannot convert the text because the BOM for these
files is also unknown.

Does anyone have an idea how to catch this sort of file (and convert it)? Or
is there a stream that takes the file type into account by itself? That would
be very handy ...
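
One possible way to catch such files, as a rough sketch only: validate the
bytes as UTF-8 and, if that fails, treat them as Windows-1252 and transcode.
The assumptions here are mine: that "ANSI" means Windows-1252 (the usual case
on Western European Windows, but not guaranteed), that the file has no UTF-16
or UTF-32 BOM, and that the std.utf and std.encoding modules of current Phobos
are available. toUtf8 is a hypothetical helper name.

```d
import std.encoding : Windows1252String, transcode;
import std.utf : validate, UTFException;

// Hypothetical helper: returns the bytes as a UTF-8 string.
// Valid UTF-8 is passed through unchanged; anything else is ASSUMED
// to be Windows-1252 and converted.
string toUtf8(const(ubyte)[] raw)
{
    auto text = cast(const(char)[]) raw;
    try
    {
        validate(text);   // throws UTFException on an invalid UTF-8 sequence
        return text.idup;
    }
    catch (UTFException e)
    {
        // Reinterpret the same bytes as Windows-1252 characters and
        // let std.encoding convert them to UTF-8.
        string result;
        transcode(cast(Windows1252String) raw.idup, result);
        return result;
    }
}
```

This is a guess, not real detection: a Windows-1252 file can happen to be
valid UTF-8, and a file in some other single-byte code page would be converted
to the wrong characters. A leading UTF-8 BOM, if present, is kept in the
result.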

It's an application I wrote in Java that I'm now trying in D. In Java I used
a BufferedReader on a FileReader and there everything goes well. Sometimes
files are not read correctly, but there are no errors like this invalid UTF
sequence in D.

If someone understands my problem out of all this confusing talk (that's
because I'm rather confused myself) ... I'd be glad :p

Thanks!




