Read non-UTF8 file

spir denis.spir at gmail.com
Sat Feb 19 11:19:24 PST 2011


On 02/19/2011 02:42 PM, Nrgyzer wrote:
> == Quote from Stewart Gordon (smjg_1998 at yahoo.com)'s article
>> On 13/02/2011 21:49, Nrgyzer wrote:
>> <snip>
>>> It compiles and works as long as the returned char-array/string of f.readLine() doesn't
>>> contain non-UTF8 character(s). If it contains such chars, writeln() doesn't write
>>> anything to the console. Is there any chance to read such files?
>> Please post sample input that shows the problem, and the output generated by replacing the
>> writeln call with
>>       writefln("%s", cast(ubyte[]) convertToUTF8(f.readLine()));
>> so that we can see what it is actually reading in.
>> Stewart.
>
> My file contains the following:
>
>>>>
> Now... and with writefln("%s", cast(ubyte[]) convertToUTF8(f.readLine())); I get the following:
>
> [195, 131, 164]
> [195, 131, 182]
> [195, 131, 188]

At first sight, I find your input strange. Actually, it looks like UTF-8 (195 
is common when representing converted Latin text). But having (195, 131), 
which is the code for 'Ã', three times in a row is weird.
What is your source text, what is its encoding, and where does it come from? 
Why don't you /start/ by telling us about that?
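
To make the point above concrete, here is a minimal sketch that decodes the three byte sequences you posted. It assumes a reasonably recent Phobos where std.utf provides decode and UTFException; the array name `samples` is only illustrative. In each line the first two bytes (195, 131) form the code point U+00C3, and the third byte is a continuation byte with no lead byte before it, so I would expect the second decode call to throw.

```d
import std.stdio;
import std.utf : decode, UTFException;

void main()
{
    // Byte sequences copied from the quoted output (hypothetical variable name).
    immutable ubyte[][] samples = [
        [195, 131, 164],
        [195, 131, 182],
        [195, 131, 188],
    ];

    foreach (bytes; samples)
    {
        // Reinterpret the raw bytes as a string; the cast does no validation.
        auto s = cast(string) bytes;
        size_t i = 0;

        // Decode the first code point; decode() advances i past it.
        dchar c = decode(s, i);
        writefln("first code point: U+%04X, bytes consumed: %s", cast(uint) c, i);

        // Try to decode what is left (the lone third byte).
        try
        {
            dchar rest = decode(s, i);
            writefln("next code point: U+%04X", cast(uint) rest);
        }
        catch (UTFException e)
        {
            writefln("byte %s is not a valid UTF-8 sequence start: %s", bytes[i], e.msg);
        }
    }
}
```

If that is what happens on your side too, the data printed is not well-formed UTF-8, which would fit with writeln() refusing to output it.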

Denis
-- 
_________________
vita es estrany
spir.wikidot.com


