Read non-UTF8 file
spir
denis.spir at gmail.com
Sat Feb 19 11:19:24 PST 2011
On 02/19/2011 02:42 PM, Nrgyzer wrote:
> == Quote from Stewart Gordon (smjg_1998 at yahoo.com)'s article
>> On 13/02/2011 21:49, Nrgyzer wrote:
>> <snip>
>>> It compiles and works as long as the returned char-array/string of f.readLine() doesn't
>>> contain non-UTF8 character(s). If it contains such chars, writeln() doesn't write
>>> anything to the console. Is there any chance to read such files?
>> Please post sample input that shows the problem, and the output generated by replacing the
>> writeln call with
>> writefln("%s", cast(ubyte[]) convertToUTF8(f.readLine()));
>> so that we can see what it is actually reading in.
>> Stewart.
>
> My file contains the following:
>
> �
> �
> �
>
> Now... and with writefln("%s", cast(ubyte[]) convertToUTF8(f.readLine())); I get the following:
>
> [195, 131, 164]
> [195, 131, 182]
> [195, 131, 188]
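At first sight, I find your input strange. It does look like UTF-8 (195, i.e.
0xC3, is the usual lead byte when accented Latin text is converted to UTF-8).
But seeing the pair (195, 131), which is the UTF-8 encoding of 'Ã', three times
in a row, each time followed by a single extra byte, is weird.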
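
To make that reading of the bytes concrete, here is a minimal sketch in D. It
only uses std.utf's decode and validate on the three sequences quoted above; it
does not reproduce convertToUTF8, whose code is not shown in this thread. The
last step is purely a guess on my part: it assumes the file might really hold
UTF-8 pairs of the form (195, x) and shows what those would decode to.

```d
import std.stdio : writefln;
import std.utf : decode, validate, UTFException;

void main()
{
    // The three byte sequences quoted above.
    immutable(ubyte)[][] lines = [
        [195, 131, 164],
        [195, 131, 182],
        [195, 131, 188],
    ];

    foreach (bytes; lines)
    {
        string text = cast(string) bytes;

        // Decode only the first code point; i ends up just past it.
        size_t i = 0;
        dchar first = decode(text, i);

        // Check whether the whole sequence is well-formed UTF-8.
        bool valid = true;
        try
            validate(text);
        catch (UTFException)
            valid = false;

        writefln("%s -> U+%04X, %s stray byte(s), valid UTF-8: %s",
                 bytes, cast(uint) first, text.length - i, valid);

        // Assumption (a guess, not something the poster confirmed): the
        // intended sequence was the two bytes (195, last byte).
        immutable(ubyte)[] pair = [cast(ubyte) 195, bytes[2]];
        string guess = cast(string) pair;
        size_t j = 0;
        dchar c = decode(guess, j);
        writefln("  (195, %s) would decode to U+%04X '%s'",
                 bytes[2], cast(uint) c, c);
    }
}
```

In each case the first two bytes decode to U+00C3 ('Ã') and the third byte is
left over as a lone continuation byte, so none of the three sequences is valid
UTF-8 as it stands. That is why the answers to the questions below matter.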
What is your source text, what is its encoding, and where does it come from?
Why don't you /start/ by telling us about that?
Denis
--
_________________
vita es estrany
spir.wikidot.com
More information about the Digitalmars-d-learn
mailing list