std.stream, BOM, and deprecation
Ali Çehreli
acehreli at yahoo.com
Sat Oct 13 19:15:32 PDT 2012
On 10/13/2012 06:53 PM, Charles Hixson wrote:
> If std.stream is being deprecated, what is the correct way to deal with
> file BOMs? This is particularly concerning utf8 files, which I
> understand to be a bit problematic, as there isn't, actually, a utf8
> BOM,
That's correct. There is just one byte order for UTF-8.
> merely a convention which isn't a part of a standard.
I am not sure about that. The Unicode standard describes UTF-8 as code
units following each other in the file. There can't be any confusion
about their order. According to Wikipedia, the only use of BOM for UTF-8
is to identify the file as having been encoded in UTF-8:
http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
But that can't be a reliable signal. The file could just as well have
been encoded in one of the multitude of code pages, in many of which the
bytes EF BB BF are ordinary characters. Treating the first three bytes
as a BOM would be taking a chance in that case, and would drop those
three characters.
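
To make that concrete, here is a minimal sketch of what "treating the first three bytes as BOM" amounts to. The helper name stripUtf8Bom is made up for illustration and is not part of Phobos; the only fact it relies on is that U+FEFF encodes to the bytes EF BB BF in UTF-8. Note that it cannot tell a real BOM from a code-page file that happens to begin with those three bytes, which is exactly the risk described above.

```d
import std.stdio : writeln;

// The three bytes that encode U+FEFF in UTF-8.
immutable ubyte[3] utf8Bom = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper (not a Phobos function): returns `data`
/// without a leading UTF-8 BOM, or `data` unchanged if there is none.
const(ubyte)[] stripUtf8Bom(const(ubyte)[] data)
{
    if (data.length >= utf8Bom.length
        && data[0 .. utf8Bom.length] == utf8Bom[])
    {
        return data[utf8Bom.length .. $];
    }
    return data;
}

void main()
{
    // "hi" with and without a leading BOM.
    const(ubyte)[] withBom = [0xEF, 0xBB, 0xBF, 0x68, 0x69];
    const(ubyte)[] without = [0x68, 0x69];

    writeln(cast(const(char)[]) stripUtf8Bom(withBom));
    writeln(cast(const(char)[]) stripUtf8Bom(without));
}
```

With a real file, the bytes could come from std.file.read, which returns void[] and would need a cast to const(ubyte)[] first.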
> But the
> std.stdio documentation doesn't so much as mention byte order marks
> (BOMs).
>
> If this should wait until std.io is released, then I could use
> std.stream until then, but the documentation is already warning to avoid
> using it.
As I understand it, it is all down to convention anyway. What is the
meaning of the non-ASCII code 166? Only the generator of the file knows. :/
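
A small sketch of the ambiguity, using two interpretations I am sure of: under Latin-1 every byte value is its own code point, so 166 reads as U+00A6, while under UTF-8 a lone 166 (0xA6) is a continuation byte and not valid text at all. The helper isValidUtf8 is a name made up for this example; it just wraps std.utf.validate, which throws UTFException on malformed input.

```d
import std.exception : collectException;
import std.stdio : writefln;
import std.utf : UTFException, validate;

/// Hypothetical helper: true if `raw` is well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] raw)
{
    // validate() throws UTFException on malformed input;
    // collectException returns null when nothing was thrown.
    auto text = cast(const(char)[]) raw;
    return collectException!UTFException(validate(text)) is null;
}

void main()
{
    const(ubyte)[] raw = [166];

    // Latin-1 convention: the byte value is the code point.
    dchar asLatin1 = raw[0];
    writefln("as Latin-1: U+%04X", cast(uint) asLatin1);

    // UTF-8 convention: is the same byte well-formed on its own?
    writefln("valid UTF-8 on its own: %s", isValidUtf8(raw));
}
```

Other code pages assign yet other characters to the same byte, and nothing in the file itself says which reading was intended.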
Ali