Ping: Daniel Keep
Daniel Keep
daniel.keep+lists at gmail.com
Sun Jan 7 14:36:42 PST 2007
Thomas Kuehne wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Daniel Keep wrote on 2007-01-07:
>
>>Sorry for the somewhat lacking typesetting job... I started putting
>>inline D code inside [[font][face=Courier]BLAH] blocks, but it quickly
>>got boring and I gave up :P
>>
>>(Also, I couldn't find a way to set blocks off to the side of the main
>>body. Curses!)
>>
>>http://www.prowiki.org/wiki4d/wiki.cgi?DanielKeep/TextInD
>
>
> The following sentence is incorrect:
> #
> # In fact, it does, but there's a teensy problem that some Unicode
> # 'enabled' editors have: they forget the Byte Order Mark.
> #
>
> D doesn't require a BOM, and the presence of BOMs is application/system
> defined (-> Unicode.org). More often than not the user simply used the
> "save as text" feature. Especially on MS Windows, most editors use some
> installation-dependent codepage instead of UTF if not asked explicitly to
> store the text as Unicode.
>
> The only situation where a BOM is required is documented here:
> http://d.puremagic.com/issues/show_bug.cgi?id=430
>
> Thomas
>
>
> -----BEGIN PGP SIGNATURE-----
>
> iD8DBQFFoUZjLK5blCcjpWoRAt3PAJ0dtLKauaYKra9WmBDmgibGDAQ7cQCeOEow
> exEjWMkWO5V2aEDO/LQ+vAY=
> =C2XW
> -----END PGP SIGNATURE-----
Thanks for the heads up. Is this accurate?
====
In fact, it does. There are two problems you might run into:
1. The editor you used may *support* Unicode, but didn't end up saving
in it. Go back and double-check that the file really is Unicode. How
you do this depends on your editor, but there's usually an option lying
around somewhere to set a file's character encoding.
2. The other is a bit obscure: if you save your source file in Unicode
without a Byte Order Mark and the first character is outside the ASCII
character range, D won't be able to read it properly.
If you ''do'' have a suspicious looking first character in your source
file, you can either stick a blank line in at the top of the file or
save the source file again with a Byte Order Mark.
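The two checks above can be roughly sketched in Python. This is only an illustration, and it assumes the file is meant to be UTF-8 (D also accepts UTF-16 and UTF-32 source, which this sketch does not handle). The function name check_source_bytes is made up for the example.

```python
import codecs

def check_source_bytes(data):
    """Return a list of warnings for the two problems described above.

    Assumption: the source file is supposed to be UTF-8.
    """
    warnings = []
    has_bom = data.startswith(codecs.BOM_UTF8)

    # Problem 1: the editor supports Unicode but saved in something else.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        warnings.append("not valid UTF-8; possibly saved in a legacy codepage")

    # Problem 2: no BOM and the first byte is outside the ASCII range.
    if not has_bom and data[:1] and data[0] > 0x7F:
        warnings.append("no BOM and first character is outside ASCII")

    return warnings
```

To try it on a real file, read the file in binary mode, e.g. open(path, "rb").read(), and pass the resulting bytes in.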
[What's a Byte Order Mark?]
The Byte Order Mark (or BOM) is a special character sequence that can
appear at the beginning of a UTF text file. It tells the application which
UTF encoding is being used, and in some cases what the byte order is (i.e.
Little Endian or Big Endian).
====
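To make the BOM description concrete, here is a small Python sketch that reports which BOM, if any, a chunk of bytes starts with. The byte signatures come from Python's standard codecs module; the names BOMS and detect_bom are just made up for this example.

```python
import codecs

# Longest signatures first: the UTF-32 LE BOM (FF FE 00 00) begins with
# the UTF-16 LE BOM (FF FE), so the order of these checks matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]

def detect_bom(data):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

A None result only means there is no BOM. It says nothing about whether the file is Unicode, since the BOM is optional.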
-- Daniel