What library functionality would you most like to see in D?
Jonathan M Davis
jmdavisProg at gmx.com
Sun Aug 7 03:21:05 PDT 2011
On Sunday 07 August 2011 14:08:06 Dmitry Olshansky wrote:
> On 07.08.2011 12:09, Mehrdad wrote:
> > A readText() function that would read a text file (**and** autodetect
> > its encoding from its BOM) would be of great help.
>
> Well the name is here, dunno if it meets your expectations:
> http://d-programming-language.org/phobos/std_file.html#readText
D (and Phobos in general) assumes that char is UTF-8, wchar is UTF-16, and
dchar is UTF-32. You're going to get an exception thrown pretty quickly if
you're trying to use those types with values that don't match those encodings.
As such, readText assumes that the file is in the encoding of the character
type that it's instantiated with. So, if you try to read in a file which
doesn't match the encoding of the character type that you're using (which is
char by default), you're going to get a UtfException.
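To make that concrete, here is a minimal sketch. It assumes a current Phobos,
where the exception is spelled UTFException, and readTextRejects is a
hypothetical helper written for this example, not a Phobos function. It
writes some bytes to a temporary file and reports whether readText refuses
them.

```d
import std.file : readText, remove, tempDir, write;
import std.path : buildPath;
import std.utf : UTFException;

/// Hypothetical helper (not in Phobos): writes `data` to a temporary file
/// and returns true if readText rejects its contents as invalid UTF-8.
bool readTextRejects(const(ubyte)[] data)
{
    auto path = buildPath(tempDir(), "readtext_sample.txt");
    write(path, data);
    scope (exit) remove(path);

    try
    {
        // readText defaults to string, so the file is validated as UTF-8.
        auto text = readText(path);
        return false;
    }
    catch (UTFException e)
    {
        return true;
    }
}

unittest
{
    // "caf" followed by 0xE9, which is 'é' in Latin-1 but a truncated
    // multi-byte sequence in UTF-8.
    assert(readTextRejects([0x63, 0x61, 0x66, 0xE9]));

    // Plain ASCII is valid UTF-8 and is accepted.
    assert(!readTextRejects([0x63, 0x61, 0x66]));
}
```

Something like dmd -unittest -main -run on the file should run the checks.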
What Mehrdad wants is a way to read in a file with an encoding other than
UTF-8, UTF-16, or UTF-32, have it autodetect the encoding by reading the file's
BOM, and then convert it to whatever encoding the character type that readText
is instantiated with uses. readText doesn't currently do anything of the sort.
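As a rough sketch of the BOM half of that request, something along these
lines could be written by hand today. All of the names here (hasPrefix,
toCodeUnits, decodeWithBOM, readTextDetectBOM) are hypothetical and not part
of Phobos. It only recognizes the UTF-8, UTF-16, and UTF-32 BOMs, falls back
to UTF-8 when there is no BOM, and does nothing for non-Unicode encodings.

```d
import std.conv : to;
import std.file : read;
import std.utf : validate;

// Hypothetical helpers for illustration; none of these exist in Phobos.

/// True if `data` begins with the bytes in `prefix`.
bool hasPrefix(const(ubyte)[] data, const(ubyte)[] prefix)
{
    return data.length >= prefix.length && data[0 .. prefix.length] == prefix;
}

/// Assembles code units of type C from raw bytes in the given byte order.
/// Trailing bytes that do not fill a whole code unit are ignored.
C[] toCodeUnits(C)(const(ubyte)[] bytes, bool bigEndian)
{
    C[] units;
    for (size_t i = 0; i + C.sizeof <= bytes.length; i += C.sizeof)
    {
        uint value = 0;
        foreach (k; 0 .. C.sizeof)
        {
            // In big-endian order the most significant byte comes first.
            immutable shift = 8 * (bigEndian ? C.sizeof - 1 - k : k);
            value |= cast(uint) bytes[i + k] << shift;
        }
        units ~= cast(C) value;
    }
    return units;
}

/// Picks a decoder from the BOM and returns the text as UTF-8.
/// Throws a UTFException if the data is not valid in the detected encoding.
string decodeWithBOM(const(ubyte)[] bytes)
{
    // The UTF-32 checks come before UTF-16 because the UTF-32LE BOM starts
    // with the UTF-16LE one. (UTF-16LE text whose first character is
    // U+0000 would be misdetected; this sketch accepts that ambiguity.)
    if (hasPrefix(bytes, [0xFF, 0xFE, 0x00, 0x00]))
        return to!string(toCodeUnits!dchar(bytes[4 .. $], false));
    if (hasPrefix(bytes, [0x00, 0x00, 0xFE, 0xFF]))
        return to!string(toCodeUnits!dchar(bytes[4 .. $], true));
    if (hasPrefix(bytes, [0xFF, 0xFE]))
        return to!string(toCodeUnits!wchar(bytes[2 .. $], false));
    if (hasPrefix(bytes, [0xFE, 0xFF]))
        return to!string(toCodeUnits!wchar(bytes[2 .. $], true));

    // UTF-8 with a BOM, or no BOM at all (assumed to be UTF-8).
    auto start = hasPrefix(bytes, [0xEF, 0xBB, 0xBF]) ? 3 : 0;
    auto text = cast(const(char)[]) bytes[start .. $];
    validate(text);
    return text.idup;
}

/// File-based wrapper around decodeWithBOM.
string readTextDetectBOM(string path)
{
    return decodeWithBOM(cast(const(ubyte)[]) read(path));
}

unittest
{
    // "hi" in UTF-16LE, UTF-16BE, and UTF-8, each with its BOM.
    assert(decodeWithBOM([0xFF, 0xFE, 0x68, 0x00, 0x69, 0x00]) == "hi");
    assert(decodeWithBOM([0xFE, 0xFF, 0x00, 0x68, 0x00, 0x69]) == "hi");
    assert(decodeWithBOM([0xEF, 0xBB, 0xBF, 0x68, 0x69]) == "hi");
}
```

Always returning string keeps the sketch short. A version that fit into
readText would presumably be templated on the string type the way readText
is.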
At this point, dealing with anything which has an encoding other than UTF-8,
UTF-16, or UTF-32 is problematic in D. std.encoding helps, but it's not
necessarily all that good (Andrei considers it a failed experiment which
either needs to be redesigned or removed). So, one of the things that still
needs to be figured out for Phobos is how to better handle encodings other than
UTF-8, UTF-16, and UTF-32. For the most part, other encodings are likely to be
dealt with only when reading or writing I/O while UTF-8, UTF-16, and UTF-32
are dealt with inside of D programs, but we still need to fix things so that we
can readily deal with I/O that isn't UTF-8, UTF-16, or UTF-32.
- Jonathan M Davis