What library functionality would you most like to see in D?

Jonathan M Davis jmdavisProg at gmx.com
Sun Aug 7 03:21:05 PDT 2011


On Sunday 07 August 2011 14:08:06 Dmitry Olshansky wrote:
> On 07.08.2011 12:09, Mehrdad wrote:
> > A readText() function that would read a text file (**and** autodetect
> > its encoding from its BOM) would be of great help.
> 
> Well the name is here, dunno if it meets your expectations:
> http://d-programming-language.org/phobos/std_file.html#readText

D (and Phobos in general) assumes that char is UTF-8, wchar is UTF-16, and 
dchar is UTF-32. You're going to get an exception thrown pretty quickly if 
you try to use those types with values that don't match those encodings. 
As such, readText assumes that the file is in the encoding of the character 
type that it's instantiated with. So, if you try to read in a file which 
doesn't match the encoding of the character type that you're using 
(which is char by default), you're going to get a UtfException.
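
To make that concrete, here is a minimal sketch of readText rejecting a file 
that isn't valid UTF-8. The helper names and the scratch file name are 
made up for illustration, and it assumes a Phobos version where the exception 
is spelled UTFException (the name used by current releases).

```d
import std.file : readText, remove, write;
import std.string : representation;
import std.utf : UTFException;

/// Returns true if readText (instantiated with string, i.e. UTF-8)
/// rejects the file at `path`. Hypothetical helper, not part of Phobos.
bool rejectsAsUtf8(string path)
{
    try
    {
        auto text = readText(path);
        return false;
    }
    catch (UTFException e)
    {
        return true;
    }
}

/// Writes `bytes` to a scratch file, checks it, and removes the file again.
/// The file name is an arbitrary choice made for this sketch.
bool bytesRejected(const(ubyte)[] bytes)
{
    enum path = "readtext_sample.tmp";
    write(path, bytes);
    scope (exit) remove(path);
    return rejectsAsUtf8(path);
}

unittest
{
    // "café" stored as UTF-8 is accepted.
    assert(!bytesRejected("caf\u00E9".representation));

    // "café" stored as Latin-1: a lone 0xE9 byte is not valid UTF-8.
    immutable ubyte[] latin1 = [0x63, 0x61, 0x66, 0xE9];
    assert(bytesRejected(latin1));
}
```

Compiling with -unittest runs the two checks.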

What Mehrdad wants is a way to read in a file with an encoding other than 
UTF-8, UTF-16, or UTF-32, have it autodetect the encoding by reading the file's 
BOM, and then convert it to whatever encoding the character type that 
readText is instantiated with uses. readText doesn't currently do anything of 
the sort. 
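
For illustration, BOM-based detection followed by conversion to UTF-8 could 
look roughly like the sketch below. Everything named here (Detected, 
detectBom, decodeWithBom and the other helpers) is hypothetical and not part 
of Phobos. It only covers the UTF-8, UTF-16, and UTF-32 BOMs, and it assumes 
that a file without a BOM is UTF-8.

```d
import std.exception : enforce;
import std.utf : toUTF8, validate;

/// Encodings this sketch distinguishes (hypothetical, not a Phobos type).
enum Detected { none, utf8, utf16le, utf16be, utf32le, utf32be }

/// True if `data` begins with the bytes in `prefix`.
bool hasPrefix(const(ubyte)[] data, const(ubyte)[] prefix)
{
    return data.length >= prefix.length && data[0 .. prefix.length] == prefix;
}

/// Reports which BOM, if any, `data` starts with.
Detected detectBom(const(ubyte)[] data)
{
    static immutable ubyte[] bom32le = [0xFF, 0xFE, 0x00, 0x00];
    static immutable ubyte[] bom32be = [0x00, 0x00, 0xFE, 0xFF];
    static immutable ubyte[] bom8    = [0xEF, 0xBB, 0xBF];
    static immutable ubyte[] bom16le = [0xFF, 0xFE];
    static immutable ubyte[] bom16be = [0xFE, 0xFF];

    // The UTF-32 LE BOM begins with the UTF-16 LE BOM, so test it first.
    if (hasPrefix(data, bom32le)) return Detected.utf32le;
    if (hasPrefix(data, bom32be)) return Detected.utf32be;
    if (hasPrefix(data, bom8))    return Detected.utf8;
    if (hasPrefix(data, bom16le)) return Detected.utf16le;
    if (hasPrefix(data, bom16be)) return Detected.utf16be;
    return Detected.none;
}

/// Assembles code units of type C from raw bytes in the given byte order.
C[] toUnits(C)(const(ubyte)[] bytes, bool bigEndian)
{
    enforce(bytes.length % C.sizeof == 0, "truncated code unit");
    auto result = new C[bytes.length / C.sizeof];
    foreach (i, ref unit; result)
    {
        uint value = 0;
        foreach (j; 0 .. C.sizeof)
        {
            immutable uint b = bytes[i * C.sizeof + j];
            immutable shift = cast(uint) (8 * (bigEndian ? C.sizeof - 1 - j : j));
            value |= b << shift;
        }
        unit = cast(C) value;
    }
    return result;
}

/// Copies raw bytes into a string and checks that they are valid UTF-8.
string checkedUtf8(const(ubyte)[] bytes)
{
    auto text = cast(string) bytes.idup;
    validate(text); // throws if the bytes are not valid UTF-8
    return text;
}

/// Strips the BOM and converts the remainder to UTF-8.
string decodeWithBom(const(ubyte)[] data)
{
    final switch (detectBom(data))
    {
        case Detected.none:    return checkedUtf8(data); // assumption: no BOM means UTF-8
        case Detected.utf8:    return checkedUtf8(data[3 .. $]);
        case Detected.utf16le: return toUTF8(toUnits!wchar(data[2 .. $], false));
        case Detected.utf16be: return toUTF8(toUnits!wchar(data[2 .. $], true));
        case Detected.utf32le: return toUTF8(toUnits!dchar(data[4 .. $], false));
        case Detected.utf32be: return toUTF8(toUnits!dchar(data[4 .. $], true));
    }
}

unittest
{
    immutable ubyte[] le = [0xFF, 0xFE, 0x68, 0x00, 0x69, 0x00];
    immutable ubyte[] be = [0xFE, 0xFF, 0x00, 0x68, 0x00, 0x69];
    assert(decodeWithBom(le) == "hi");
    assert(decodeWithBom(be) == "hi");
}
```

Feeding it the result of std.file.read (cast to ubyte[]) would give the 
behaviour being asked for, at least for the encodings that a BOM can 
identify. Note that a UTF-16 LE file whose first character after the BOM is 
U+0000 would be mistaken for UTF-32 LE by this ordering.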

At this point, dealing with anything which has an encoding other than UTF-8, 
UTF-16, or UTF-32 is problematic in D. std.encoding helps, but it's not 
necessarily all that good (Andrei considers it a failed experiment which 
either needs to be redesigned or removed). So, one of the things that still 
needs to be figured out for Phobos is how to better handle encodings other than 
UTF-8, UTF-16, and UTF-32. For the most part, other encodings are likely to be 
dealt with only when reading or writing I/O while UTF-8, UTF-16, and UTF-32 
are dealt with inside of D programs, but we still need to fix things so that we 
can readily deal with I/O that isn't UTF-8, UTF-16, or UTF-32.
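
As an example of what std.encoding can do today, here is a small sketch that 
converts Latin-1 bytes to a UTF-8 string with std.encoding.transcode. The 
helper latin1ToUtf8 is hypothetical, and the sketch assumes that the caller 
already knows the input is Latin-1, since nothing here detects that.

```d
import std.encoding : Latin1String, transcode;

/// Converts bytes known to be Latin-1 into a UTF-8 string.
/// Hypothetical helper, not part of Phobos.
string latin1ToUtf8(immutable(ubyte)[] raw)
{
    // Reinterpret the bytes as Latin-1 code units; no copy is made.
    auto latin1 = cast(Latin1String) raw;
    string utf8;
    transcode(latin1, utf8);
    return utf8;
}

unittest
{
    // "café" in Latin-1: 0xE9 is U+00E9.
    immutable ubyte[] raw = [0x63, 0x61, 0x66, 0xE9];
    assert(latin1ToUtf8(raw) == "caf\u00E9");
}
```

The raw bytes would typically come from std.file.read rather than readText, 
since readText would reject them as invalid UTF-8.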

- Jonathan M Davis

