D and Unicode(UTF16) strings

Stewart Gordon smjg_1998 at yahoo.com
Fri Jul 25 08:45:37 PDT 2008


"Lionello Lunesu" <lionello at lunesu.remove.com> wrote in message 
news:g6bga9$2dmh$1 at digitalmars.com...
<snip>
> Although, coming from C++, that might seem a good idea at first, note that 
> Windows doesn't quite know about UTF8. It can convert UTF8 to UNICODE and 
> back, but apart from the MultiByteToWideChar-like functions you cannot 
> pass UTF8 (ie. string, char[]) to any ANSI Windows API.

Check out std.windows.charset.
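
For instance, something along these lines should work. This is only a rough sketch: it assumes Phobos's toMBSz with its code page argument left at the default of 0 (the current ANSI code page), and it takes MessageBoxA from the core.sys.windows.windows bindings of a present-day D2 compiler. The message text and window title are made up for illustration.

```d
version (Windows)
{
    import std.windows.charset : toMBSz;
    import core.sys.windows.windows : MessageBoxA, MB_OK;

    void main()
    {
        // D string literals are UTF-8.
        string text = "Grüße aus D";

        // Convert from UTF-8 to the current ANSI code page (code page 0).
        // The result is zero-terminated, so it can go straight to an
        // ANSI ("A") API. Characters that the code page cannot represent
        // will not survive the conversion.
        const(char)* ansi = toMBSz(text);

        // Illustrative call only; any ANSI API taking an LPCSTR would do.
        MessageBoxA(null, ansi, "charset sketch", MB_OK);
    }
}
```

The same module also has fromMBSz for going the other way, i.e. turning what an ANSI API hands back into UTF-8.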

> The ANSI functions all use the current thread code page for conversion, 
> which cannot be set to UTF8. (God knows I've tried. If anybody managed to 
> do just this, pls let me know how.)
>
> I'd suggest to stick to wstring/Unicode. Most Unicode APIs are also 
> available on Win95 so there should be little reason to use the ANSI 
> functions for any Windows application.

I've never established which Unicode APIs are implemented on Win9x.  There 
ought to be documentation on this.  There's also a thing called Microsoft 
Layer for Unicode, but annoyingly, there seems to be no convenient way for 
apps to use it iff it's installed.

> Trying to use UTF8 on Windows means that you'll either have to constantly 
> convert the UTF8 strings to Unicode yourself, or use byte[] instead of 
> "string" to prevent any errors using Phobos/Tango APIs that assume 
> char[]/string contains UTF8.

Simply avoiding the Phobos/Tango string functions would achieve that, whether 
you store your strings as byte[], ubyte[] or char[].

> Anyway, that's what I've found out while messing with unicode/ansi stuff 
> on Windows. It might even be outdated at this point..

I guess it depends on which Windows versions you're targeting....

Stewart.

-- 
My e-mail address is valid but not my primary mailbox.  Please keep replies 
on the 'group where everybody may benefit. 



