unicode characters are not printed correctly on the windows command line?

Steven Schveighoffer schveiguy at gmail.com
Sun Dec 22 22:47:43 UTC 2019


On 12/22/19 5:04 PM, Adam D. Ruppe wrote:
> On Sunday, 22 December 2019 at 18:41:16 UTC, Steven Schveighoffer wrote:
>> Phobos doesn't call the wrong function, libc does. Phobos uses fwrite 
>> for output.
> 
> There is allegedly a way to set fwrite to do the translations on MSVCRT:
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=vs-2019 

Looks like you need to switch to "wprintf". I'm not sure, but I think we 
rely only on fwrite, for which there is no "w" equivalent.

> but trying it here it throws invalid parameter exception so idk.

Not surprised ;)
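
For reference, a minimal C sketch of the _setmode route from the linked 
page, assuming the Microsoft CRT. The helper names use_wide_stdout and 
print_wide_greeting are made up for illustration. As far as I can tell from 
the _setmode docs, once stdout is in _O_U16TEXT mode only the wide functions 
(wprintf, fputws, ...) are supported on it, and a narrow call such as printf 
or fwrite trips the CRT's parameter check, which may be the invalid 
parameter error mentioned above.

```c
#include <stdio.h>
#include <wchar.h>
#ifdef _WIN32
#include <fcntl.h>  /* _O_U16TEXT */
#include <io.h>     /* _setmode, _fileno */
#endif

/* Hypothetical helper: put stdout into UTF-16 text mode on the MS CRT.
   Returns 0 on success, -1 on failure. Does nothing on other platforms. */
int use_wide_stdout(void)
{
#ifdef _WIN32
    /* Flush first so bytes already buffered are not reinterpreted. */
    if (fflush(stdout) != 0)
        return -1;
    return (_setmode(_fileno(stdout), _O_U16TEXT) == -1) ? -1 : 0;
#else
    return 0;
#endif
}

/* Hypothetical helper: after use_wide_stdout(), only wide output calls
   like this one should touch stdout. Returns 0 on success, -1 on failure. */
int print_wide_greeting(void)
{
    return (wprintf(L"caf\u00e9 \u2713\n") < 0) ? -1 : 0;
}
```

Calling use_wide_stdout() once at startup and then print_wide_greeting() is 
the intended order. Note that this changes the mode of the whole stream, so 
anything else in the process that still writes to stdout with fwrite or 
printf is affected too.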

Here's a cool feature of Windows:

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fwide?view=vs-2019

Basically does nothing, all parameters ignored (and yes, we use this 
function in Phobos, assuming it does something).

But let me just say, the fact that there is some "mode" you have to set, 
like binary mode, that makes unicode work is unsettling. I hate libc 
streams...

> 
> Regardless, I'm pretty well of the opinion that fwrite is the wrong 
> thing to do anyway. fwrite writes bytes to a file, but we want to write 
> strings to the console. There's other functions that do that.

Preaching to the choir here. I wanted to rip out libc reliance a decade ago.

> There is the worry of mixing stuff from C and keeping the buffer 
> consistent, but it could always just flush() before doing its thing too. 
> Or maybe even merge the buffers, idk what the MS runtime supports for that.

This is the crux. Some people gotta have their printf. And if you mix 
two independently buffered streams on the same output, even 
single-threaded output comes out in the wrong order and looks like 
garbage. The only solution is to wrap FILE *. And I do mean only. I 
looked into trying to hook the buffers. There's no reliable way without 
knowing all the implementation details.

> or maybe i'm missing something and _setmode is a viable solution.

_setmode operates on a file descriptor. That already is a red flag to 
me, as there are no file descriptors in the OS. Windows uses handles. So 
this has some weird library "translation" happening underneath. Ugh.

> But whatever we do, passing the buck isn't solving anything. Windows has 
> supported Unicode console output since NT 4.0 in 1996.. just have to 
> call the right function, and whether it is Phobos calling it or druntime 
> or the CRT, someone just needs to do it!

Hey, you can always just call the function yourself! Just make an output 
stream that writes with the right function, and then you can use 
formattedWrite instead of writef.
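
A rough sketch of the "call the function yourself" idea, written in C so it 
lines up with the CRT functions discussed in this thread. It assumes the 
"right function" is WriteConsoleW from the Win32 API. console_write_utf8 is 
a hypothetical helper name, not something in Phobos or druntime. In D the 
same body could sit behind an output range's put() so that formattedWrite 
can feed it.

```c
#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <stdlib.h>
#include <windows.h>
#endif

/* Hypothetical helper: write a NUL-terminated UTF-8 string to stdout.
   On a Windows console it converts to UTF-16 and uses WriteConsoleW.
   Otherwise (redirected output, or not Windows) it writes the raw bytes.
   Returns 0 on success, -1 on failure. */
int console_write_utf8(const char *s)
{
    size_t len = strlen(s);
    if (len == 0)
        return 0;

    /* Flush libc's buffer so earlier printf/fwrite output appears first. */
    fflush(stdout);

#ifdef _WIN32
    HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode;
    /* GetConsoleMode succeeds only when the handle is a real console. */
    if (GetConsoleMode(h, &mode)) {
        int wlen = MultiByteToWideChar(CP_UTF8, 0, s, (int)len, NULL, 0);
        if (wlen <= 0)
            return -1;
        wchar_t *w = malloc((size_t)wlen * sizeof *w);
        if (w == NULL)
            return -1;
        MultiByteToWideChar(CP_UTF8, 0, s, (int)len, w, wlen);

        /* No chunking here: this sketch is meant for short strings. */
        DWORD written = 0;
        BOOL ok = WriteConsoleW(h, w, (DWORD)wlen, &written, NULL);
        free(w);
        return ok ? 0 : -1;
    }
#endif

    if (fwrite(s, 1, len, stdout) != len)
        return -1;
    return (fflush(stdout) == 0) ? 0 : -1;
}
```

The fflush before the console call is the "just flush() before doing its 
thing" approach from earlier in the thread. It keeps ordering sane as long 
as everything else in the process goes through stdout, but it does nothing 
for code that writes to the console handle directly.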

To fix Phobos, we just(!) need to remove libc as the underlying stream 
implementation.

At one point I had agreement from Walter to make a 
"backwards-compatible-ish" mechanism for file/streams. But it wasn't 
pretty, and it was convoluted. At the time, I was struggling to get 
what would become iopipe to be usable on its own, and I eventually quit 
worrying about that aspect of it.

We have the basic building blocks with https://github.com/MartinNowak/io 
and https://github.com/schveiguy/iopipe. It would be cool to get this 
into Phobos, but it's a lot of work.

I bet Rust just skips libc altogether.

-Steve

