extended characterset output
anonymous
anon at ymous.org
Fri Apr 8 09:51:23 UTC 2022
On Friday, 8 April 2022 at 08:36:33 UTC, Ali Çehreli wrote:
> On 4/7/22 23:13, anonymous wrote:
> > What's the proper way to output all characters in the extended
> > character set?
>
> It is not easy to answer because there are a number of concepts
> here that may make it trivial or complicated.
>
> The configuration of the output device matters. Is it set to
> Windows-1252 or are you using Unicode strings in D?
I'm running Ubuntu and my default language is en_US.UTF-8.
> >
> > ```d
> > void main()
> > {
> > foreach(char c; 0 .. 256)
>
> 'char' is wrong there because 'char' has a very special meaning
> in D: A UTF-8 code unit. Not a full Unicode character in many
> cases, especially in the "extended" set.
>
> I think your problem will be solved simply by replacing 'char'
> with 'dchar' there:
>
> foreach (dchar c; ...
I tried that. It didn't work.
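
For reference, the difference between a 'char' (one UTF-8 code unit) and a 'dchar' (one full code point) can be seen with a small sketch like the one below. The sample string and variable names are only illustrative and are not from the code in this thread.

```d
import std.stdio;

void main()
{
    // U+00E9 (e with acute accent) is one code point
    // but takes two UTF-8 code units.
    string s = "\u00E9";

    // .length counts char elements (code units), not characters.
    writeln(s.length); // prints 2

    // Asking for dchar makes foreach decode the UTF-8 sequence.
    foreach (dchar c; s)
        writefln("U+%04X", cast(uint) c); // prints U+00E9

    // Asking for char walks the raw code units instead.
    foreach (char c; s)
        writefln("0x%02X", cast(ubyte) c); // prints 0xC3, then 0xA9
}
```

So a single 'char' holding a value above 127 is only a fragment of a UTF-8 sequence, which a UTF-8 terminal cannot display as a character on its own.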
> However, isControl() below won't work because isControl() only
> knows about the ASCII table. It would miss the unprintable
> characters above 127.
>
> > {
> > write(isControl(c) ? '.' : c);
> > }
> > }
> > ```
Oh okay, that may have been the reason.
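
A minimal sketch of that point, assuming the isControl() in question is the one from std.ascii (the imports were trimmed from the quoted code, so that is a guess). Phobos also has a std.uni.isControl, which covers the Unicode Cc category and so includes the C1 range U+0080 to U+009F:

```d
import std.stdio;
static import std.ascii;
static import std.uni;

void main()
{
    // U+0085 (NEL) is a control character above 127.
    dchar c = '\u0085';

    // The ASCII version only treats 0..31 and 127 as control characters.
    writeln(std.ascii.isControl(c)); // prints false

    // The Unicode version checks the general category Cc.
    writeln(std.uni.isControl(c));   // prints true
}
```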
> This works:
>
> import std.stdio;
>
> bool isPrintableLatin1(dchar value) {
> if (value < 32) {
> return false;
> }
>
> if (value > 126 && value < 161) {
> return false;
> }
>
> return true;
> }
>
> void main() {
> foreach (dchar c; 0 .. 256) {
> write(isPrintableLatin1(c) ? c : '.');
> }
> }
Nope... running this code, I get a bunch of digits as the output.
The dots don't even show up. Maybe I'm drunk or lacking sleep.
Weird, I had a strange feeling that this problem stemmed from
the compiler I'm using (GDC), so I installed DMD. Would you
believe everything worked fine afterwards? That includes the
original version where I used isControl and 'dchar' instead of
'char'. I wonder why that is?
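
The cause of the GDC behaviour is not established in this thread. As a hedged sketch only, one way to avoid depending on how a compiler converts the integer loop counter or picks the type of the ternary is to make every value an explicit 'dchar'. The names 'dot' and 'shown' are just illustrative, and whether this changes the output on that GDC version is an assumption that has not been tested here.

```d
import std.stdio;

bool isPrintableLatin1(dchar value)
{
    if (value < 32)
        return false;
    if (value > 126 && value < 161)
        return false;
    return true;
}

void main()
{
    // Placeholder for unprintable values, typed as dchar on purpose.
    immutable dchar dot = '.';

    foreach (uint i; 0 .. 256)
    {
        // Convert the counter to a character type explicitly.
        immutable dchar c = cast(dchar) i;

        // Both branches are dchar, so the result is a dchar as well.
        immutable dchar shown = isPrintableLatin1(c) ? c : dot;
        write(shown);
    }
    writeln();
}
```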
Thanks Ali.