extended characterset output

Ali Çehreli acehreli at yahoo.com
Fri Apr 8 08:36:33 UTC 2022


On 4/7/22 23:13, anonymous wrote:
 > What's the proper way to output all characters in the extended character
 > set?

It is not easy to answer because there are a number of concepts here 
that may make it trivial or complicated.

The configuration of the output device matters. Is it set to 
Windows-1252 or are you using Unicode strings in Python?

 >
 > ```d
 > void main()
 > {
 >      foreach(char c; 0 .. 256)

'char' is wrong there because 'char' has a very specific meaning in D: a 
UTF-8 code unit. A single code unit is not a full Unicode character in 
many cases: every character above 127, which includes the whole 
"extended" set, takes more than one UTF-8 code unit.

I think your problem will be solved simply by replacing 'char' with 
'dchar' there:

   foreach (dchar c; ...

However, isControl() below won't work if it is std.ascii.isControl, 
because that function only knows about the ASCII table. It would miss 
the unprintable characters above 127.

 >      {
 >         write(isControl(c) ? '.' : c);
 >      }
 > }
 > ```
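
To illustrate the point about 'char' versus 'dchar', here is a minimal 
sketch. The helper names codeUnitCount and codePointCount are made up 
for this example; they are not library functions. It assumes nothing 
beyond the fact that D strings are UTF-8:

```d
import std.stdio;

// Hypothetical helper: number of UTF-8 code units (i.e. 'char's).
size_t codeUnitCount(string s) {
   return s.length;
}

// Hypothetical helper: number of code points. Iterating a string with
// a 'dchar' loop variable makes foreach decode the UTF-8 on the fly.
size_t codePointCount(string s) {
   size_t n = 0;
   foreach (dchar c; s) {
     ++n;
   }
   return n;
}

void main() {
   // U+00E9 (e with acute accent) is one character in the "extended"
   // range, but it is encoded as two UTF-8 code units: 0xC3 0xA9.
   string s = "\u00E9";

   writeln(codeUnitCount(s));   // prints 2
   writeln(codePointCount(s));  // prints 1

   // Looking at the individual 'char's shows the raw code units.
   foreach (char c; s) {
     writef("%02x ", cast(ubyte)c);
   }
   writeln();                   // the line reads: c3 a9
}
```

So a single 'char' with a value above 127 is only a fragment of a 
character, which is why writing such values one by one does not produce 
valid output on a UTF-8 terminal.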

This works:

import std.stdio;

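// Returns false for the control characters and the no-break space in
// the 0-255 range, true for everything else.
bool isPrintableLatin1(dchar value) {
   // 0-31 are the C0 control characters.
   if (value < 32) {
     return false;
   }

   // 127 is DEL, 128-159 are the C1 control characters, and 160 is the
   // no-break space.
   if (value > 126 && value < 161) {
     return false;
   }

   return true;
}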

void main() {
   foreach (dchar c; 0 .. 256) {
     write(isPrintableLatin1(c) ? c : '.');
   }

   writeln();

   // import std.encoding;

   // foreach(ubyte c; 0 .. 256) {
   //   if (isPrintableLatin1(c)) {
   //     Latin1Char[1] from = [ cast(Latin1Char)c ];
   //     string to;
   //     transcode(from, to);
   //     write(to);

   //   } else {
   //     write('.');
   //   }
   // }

   // writeln();
}

I left in some commented-out code that I experimented with; it 
transcodes each Latin-1 byte to a UTF-8 string with std.encoding. (That 
works as well.)
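
As a possible alternative to the hand-written isPrintableLatin1 above, 
std.uni also has an isControl, and that one knows about Unicode control 
characters (general category Cc), including the ones in the 128-159 
range. The following is only a sketch; printableOrDot is a name I made 
up for it:

```d
import std.stdio;
import std.uni : isControl;   // the Unicode-aware one, not std.ascii's

// Hypothetical helper: replaces control characters with a dot.
dchar printableOrDot(dchar c) {
   return isControl(c) ? '.' : c;
}

void main() {
   foreach (dchar c; 0 .. 256) {
     write(printableOrDot(c));
   }

   writeln();
}
```

One difference from isPrintableLatin1: the no-break space (160) is not a 
control character, so this version writes it as is instead of replacing 
it with a dot.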

Ali


