coreutils with D trials, wc, binary vs well formed utf

btiffin btiffin at myopera.com
Mon May 24 16:58:33 UTC 2021


Hello,

New here. A little background.  Old guy, program for both work 
and recreation, GNU maintainer for the GnuCOBOL package; written 
in C, compiles COBOL via C intermediates.  Fell into the role of 
maintainer mainly due to being a documentation writer and 
early-on cheerleader.  Experienced in quite a few programming 
languages, with a "10,000ish hours in" definition of expertise 
in C, Forth and COBOL.

Assuming that, with gdc in GCC mainline now, D usage will 
continue to grow.  Also of the opinion that slow, long tail 
growth is the best kind of growth.  Not hype, not marketing, but 
adoption due to worthiness and merit.  That is the current 
headspace.  Want D to succeed, can't point to a specific why, 
just feel deep down that it should succeed and have an open ended 
relevant life span.  Thanks, Walter, Andrei, Iain, Ari, et al...

Just bumped into 
https://dlang.org/blog/2020/01/28/wc-in-d-712-characters-without-a-single-branch/

Way cool.  Then bumped into this:

prompt$ ./wc *
std.utf.UTFException@/usr/lib/gcc/i686-linux-gnu/11/include/d/std/utf.d(1380): Invalid UTF-8 sequence (at index 1)

That was from an a.out file in the directory.  Early days, very 
limited D, so answers of "just set ..." will fly overhead; 
actual gdc command lines and noob jargon will sink in faster at 
this point.  Is there a(n easy-ish) way to fix up that wc.d 
source in the blog to fall back to byte stream mode when UTF-8 
decoding fails on a line?
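
For reference, a rough sketch of the kind of fallback being asked 
about, not a definitive fix.  It assumes the exception comes from 
decoding inside toLine, checks each line with std.utf.validate, and 
on UTFException counts raw bytes instead.  The helper toLineBytes is 
a made-up name, not part of the blog code, and the try/catch gives 
up the blog's "no branches" property.

```d
import std.algorithm : fold, map, splitter;
import std.range : walkLength;
import std.stdio : File, writefln;
import std.typecons : Yes;
import std.utf : UTFException, validate;

struct Line { size_t chars; size_t words; }
struct Output { size_t lines; size_t words; size_t chars; }

Output combine(Output a, Line b) pure nothrow
{
    return Output(a.lines + 1, a.words + b.words, a.chars + b.chars);
}

// Hypothetical byte-mode counter (not in the blog): every byte counts
// as one "char", words are runs of bytes that are not ASCII whitespace.
Line toLineBytes(const(char)[] l) pure nothrow @nogc @safe
{
    size_t words = 0;
    bool inWord = false;
    foreach (char c; l) // iterating as char visits code units, no decoding
    {
        immutable ws = c == ' ' || (c >= '\t' && c <= '\r');
        if (!ws && !inWord)
            ++words;
        inWord = !ws;
    }
    return Line(l.length, words);
}

// UTF-8 mode first; fall back to byte mode if the line is malformed.
Line toLine(char[] l)
{
    try
    {
        validate(l); // throws UTFException on an invalid sequence
        return Line(l.walkLength, l.splitter.walkLength);
    }
    catch (UTFException e)
    {
        return toLineBytes(l);
    }
}

void main(string[] args)
{
    // Like the blog version, this only looks at the first file name.
    auto f = File(args[1]);
    Output o = f
        .byLine(Yes.keepTerminator)
        .map!(l => toLine(l))
        .fold!(combine)(Output(0, 0, 0));

    writefln!"%u %u %u %s"(o.lines, o.words, o.chars, args[1]);
}
```

Something like "gdc -O2 -o wc wc.d" followed by "./wc a.out" should 
build and run it.  The fallback here is per line, so a file that is 
mostly text with one bad line still gets code point counts for the 
good lines; falling back for the whole file would be a different 
design choice.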

Have good, make well,
Brian

