Reading text (I mean "real" text...)
Denis
noreply at noserver.lan
Sat Jun 20 07:32:51 UTC 2020
Digging into this a bit further --
POSIX defines a "print" class, which I believe is an exact fit.
The Unicode spec doesn't define this class, which I presume is
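why D's std.uni library also omits it. But there is an isprint()
function in libc, along with its wide-character counterpart
iswprint(), which I should be able to use (POSIX here). Both
consult the current locale, so they aren't limited to ASCII
characters (unlike std.ascii:isPrintable). Since isprint() only
accepts single-byte values, iswprint() is the one that applies to
decoded codepoints.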
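
A minimal C sketch of the libc check, for illustration only. The helper names (byte_is_printable, codepoint_is_printable, use_system_locale) are hypothetical, not from any library. It assumes wchar_t holds Unicode codepoints, which is true for glibc but not guaranteed by the C standard. Note that isprint() takes a single unsigned char value (or EOF), so a decoded multibyte character has to go through iswprint() instead.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <wctype.h>

/* Adopt the locale named by the environment (LANG, LC_ALL, ...).
   Until this is called, the program runs in the "C" locale.
   Returns 1 on success, 0 if the locale could not be set. */
int use_system_locale(void)
{
    return setlocale(LC_CTYPE, "") != NULL;
}

/* Byte-level check: only meaningful for single-byte characters. */
int byte_is_printable(unsigned char b)
{
    return isprint(b) != 0;
}

/* Codepoint-level check in the POSIX "print" class.
   Assumption: wchar_t stores Unicode codepoints (as in glibc). */
int codepoint_is_printable(unsigned long cp)
{
    assert(cp <= 0x10FFFF);   /* precondition: within the Unicode range */
    return iswprint((wint_t)cp) != 0;
}
```

Calling use_system_locale() once at startup is what would make the result follow the user's locale; what the "C" locale reports for non-ASCII codepoints is left to the implementation.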
So that's one down, two to go:
Loop until newline or EOF
    (1) Read bytes or character             } Possibly
    (2) Decode UTF-8, exception if invalid  } together
    (3) Call isprint(), exception if invalid
Return line
(This simplified outline obviously doesn't show how to deal with
the complications arising from using buffers, handling codepoints
that straddle the end of the buffer, etc.)
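
The outline above could be sketched in C roughly as follows. This is one possible shape under stated assumptions, not a definitive implementation: the names read_text_line and decode_utf8 and the status codes are hypothetical, return codes stand in for exceptions, and iswprint() is used on the assumption that wchar_t holds Unicode codepoints (as in glibc). Reading through fgetc() lets stdio own the buffer, so a codepoint straddling a buffer boundary needs no special handling here.

```c
#include <assert.h>
#include <stdio.h>
#include <wctype.h>

enum { LINE_OK = 0, LINE_EOF = 1, ERR_UTF8 = -1, ERR_NONPRINT = -2, ERR_TOOLONG = -3 };

/* Step (2): decode one UTF-8 sequence whose lead byte b0 was already read.
   Continuation bytes come from fp; every byte is recorded in seq[0..*n-1].
   Returns the codepoint, or -1 for malformed, overlong or truncated input. */
static long decode_utf8(FILE *fp, int b0, unsigned char seq[4], size_t *n)
{
    long cp, min;
    int extra;

    seq[0] = (unsigned char)b0;
    *n = 1;
    if (b0 < 0x80) return b0;                     /* plain ASCII */
    if ((b0 & 0xE0) == 0xC0)      { cp = b0 & 0x1F; extra = 1; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; min = 0x10000; }
    else return -1;               /* stray continuation or invalid lead byte */

    while (extra-- > 0) {
        int b = fgetc(fp);
        if (b == EOF || (b & 0xC0) != 0x80) return -1;
        seq[(*n)++] = (unsigned char)b;
        cp = (cp << 6) | (b & 0x3F);
    }
    /* Reject overlong forms, out-of-range values and surrogates. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return -1;
    return cp;
}

/* Read one line into out (NUL-terminated, newline dropped).
   Returns LINE_OK, LINE_EOF (nothing left to read) or an ERR_* code. */
int read_text_line(FILE *fp, char *out, size_t cap)
{
    size_t len = 0;

    assert(fp != NULL && out != NULL && cap > 0);
    for (;;) {
        unsigned char seq[4];
        size_t n, i;
        long cp;
        int b0 = fgetc(fp);                               /* step (1) */

        if (b0 == EOF || b0 == '\n') {
            out[len] = '\0';
            return (b0 == EOF && len == 0) ? LINE_EOF : LINE_OK;
        }
        cp = decode_utf8(fp, b0, seq, &n);                /* step (2) */
        if (cp < 0) return ERR_UTF8;
        if (!iswprint((wint_t)cp)) return ERR_NONPRINT;   /* step (3) */
        if (len + n >= cap) return ERR_TOOLONG;
        for (i = 0; i < n; i++) out[len++] = (char)seq[i];
    }
}
```

As written, a tab or a carriage return counts as non-printable and is rejected, and after an error the rest of the offending line is left unread. Both are choices a real reader would probably want to revisit.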
Where I'm still stuck is the read or read-and-auto-decode: this
is where the waters get really muddy for me. Three different
techniques for reading characters are suggested in this thread
(iopipe, ranges, rawRead):
https://forum.dlang.org/thread/cgteipqqfxejngtpgbbt@forum.dlang.org
I'd like to stick with standard D or C libraries initially, so
that rules out iopipe for now. What would really help are some
details about what each read technique does particularly well
compared with the others. And is there a technique that seems
better suited to this use case than the rest?
Thanks again