Checking whether a string contains only ASCII

Ali Çehreli via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Wed Feb 22 12:07:34 PST 2017


On 02/22/2017 12:02 PM, ag0aep6g wrote:
 > On Wednesday, 22 February 2017 at 19:26:15 UTC, berni wrote:
 >> In my program, I read a PostScript file. Normal PostScript files
 >> should consist only of ASCII characters, but one never knows what
 >> users give us. Therefore I'd like to make sure that the string the
 >> program reads is made up only of ASCII characters. This simplifies the
 >> code thereafter, because I can then assume that code unit == code point.
 >> Is there a simple way to do so?
 >>
 >> Here is a sketch of my function:
 >>
 >>> void foo(string postscript)
 >>> {
 >>>    // throw an Exception if postscript is not all ASCII
 >>>    // other stuff, assuming code unit == code point
 >>> }
 >
 > Making full use of the standard library:
 >
 > ----
 > import std.algorithm: all;
 > import std.ascii: isASCII;
 > import std.exception: enforce;
 >
 > enforce(postscript.all!isASCII);
 > ----
 >
 > That checks on the code point level (because strings are ranges of
 > dchars). If you want to be clever, you can avoid decoding and check on
 > the code unit level:
 >
 > ----
 > /* other imports as above */
 > import std.utf: byCodeUnit;
 >
 > enforce(postscript.byCodeUnit.all!isASCII);
 > ----
 >
 > Or you can do it manually, avoiding all those imports:
 >
 > ----
 > foreach (char c; postscript)
 >     if (c > 0x7F) throw new Exception("not ASCII");
 > ----

One more, which looks at the raw bytes via representation, so no decoding
is involved:

bool isAscii(string s) {
     import std.string : representation;
     import std.algorithm : canFind;
     return !s.representation.canFind!(c => c >= 0x80);
}

unittest {
     assert(isAscii("hello world"));
     assert(!isAscii("hellö wörld"));
}
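
For completeness, here is a minimal sketch of how such a check might be
wired into the foo from the original question. The exception message and
the sample inputs are made up for illustration; enforce throws a plain
Exception when its condition is false.

```d
import std.algorithm : canFind;
import std.exception : assertThrown, enforce;
import std.string : representation;

// Same check as above: true if every code unit is below 0x80.
bool isAscii(string s) {
    return !s.representation.canFind!(c => c >= 0x80);
}

// Hypothetical body for foo; the message text is an assumption.
void foo(string postscript) {
    enforce(isAscii(postscript), "postscript input is not pure ASCII");
    // From here on, every code unit is also a code point.
}

unittest {
    foo("%!PS-Adobe-3.0");             // pure ASCII, does not throw
    assertThrown(foo("hellö wörld"));  // non-ASCII input is rejected
}
```

Keeping the bool-returning helper separate from the throwing call means the
same check can be reused where an exception is not wanted.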

Ali


