removing ansi control escape characters from a string

Timothee Cour thelastmammoth at gmail.com
Fri May 31 18:08:33 PDT 2013


Not sure this is the right forum but here we go:

When I use the Unix command 'script' to log terminal input and output,
the log includes ANSI control escape characters.
I'd like to strip those character sequences from the generated script file,
so that the content (including newlines) is preserved but
escape codes for coloring, etc. are removed.
That way I can apply tools like grep as if those characters were absent.

Here's an example, from a short script session where I typed 1234 BS BS BS
BS 6789 (i.e. 4 backspaces), saved into log.txt.
Typing 'cat log.txt|grep 1234' prints a line that displays as '6789'. 1234
doesn't appear once we cat the file in a terminal, because cat-ing in a
terminal replays the backspace sequence, but 1234 really is in the file, so
grep matches it.
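
To make that replay concrete, here is a minimal D sketch of what a terminal does with backspaces: BS only moves the cursor one column left, and whatever is printed next overwrites what was there. The function name replayBackspaces is made up for illustration, and the sketch assumes backspace and newline are the only control characters that matter; carriage returns and escape sequences are treated as ordinary characters.

```d
import std.conv : to;

/// Hypothetical helper (not an existing library function): replays
/// backspaces by moving a cursor left and letting later characters
/// overwrite what was already on the line.
string replayBackspaces(string s)
{
    dchar[] done;   // completed lines, including their '\n'
    dchar[] line;   // the line currently being drawn
    size_t cursor;  // column where the next character lands

    foreach (dchar c; s)
    {
        if (c == '\b')
        {
            if (cursor > 0)
                --cursor;           // move left, erase nothing
        }
        else if (c == '\n')
        {
            done ~= line;           // finish the current line
            done ~= '\n';
            line.length = 0;
            cursor = 0;
        }
        else
        {
            if (cursor < line.length)
                line[cursor] = c;   // overwrite an existing column
            else
                line ~= c;          // append at the end of the line
            ++cursor;
        }
    }
    done ~= line;
    return done.to!string;
}
```

With this, replayBackspaces("1234\b\b\b\b6789") gives "6789", which is what cat shows, while the input string still contains "1234", which is what grep sees.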

So I'd like help on writing a D utility function that will convert a string
s1 into a string s2 such that:
* writeln(s1) prints the same as writeln(s2) (modulo removing colors from
escape sequences)
* s2 doesn't contain any escape sequence or control character (as given by
std.uni.isControl), apart from newlines

Note, I'm NOT just looking to filter out the 'isControl' characters
(that's easy), because that leaves all the '[33m' garbage in the
string; also, the behavior of backspace needs to be emulated.

Here's a first stab at the problem, but it's incomplete (i.e. it doesn't deal
with backspaces, which should erase a char from the string, etc.):
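string remove_terminal_escape_codes(string a){
    import std.regex;
    // Build the pattern: a character class holding every control
    // character, followed by '[' and a few known code shapes.
    string pattern=(){
        import std.uni;
        import std.conv;
        string pattern="[";
        foreach(char ci; 0..255){
            if(isControl(ci)){
                pattern~=ci.to!dchar;
            }
        }
        pattern~="]";
        // \d+[A-Z] also covers 3-digit cursor moves such as [199C and [222D.
        return pattern~`\[(\d{2}m|\dm|\d\d;\d\dm|J|K|\d+[A-Z])`;
    }();
    // Remove every match; backspaces and lone control characters remain.
    return replace(a,regex(pattern,"g"),"");
}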
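
A slightly more general pattern for the escape codes, as a sketch only: most of the sequences in the log below have the shape ESC, '[', optional digits and semicolons, then a single letter, so one regex can cover them instead of listing each case. The name stripCsi is hypothetical, and this assumes only that common shape; other escape sequences and the backspaces are left as they are.

```d
import std.regex : regex, replaceAll;

/// Hypothetical helper (the name is an assumption): removes sequences of
/// the form ESC '[' parameters letter, e.g. ESC[33m, ESC[0m, ESC[199C.
string stripCsi(string s)
{
    // "\x1b" is the ESC character; the rest is a raw regex string:
    // a literal '[', any run of digits, ';' or '?', then one letter.
    auto csi = regex("\x1b" ~ `\[[0-9;?]*[A-Za-z]`);
    return replaceAll(s, csi, "");
}
```

For example, stripCsi("\x1b[32mprompt:M \x1b[33m~/shortcuts\x1b[0m") returns "prompt:M ~/shortcuts".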





More details:

cat -v log.txt:
-------------------------
Script started on Fri May 31 17:32:55 2013
^[[1m^[[7m%^[[27m^[[1m^[[0m


                                           ^M
^M^M^[[0m^[[27m^[[24m^[[J^[[32mprompt:M ^[[33m~/shortcuts
^[[00m%^[[K^[[199C^[[33mprompt_end:^[[1m17:32^[[0m
^[[1m#27378^[[0m^[[222D1^H1234^H ^H^H ^H^H^H1 ^H^H ^H6^H6789^M^M
^[[1m^[[7m%^[[27m^[[1m^[[0m


                                           ^M
^M^M^[[0m^[[27m^[[24m^[[J^[[32mprompt:M ^[[33m~/shortcuts
^[[00m%^[[K^[[199C^[[34mprompt_end:Err 1^[[37m ^[[1m#27378^[[0m^[[222D^M^M

Script done on Fri May 31 17:33:08 2013
-------------------------

cat log.txt:
-------------------------
Script started on Fri May 31 17:32:55 2013
prompt:M ~/shortcuts %6789


prompt_end:17:32 #27378
prompt:M ~/shortcuts %


prompt_end:Err 1 #27378

Script done on Fri May 31 17:33:08 2013
-------------------------

I can attach the file as .txt if forum allows.