My simple implementation of PHP strip_tags()

aberba via Digitalmars-d digitalmars-d at puremagic.com
Wed Jun 28 11:08:12 PDT 2017


I wanted strip_tags() for sanitization in vibe.d, so I went looking
for algorithms and came across this JavaScript library:

https://github.com/ericnorris/striptags/blob/master/src/striptags.js

It is quite popular, judging by the number of stars and forks. As I
looked through it, I didn't like the cumbersome approach it used, so
I tried to implement it my own way. This is what I lazily did. It
turned out to be so simple that I thought I could use some opinions.
Note that I didn't add the `tag_replacement` parameter, but that
would take about one line of code.

string stripTags(string input, in string[] allowedTags = [])
{
	import std.regex: Captures, replaceAll, ctRegex;

	// Matches simple opening and closing tags without attributes, e.g. <b> or </b>.
	auto regex = ctRegex!(`</?(\w*)>`);

	// Called for every match: keeps allowed tags, removes all others.
	string regexHandler(Captures!(string) match)
	{
		// Turns an opening tag such as "<b>" into its closing form "</b>".
		string insertSlash(in string tag)
		in
		{
			assert(tag.length, "Argument must contain one or more characters");
		}
		body
		{
			return tag[0..1] ~ "/" ~ tag[1..$];
		}

		bool allowed = false;
		foreach (tag; allowedTags)
		{
			if (tag == match.hit || insertSlash(tag) == match.hit)
			{
				allowed = true;
				break;
			}
		}
		return allowed ? match.hit : "";
	}

	return input.replaceAll!(regexHandler)(regex);
}

unittest
{
	assert(stripTags("<html><b>bold</b></html>") == "bold");
	assert(stripTags("<html><b>bold</b></html>", ["<html>"])
		== "<html>bold</html>");
}
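
For the `tag_replacement` parameter mentioned above, here is a minimal sketch of how it could look. The names `stripTagsWith` and `tagReplacement` are made up for illustration and are not part of the code above; the only real change is that a removed tag is replaced by the given string instead of "".

```d
import std.regex : Captures, ctRegex, replaceAll;

// Hypothetical variant of stripTags: removed tags are replaced by
// tagReplacement (assumed parameter name) instead of the empty string.
string stripTagsWith(string input, in string[] allowedTags = [],
	string tagReplacement = "")
{
	auto tagPattern = ctRegex!(`</?(\w*)>`);

	string handler(Captures!(string) match)
	{
		foreach (tag; allowedTags)
		{
			// Skip empty entries so the slices below stay in bounds.
			if (!tag.length)
				continue;
			// Accept the opening form "<b>" or the closing form "</b>".
			if (tag == match.hit || tag[0 .. 1] ~ "/" ~ tag[1 .. $] == match.hit)
				return match.hit;
		}
		return tagReplacement;
	}

	return input.replaceAll!(handler)(tagPattern);
}

unittest
{
	// Each removed tag becomes a single space.
	assert(stripTagsWith("<p>one</p><p>two</p>", [], " ") == " one  two ");
}
```

Replacing tags with a space can be handy when block-level tags are the only thing separating words in the input.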



I'm not sure the tag-matching regex I used is the best one, though.
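
One concrete limitation of `</?(\w*)>` is that it does not match tags carrying attributes (`<a href="x">`) or self-closing tags (`<br/>`), so those pass through untouched. Below is a rough sketch of a broader pattern. It assumes the allowed list holds bare lowercase names like "b" rather than "<b>", and the name `stripTagsByName` is hypothetical. It is still not an HTML parser: a `>` inside an attribute value, comments, and the attributes of allowed tags are not handled.

```d
import std.algorithm.searching : canFind;
import std.regex : Captures, ctRegex, replaceAll;
import std.uni : toLower;

// Hypothetical variant: allowedNames holds bare lowercase names such as "b".
string stripTagsByName(string input, in string[] allowedNames = [])
{
	// Optional slash, a tag name, then anything up to the closing '>'.
	auto tagPattern = ctRegex!(`</?(\w+)[^>]*>`);

	string handler(Captures!(string) match)
	{
		// match[1] is the captured tag name; match.hit is the whole tag.
		return allowedNames.canFind(match[1].toLower) ? match.hit : "";
	}

	return input.replaceAll!(handler)(tagPattern);
}

unittest
{
	assert(stripTagsByName(`<a href="x">link</a><br/>`) == "link");
	assert(stripTagsByName(`<p class="c"><b>bold</b></p>`, ["b"]) == "<b>bold</b>");
}
```

Comparing the captured name instead of the whole tag also removes the need for the insertSlash helper, since opening and closing tags share the same name.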

