My simple implementation of PHP strip_tags()
aberba via Digitalmars-d
digitalmars-d at puremagic.com
Wed Jun 28 11:08:12 PDT 2017
I wanted strip_tags() for sanitization in vibe.d, so I went looking
for algorithms and came across this JavaScript library:
https://github.com/ericnorris/striptags/blob/master/src/striptags.js
It is quite popular, judging by the number of likes and forks. As I
looked through it, I didn't like the cumbersome approach it used, so
I tried to implement it my own way. This is what I lazily did. It
turned out to be so simple that I thought I could use some opinions.
Notice I didn't add the `tag_replacement` param, but that's just
about one line of code.
string stripTags(string input, in string[] allowedTags = [])
{
    import std.regex: Captures, replaceAll, ctRegex;

    // Matches opening and closing tags without attributes, e.g. <b> or </b>.
    auto regex = ctRegex!(`</?(\w*)>`);

    // Called for every match: keeps allowed tags, drops everything else.
    string regexHandler(Captures!(string) match)
    {
        // Turns an opening tag such as "<b>" into its closing form "</b>".
        string insertSlash(in string tag)
        in
        {
            assert(tag.length, "Argument must contain one or more characters");
        }
        body
        {
            return tag[0..1] ~ "/" ~ tag[1..$];
        }

        bool allowed = false;
        foreach (tag; allowedTags)
        {
            if (tag == match.hit || insertSlash(tag) == match.hit)
            {
                allowed = true;
                break;
            }
        }
        return allowed ? match.hit : "";
    }

    return input.replaceAll!(regexHandler)(regex);
}
unittest
{
    assert(stripTags("<html><b>bold</b></html>") == "bold");
    assert(stripTags("<html><b>bold</b></html>", ["<html>"]) ==
        "<html>bold</html>");
}
I'm not sure the tag-matching regex I used is the best, though: it
only matches tags without attributes, so something like
<p class="x"> passes through untouched.
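
One possible direction, as a rough sketch rather than a finished sanitizer: let the regex skip over attributes and compare only the captured tag name. The function name `stripTagsByName` and the convention of passing bare names ("b" instead of "<b>") are assumptions made for this illustration and are not part of the code above. Allowed tags are kept together with whatever attributes they carry, and a ">" inside a quoted attribute value would still confuse the pattern.

```d
import std.algorithm.searching : canFind;
import std.regex : Captures, ctRegex, replaceAll;

/// Hypothetical variant: allowedNames holds bare tag names such as "b",
/// not full tags such as "<b>".
string stripTagsByName(string input, in string[] allowedNames = [])
{
    // Capture group 1 is the tag name; `[^>]*` skips attributes
    // and a trailing slash up to the closing bracket.
    auto tagPattern = ctRegex!(`</?(\w+)[^>]*>`);

    // Keep the whole tag if its name is allowed, otherwise drop it.
    string keepOrDrop(Captures!(string) match)
    {
        return allowedNames.canFind(match[1]) ? match.hit : "";
    }

    return input.replaceAll!(keepOrDrop)(tagPattern);
}

unittest
{
    assert(stripTagsByName(`<p class="x"><b>bold</b></p>`) == "bold");
    assert(stripTagsByName(`<p class="x"><b>bold</b></p>`, ["b"]) ==
        "<b>bold</b>");
}
```

Comparing names instead of whole tags also removes the need for insertSlash, since opening and closing tags share the same captured name.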