Html escaping for security: howto in D?

aberba karabutaworld at gmail.com
Tue Jul 7 20:10:14 UTC 2020


On Tuesday, 7 July 2020 at 17:55:44 UTC, Fitz wrote:
> On Monday, 6 July 2020 at 14:57:22 UTC, aberba wrote:
>> utilities...a very long long time ago...two yrs 😜. See 
>> https://code.dlang.org/packages/sanival for stripTags().
>> It's a very limited implementation and uses std.regex, which 
>> many people here who are critical about performance will speak 
>> against. I'm yet to see an alternative. So you could use that 
>> if you don't find a better alternative.
>>
>
> Can't see stripTags in https://code.dlang.org/packages/sanival?

string stripTags(string input, in string[] allowedTags = [])
{
	import std.regex : Captures, replaceAll, ctRegex;

	// Matches bare opening and closing tags such as <b> or </b>;
	// tags that carry attributes are not matched.
	auto regex = ctRegex!(`</?(\w*)>`);

	string regexHandler(Captures!(string) match)
	{
		// Turns an opening tag like "<b>" into its closing form "</b>".
		string insertSlash(in string tag)
		in
		{
			assert(tag.length, "Argument must contain one or more characters");
		}
		do
		{
			return tag[0 .. 1] ~ "/" ~ tag[1 .. $];
		}

		bool allowed = false;
		foreach (tag; allowedTags)
		{
			if (tag == match.hit || insertSlash(tag) == match.hit)
			{
				allowed = true;
				break;
			}
		}
		// Keep allowed tags unchanged; drop everything else.
		return allowed ? match.hit : "";
	}

	return input.replaceAll!(regexHandler)(regex);
}

unittest
{
	assert(stripTags("<html><b>bold</b></html>") == "bold");
	assert(stripTags("<html><b>bold</b></html>", ["<html>"]) == "<html>bold</html>");
}
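
Note that the pattern in stripTags() only matches bare tags, so something like <a href="x"> passes through untouched. If the goal is to display untrusted text safely rather than to strip markup, escaping is usually the simpler route. Below is a minimal sketch of that idea. escapeHtml is a hypothetical helper of my own naming, not part of sanival or Phobos, and it assumes the result is placed in HTML text or a quoted attribute value only (not inside script, style, or URL contexts).

```d
import std.array : appender;

/// Hypothetical helper (not from sanival): escapes the five characters
/// that are significant in HTML text and quoted attribute values.
string escapeHtml(string input)
{
    auto output = appender!string();
    // Iterate over UTF-8 code units; multi-byte sequences pass through unchanged.
    foreach (char c; input)
    {
        switch (c)
        {
            case '&':  output.put("&amp;");  break;
            case '<':  output.put("&lt;");   break;
            case '>':  output.put("&gt;");   break;
            case '"':  output.put("&quot;"); break;
            case '\'': output.put("&#39;");  break;
            default:   output.put(c);        break;
        }
    }
    return output.data;
}

unittest
{
    assert(escapeHtml(`<b onclick="x()">Tom & Jerry</b>`)
        == `&lt;b onclick=&quot;x()&quot;&gt;Tom &amp; Jerry&lt;/b&gt;`);
}
```

It makes a single pass and needs no regex, so it also sidesteps the std.regex performance concern mentioned above.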
