Filtering XSS attacks in Django

 

XSS (cross-site scripting) is a common attack, and the bugs that enable it are easy for developers to miss or overlook. Django takes this into account: auto-escaping is enabled by default in templates, so HTML special characters in variables are escaped on output. What is escaping? It means converting the characters that have special meaning in HTML into entities. For example, <div> is an HTML tag; if you want the literal text <div> to appear on the page, the page source must contain &lt;div&gt;. If escaping is turned off, the value is written into the page as-is and the browser interprets it as markup.
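The escaping described above can be sketched with Python's standard library. This is only an illustration of the entity conversion, using html.escape as a stand-in for what Django's template auto-escaping does; the sample strings are made up for the example.

```python
from html import escape

# Characters with special meaning in HTML are replaced by entities,
# so the browser shows them as text instead of parsing them as markup.
tag = "<div>"
link = '<a href="x">Tom & Jerry</a>'

escaped_tag = escape(tag)
escaped_link = escape(link)

print(escaped_tag)   # &lt;div&gt;
print(escaped_link)  # &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```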

  For example, suppose the comment box is a plain text input rather than a rich text editor, and users type the content themselves. A user might enter something like the following:

 

Here is my comment, <script>alert('xss injection');</script>

 

  Suppose the template renders it with {{ comment|safe }}. Because the safe filter disables escaping for that value, the script runs and a dialog box pops up. This is XSS injection, and it must not be allowed in a real project. The usual reason for reaching for safe is to display HTML formatting in user content, so the input has to be cleaned first.

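  Django provides helper functions for this in the django.utils.html module. Note that remove_tags (and the matching removetags template filter) was deprecated in Django 1.8 and removed in Django 1.10, so this import only works on older versions:

from django.utils.html import escape, strip_tags, remove_tags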

  

 

For example, use the strip_tags function to remove HTML tags from a string:

# import strip_tags
from django.utils.html import strip_tags

# a simple string with HTML inside
html = '<p>paragraph</p>'
print(html)  # prints: <p>paragraph</p>
stripped = strip_tags(html)
print(stripped)  # prints: paragraph

  

The same is available as a template filter:

{{ somevalue|striptags }}

  

To remove only specific tags, use remove_tags (Django versions before 1.10 only):

from django.utils.html import remove_tags

html = '<strong>Bold...</strong><p>paragraph....</p>'
stripped = remove_tags(html, 'strong')  # removes only the strong tags
stripped2 = remove_tags(html, 'strong p')  # removes both the strong and p tags

The same applies to templates:

{{ value|removetags:"a span"|safe }}

  

Here is a lazier approach:

The lxml library provides a clean_html function, which cleans the content into safe HTML with the following code.

from lxml.html.clean import clean_html
html = clean_html(html)

  

