Python: an unconventional way to extract content inside HTML comments from page source

Suppose the page source contains a commented-out fragment like this:

<!-- <span class="flag">experience new template</span> -->

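If we need to extract the content inside <!-- -->, BeautifulSoup's tag searches such as find() and find_all() will not match it, because the parser treats everything between <!-- and --> as a comment rather than as tags.

We can work around this by preprocessing the page source: replace every <!-- with an empty string, so the commented-out markup is parsed as ordinary tags.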
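To see the effect of this replacement without fetching a page, here is a small offline sketch. It uses Python's built-in html.parser.HTMLParser rather than BeautifulSoup, so it runs with the standard library alone. The sample markup and the TagCollector helper are assumptions made for illustration and do not come from the original page.

```python
from html.parser import HTMLParser

# Sample markup (an assumption for illustration): the span sits inside a comment.
SAMPLE = b'<div><!-- <span class="flag">experience new template</span> --></div>'


class TagCollector(HTMLParser):
    """Hypothetical helper: records the start tags and comments it encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []      # (tag name, attribute dict) for each start tag
        self.comments = []  # raw body of each comment

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))

    def handle_comment(self, data):
        self.comments.append(data)


def collect(markup: bytes) -> TagCollector:
    """Parse the given bytes and return the collector with its findings."""
    parser = TagCollector()
    parser.feed(markup.decode("utf-8"))
    parser.close()
    return parser


before = collect(SAMPLE)                        # untouched source
after = collect(SAMPLE.replace(b"<!--", b""))   # comment openers stripped

print([name for name, _ in before.tags])  # span is hidden inside the comment
print([name for name, _ in after.tags])   # span is now parsed as a real tag
```

In the untouched source the parser reports the span only as comment text; once the opener is stripped, it shows up as a normal start tag with class "flag".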

import requests
from bs4 import BeautifulSoup

# url and headers are assumed to be defined earlier
res3 = requests.get(url, headers=headers, timeout=(10, 60)).content  # raw response bytes
html1 = res3
html = html1.replace(b'<!--', b'')  # strip the comment openers from the page source
soup = BeautifulSoup(html, 'html.parser')

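After this step, BeautifulSoup can find the span with class="flag" as usual. The closing --> is left behind as plain text, which does not affect the search.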
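As a point of comparison, BeautifulSoup also exposes comments as Comment objects, so the commented-out markup can be re-parsed without editing the raw source. The sketch below is an alternative to the replacement trick, not the method described in this post, and the sample markup is an assumption for illustration.

```python
from bs4 import BeautifulSoup, Comment

# Sample markup (an assumption for illustration).
SAMPLE = '<div><!-- <span class="flag">experience new template</span> --></div>'

soup = BeautifulSoup(SAMPLE, "html.parser")

# Comment is a string subclass, so filter the string nodes by type.
comments = soup.find_all(string=lambda text: isinstance(text, Comment))

flags = []
for comment in comments:
    # Parse the comment body as its own small document.
    inner = BeautifulSoup(comment, "html.parser")
    flags.extend(inner.find_all("span", class_="flag"))

print([span.get_text() for span in flags])
```

This leaves the rest of the page untouched, which may matter if other comments contain markup you do not want to expose to later searches.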

Origin blog.csdn.net/stkj007/article/details/104067626