Parsing an XML file with bs4

How to use bs4 (BeautifulSoup 4) to parse an XML file

1. html.parser

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
BeautifulSoup takes two arguments: the first is the HTML text to parse, and the second names the parser to use. For HTML, html.parser is the parser that ships with Python's standard library, so bs4 can use it without installing anything extra.
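
To make the two arguments concrete, here is a minimal sketch. The sample markup and the variable names bold_text and para_text are made up for illustration and do not come from the original post.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = "<html><body><p>Hello, <b>world</b>!</p></body></html>"

# First argument: the markup to parse. Second argument: the parser name.
soup = BeautifulSoup(html, "html.parser")

bold_text = soup.b.get_text()  # text inside the first <b> tag
para_text = soup.p.get_text()  # all text inside the first <p> tag
print(bold_text)
print(para_text)
```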
2. lxml

soup = BeautifulSoup(html, "lxml")
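
The "lxml" parser only works when the third-party lxml package is installed. The sketch below assumes you want a fallback to html.parser when it is missing; the helper name make_soup and the sample markup are hypothetical.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample markup, used only for illustration.
html = "<p>Some <b>bold</b> text</p>"

def make_soup(markup):
    """Parse with lxml if it is installed, otherwise fall back to html.parser."""
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        # bs4 raises FeatureNotFound when the requested parser is unavailable.
        return BeautifulSoup(markup, "html.parser")

soup = make_soup(html)
first_bold = soup.find("b").get_text()
print(first_bold)
```

Note that "lxml" here is lxml's HTML parser. For real XML documents, bs4 also accepts "xml" as the second argument, which likewise requires lxml to be installed.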

Find all matching tags

a) Search by tag name

soup.find_all("b")

b) Search with a regular expression (requires import re)
soup.find_all(re.compile("^b"))

c) Search with a list of tag names
soup.find_all(["a", "b"])
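
The three search styles above can be compared side by side. This is a small sketch on made-up markup; the variable names by_name, by_regex and by_list are illustrative only.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = "<body><a href='#'>link</a><b>bold</b><p>para</p></body>"
soup = BeautifulSoup(html, "html.parser")

# a) exact tag name: only <b> tags
by_name = [tag.name for tag in soup.find_all("b")]

# b) regular expression: every tag whose name starts with "b" (<body> and <b>)
by_regex = [tag.name for tag in soup.find_all(re.compile("^b"))]

# c) list: every tag whose name is "a" or "b"
by_list = [tag.name for tag in soup.find_all(["a", "b"])]

print(by_name, by_regex, by_list)
```

Results come back in document order, so the regular expression search returns the enclosing body tag before the b tag nested inside it.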
 

Reference links:

https://www.cnblogs.com/gl1573/p/9480022.html
  

Origin www.cnblogs.com/i-shu/p/11487438.html