Install lxml, the parser engine: pip install lxml
soup = BeautifulSoup(html_doc, features="lxml")
tags = soup.select('#link2') looks tags up with a CSS selector and returns a list of matches (no space after the #)
tag.name gets the tag name
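A minimal sketch of the selector lookup, assuming a made-up two-link document. The html_doc string is hypothetical, and "html.parser" is used here only so the snippet runs without lxml installed; features="lxml" works the same way for this case.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, not from the original notes
html_doc = '<p><a id="link1" href="/a">A</a><a id="link2" href="/b">B</a></p>'
soup = BeautifulSoup(html_doc, features="html.parser")

matches = soup.select("#link2")      # select() always returns a list
tag = matches[0]
first = soup.select_one("#link2")    # select_one() returns one tag or None

print(tag.name)      # a
print(tag["href"])   # /b
```

select_one is convenient when only the first match is needed, since it avoids indexing into a list that might be empty.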
children: the direct child nodes; child tags (Tag) and text content (NavigableString) are different types
descendants: all descendant nodes, at every depth
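The difference between the two can be seen on a small tree. This is a sketch with a hypothetical snippet of HTML, parsed with "html.parser" to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

soup = BeautifulSoup("<div>text<p><b>bold</b></p></div>", "html.parser")
div = soup.div

children = list(div.children)          # direct children only: 'text', <p>
descendants = list(div.descendants)    # every level: 'text', <p>, <b>, 'bold'

for child in children:
    # Text nodes and tags are different classes, so check before using tag APIs
    kind = "tag" if isinstance(child, Tag) else "text"
    print(kind, repr(str(child)))
```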
clear: empties the tag's contents; the tag itself is kept
decompose: deletes the tag together with its contents; the tag is not kept
extract: removes the tag from the tree and returns it
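A short sketch comparing the three removal methods side by side. The three-paragraph document and its ids are invented for illustration.

```python
from bs4 import BeautifulSoup

html = '<div><p id="a">one</p><p id="b">two</p><p id="c">three</p></div>'
soup = BeautifulSoup(html, "html.parser")

soup.find(id="a").clear()               # contents gone, empty <p id="a"> stays
soup.find(id="b").decompose()           # tag and contents destroyed
removed = soup.find(id="c").extract()   # tag detached from the tree and returned

print(soup)      # <div><p id="a"></p></div>
print(removed)   # <p id="c">three</p>
```

extract is the one to use when the removed tag will be reused elsewhere; decompose is for discarding it.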
encode: converts the object to bytes; decode: converts the object to a string (str)
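For example, on a single hypothetical tag (default encoding is UTF-8):

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>hi</p>", "html.parser")
tag = soup.p

as_bytes = tag.encode()   # bytes, UTF-8 by default
as_text = tag.decode()    # str

print(type(as_bytes).__name__, as_bytes)   # bytes b'<p>hi</p>'
print(type(as_text).__name__, as_text)     # str <p>hi</p>
```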
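recursive=True (the default) makes find search all descendants; recursive=False searches direct children only
soup.find(class_='...'): class is written with a trailing underscore to avoid clashing with Python's class keyword; alternatively pass it through attrs, e.g. attrs={'class': '...'}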
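A sketch of both points, using an invented document with one direct and one nested paragraph:

```python
from bs4 import BeautifulSoup

html = ('<div id="root"><p class="intro">top</p>'
        '<section><p class="intro">nested</p></section></div>')
soup = BeautifulSoup(html, "html.parser")
root = soup.find(id="root")

all_p = root.find_all("p")                      # searches every level
direct_p = root.find_all("p", recursive=False)  # direct children only

by_kw = soup.find(class_="intro")               # trailing underscore form
by_attrs = soup.find(attrs={"class": "intro"})  # equivalent attrs form

print(len(all_p), len(direct_p))   # 2 1
print(by_kw.string)                # top
```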
. is a wildcard (in regular expressions): it matches any character except the newline \n
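Assuming this note refers to the regular-expression wildcard `.`, a sketch of its behavior and of passing a compiled pattern to find_all follows. The sample strings and headings are made up for illustration.

```python
import re
from bs4 import BeautifulSoup

dot_plain = bool(re.search("a.c", "abc"))                 # '.' matches 'b'
dot_newline = bool(re.search("a.c", "a\nc"))              # '.' does not match '\n'
dot_all = bool(re.search("a.c", "a\nc", re.DOTALL))       # DOTALL lifts that limit
print(dot_plain, dot_newline, dot_all)                    # True False True

soup = BeautifulSoup("<h1>x</h1><h2>y</h2><p>z</p>", "html.parser")
# A compiled pattern is matched against tag names: 'h' plus any one character
headings = [t.name for t in soup.find_all(re.compile("^h.$"))]
print(headings)   # ['h1', 'h2']
```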
tag.get('id') gets the value of the tag's id attribute (None if the attribute is missing)
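A brief sketch of attribute access on a hypothetical link tag, showing how get differs from square-bracket indexing when the attribute is absent:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<a id="link1" class="x y" href="/a">A</a>', "html.parser")
tag = soup.a

tag_id = tag.get("id")                 # 'link1'
missing = tag.get("title")             # None; tag["title"] would raise KeyError
fallback = tag.get("title", "n/a")     # default value when missing
classes = tag.get("class")             # class is multi-valued, so a list

print(tag_id, missing, fallback, classes)
```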
www.cnblogs.com/wupeiqi/articles/6283017.html
is_empty_element: whether the tag is an empty (void) or self-closing tag
tag.string can be used both to read and to modify the tag's text content
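Both attributes in a small sketch; the HTML fragment is invented for illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<div><br/><p>old</p></div>", "html.parser")

br_empty = soup.br.is_empty_element   # True: <br/> is a void tag with no contents
p_empty = soup.p.is_empty_element     # False: <p> holds text

before = str(soup.p.string)           # read the text content
soup.p.string = "new"                 # assign to replace it
after = str(soup.p)

print(br_empty, p_empty)   # True False
print(before, after)       # old <p>new</p>
```

Note that tag.string is only set when the tag has a single child; with mixed children it is None.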
Create a tag: obj = Tag(name='div', attrs={'id': 'it'}) (needs from bs4.element import Tag)
jquery.cuishifeng.cn: complete jQuery reference
tag.wrap(obj) wraps tag inside obj
tag.unwrap() removes the current tag but keeps the contents it wrapped
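Creating, wrapping, and unwrapping can be sketched together. This example builds the new tag with soup.new_tag rather than calling the Tag constructor directly, which is a swapped-in alternative and not the form used in the notes; the sample markup is hypothetical.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p><b>bold</b></p>", "html.parser")

# new_tag creates a tag bound to this soup; keyword arguments become attributes
obj = soup.new_tag("div", id="it")

soup.b.wrap(obj)              # <b> is now inside the new <div>
wrapped = str(soup)

soup.b.unwrap()               # <b> is removed, its text stays in place
unwrapped = str(soup)

print(wrapped)     # <p><div id="it"><b>bold</b></div></p>
print(unwrapped)   # <p><div id="it">bold</div></p>
```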