What can be used to extract fragments from HTML?
BeautifulSoup can extract fragments, and it can also parse a complete HTML document.
How is BeautifulSoup used?
Install the BeautifulSoup library: pip install beautifulsoup4
The parser BeautifulSoup uses by default is not lenient enough with real-world HTML, so also install a parser library: pip install html5lib
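As a minimal sketch of how the two installs fit together, the helper below asks for html5lib and falls back to Python's built-in parser when html5lib is missing. The name make_soup is hypothetical and used only for illustration; FeatureNotFound is the exception bs4 raises when the requested parser is not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup):
    """Parse markup with html5lib when available, else the built-in parser."""
    try:
        return BeautifulSoup(markup, 'html5lib')
    except FeatureNotFound:
        # html5lib is not installed; fall back to the parser bundled with Python.
        return BeautifulSoup(markup, 'html.parser')


# The <p> tag is deliberately left unclosed to mimic sloppy real-world HTML.
soup = make_soup('<p>unclosed paragraph')
print(soup.p.get_text())
```

Both parsers recover the paragraph here; they differ mainly in how they repair more badly broken markup.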
The code below works with bs1.html, shown in the attached screenshot:
Implementation in code:
# BeautifulSoup operates on a string, so to parse an HTML file, first read its text into a string.
with open('bs.html', encoding='utf8') as f:
    html_doc = f.read()
# Import the library. html5lib does not need to be imported; BeautifulSoup loads it automatically.
from bs4 import BeautifulSoup
# Parse the document with html5lib.
# The first argument is the text to parse; the second names the parser library to use.
soup = BeautifulSoup(html_doc, 'html5lib')
# print(soup.title)  # print the first <title> tag
# print(soup.find('title'))  # same result, using find()
# print(soup.title.name)  # get the tag's name
# Get the tag's text
# print(soup.title.string)
# Alternatively:
# print(soup.title.get_text())
# Get the tag's parent tag
# print(soup.title.parent)
# print(soup.title.parent.name)
# Get the value of an element's attribute
# print(soup.div['id'])
# print(soup.p['style'])
# print(soup.a)  # finds only the first <a> tag
# print(soup.find_all('a'))  # finds all <a> tags
# print(soup.find_all('a')[1])  # picks the second <a> tag by index
# print(soup.find('a', id='link1'))  # finds the <a> tag with the given id attribute
# print(soup.find('a', href='http://example.com/lacie'))  # finds the <a> tag with the given href
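The lookups above can be tried without the original file. The sketch below is self-contained: the HTML string is a hypothetical stand-in for bs.html, not the author's actual file, and it uses the built-in 'html.parser' so that it runs even without html5lib (pass 'html5lib' instead to match the code above).

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the contents of bs.html.
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<div id="main">
<p style="color: red">Once upon a time there were three little sisters:
<a href="http://example.com/elsie" id="link1">Elsie</a>,
<a href="http://example.com/lacie" id="link2">Lacie</a> and
<a href="http://example.com/tillie" id="link3">Tillie</a>.</p>
</div>
</body></html>
"""

# 'html.parser' ships with Python; swap in 'html5lib' if it is installed.
soup = BeautifulSoup(html_doc, 'html.parser')

title_text = soup.title.get_text()      # text inside the first <title> tag
parent_name = soup.title.parent.name    # name of the tag enclosing <title>
div_id = soup.div['id']                 # attribute lookup by key
p_style = soup.p['style']
all_links = soup.find_all('a')          # every <a> tag, in document order
second_link = all_links[1]              # index into the result list
by_id = soup.find('a', id='link1')      # first <a> whose id matches
by_href = soup.find('a', href='http://example.com/lacie')

print(title_text, parent_name, div_id, p_style)
print(len(all_links), second_link['id'], by_id.get_text(), by_href.get_text())
```

Note that find() returns a single tag (or None when nothing matches), while find_all() always returns a list.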
WebDriver offers eight basic ways to locate an element:
By the element's id attribute: find_element_by_id()
By the element's name attribute: find_element_by_name()
By the element's class attribute: find_element_by_class_name()
By the element's tag name: find_element_by_tag_name()
By a link's full text: find_element_by_link_text()
By part of a link's text (fuzzy match): find_element_by_partial_link_text()
By an XPath expression: find_element_by_xpath()
By a CSS selector: find_element_by_css_selector()
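In Selenium 4 these find_element_by_* helpers were deprecated, and they were removed in version 4.3.0 in favour of driver.find_element(by, value). The sketch below maps each of the eight names onto the strategy string that find_element expects; these strings are the values of Selenium's By.ID, By.NAME and so on. The helper find and the table LOCATORS are hypothetical names introduced only for illustration.

```python
# Strategy strings accepted by find_element(); each equals the matching
# constant in selenium.webdriver.common.by.By (By.ID == 'id', and so on).
LOCATORS = {
    'find_element_by_id': 'id',
    'find_element_by_name': 'name',
    'find_element_by_class_name': 'class name',
    'find_element_by_tag_name': 'tag name',
    'find_element_by_link_text': 'link text',
    'find_element_by_partial_link_text': 'partial link text',
    'find_element_by_xpath': 'xpath',
    'find_element_by_css_selector': 'css selector',
}


def find(driver, legacy_name, value):
    """Hypothetical helper: run a legacy-style lookup via find_element(by, value)."""
    return driver.find_element(LOCATORS[legacy_name], value)
```

With a real driver, find(driver, 'find_element_by_id', 'kw') would amount to driver.find_element(By.ID, 'kw'), where 'kw' is an assumed element id.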