LaoYuanPython blog: https://blog.csdn.net/LaoYuanPython
I. Introduction
BeautifulSoup is an HTML parsing class provided by the third-party module bs4. You can think of it as a toolbox for parsing HTML. It is tolerant of malformed markup, so it can still recognize tags in HTML documents that are not well formed. This section assumes basic knowledge of HTML. If you are not familiar with HTML, please refer to the introduction in the previous chapter.
II. Installing BeautifulSoup, importing it, and creating objects
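2.1, Install BeautifulSoup and lxml
BeautifulSoup is a class in the bs4 module, and lxml is an HTML parser that BeautifulSoup can use. To install both, run the following commands at the operating system command line. The -i option tells pip to download from the Tsinghua PyPI mirror, and the bs4 package on PyPI is a thin wrapper that installs beautifulsoup4: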
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple bs4
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple lxml
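After running the two commands, it can be useful to confirm that both modules are visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name installed_modules is made up for this illustration and is not part of bs4 or lxml.

```python
import importlib.util


def installed_modules(names):
    """Return a dict mapping each top-level module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Check the two modules installed in this section.
status = installed_modules(["bs4", "lxml"])
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If either module is reported as missing, the most likely cause is that pip installed into a different Python environment than the one running the check.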
2.2, Import the module that provides BeautifulSoup
Because BeautifulSoup is a class provided by the bs4 module, it is usually imported like this:
from bs4 import BeautifulSoup
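As a quick check that the import works, and to illustrate the fault tolerance mentioned in the introduction, here is a small sketch. The HTML string is invented for this example, and it uses Python's built-in "html.parser" so that it runs even without lxml; passing "lxml" as the second argument would select the lxml parser instead.

```python
from bs4 import BeautifulSoup

# Deliberately sloppy HTML: the <p>, <b>, <body> and <html> tags are never closed.
broken_html = "<html><body><p>Hello, <b>world"

# The second argument names the parser; "html.parser" ships with Python.
soup = BeautifulSoup(broken_html, "html.parser")

# BeautifulSoup still builds a tree, so the unclosed tags can be accessed normally.
print(soup.p.get_text())
print(soup.b.get_text())
```

Different parsers may repair broken markup in slightly different ways, so the resulting tree can vary between "html.parser" and "lxml" for badly malformed input.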