3.2-Installation of Beautiful Soup

Beautiful Soup is a Python library for parsing HTML and XML. It makes it easy to extract data from web pages, and it offers a powerful API along with a choice of parsers. This section covers how to install it.

1. Related Links

2. Preparation

Beautiful Soup can work with several parsers. This section uses the lxml library for parsing HTML and XML, so please make sure lxml is installed before proceeding. Refer to the previous section for installation instructions.
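One way to check this prerequisite from Python is to ask the import system whether lxml can be located, without actually importing it. The following is a minimal sketch using the standard library's importlib.util.find_spec; the helper name is_installed is made up for illustration.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if the import system can locate a top-level module."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    # lxml is the parser library this section relies on.
    print("lxml available:", is_installed("lxml"))
```

If this prints False, install lxml first and then continue.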

3. pip installation

The current major version of Beautiful Soup is 4.x; development of the previous version has stopped. Installing with pip is recommended. The command is as follows:

pip3 install beautifulsoup4

Once the command finishes, the installation is complete.
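To confirm which version pip actually installed, the package metadata can be queried from Python. This is a small sketch, assuming Python 3.8 or later (where importlib.metadata is part of the standard library); the helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "beautifulsoup4" is the distribution name used in the pip command above.
    print("beautifulsoup4:", installed_version("beautifulsoup4"))
```

A result of None means pip did not install the package into the interpreter that is running this script.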

4. Wheel installation

Alternatively, you can download the wheel file from PyPI and install it yourself. The link is as follows: https://pypi.python.org/pypi/beautifulsoup4

Then use pip to install the wheel file.
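As a rough sketch, the wheel installation can also be driven from Python by calling pip through the current interpreter. The wheel path below is a placeholder, not a real file name, and the helper pip_install_command is made up for illustration; the actual install call is left commented out.

```python
import subprocess
import sys


def pip_install_command(wheel_path: str) -> list:
    """Build the command that installs a local wheel with this interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", wheel_path]


if __name__ == "__main__":
    # Hypothetical path: replace it with the wheel file you actually downloaded.
    wheel = "path/to/downloaded_beautifulsoup4.whl"
    print(" ".join(pip_install_command(wheel)))
    # Uncomment to run the installation:
    # subprocess.run(pip_install_command(wheel), check=True)
```

Using sys.executable with -m pip ensures the wheel goes into the same Python environment that runs the script.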

5. Verify the installation

After the installation is complete, you can run the following code to verify:

from bs4 import BeautifulSoup
soup = BeautifulSoup('<p>Hello</p>', 'lxml')
print(soup.p.string)

The output is as follows:

Hello

If your output matches, the installation was successful.

Note that although the package we installed is named beautifulsoup4, we import it as bs4. This is because the library folder inside the package's source code is named bs4. During installation, that folder is copied into the Python 3 library directory, so the name Python recognizes at import time is bs4.

In other words, the name used to install a package is not necessarily the same as the name used to import it.


Origin blog.csdn.net/wu347771769/article/details/84071117