Installing and Using the BeautifulSoup Library on macOS

Introduction to BeautifulSoup


BeautifulSoup is a powerful third-party Python library that parses HTML and extracts information from it.

Installing BeautifulSoup


  • Open a terminal and enter the command:
pip3 install beautifulsoup4
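
To confirm that the install worked, one option is to check from Python whether the bs4 package is importable. The snippet below is a minimal sketch; the variable name installed is only illustrative and is not part of the original post.

```python
import importlib.util

# The beautifulsoup4 distribution installs a package named "bs4".
# find_spec returns None when the package cannot be found.
spec = importlib.util.find_spec("bs4")
installed = spec is not None

if installed:
    import bs4
    print("bs4 version:", bs4.__version__)
else:
    print("bs4 not found; try running: pip3 install beautifulsoup4")
```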

A Quick Test of BeautifulSoup


  • Use the requests library to fetch the source code of the demo page (stored in the variable demo):
>>> import requests
>>> r = requests.get("http://python123.io/ws/demo.html")
>>> r.text
'<html><head><title>This is a python demo page</title></head>\r\n<body>\r\n<p class="title"><b>The demo python introduces several python courses.</b></p>\r\n<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:\r\n<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\r\n</body></html>'
>>> demo = r.text
  • Import the BeautifulSoup library:
>>> from bs4 import BeautifulSoup
>>> 
  • Parse the HTML with BeautifulSoup:
>>> demo = r.text
>>> soup = BeautifulSoup(demo,'html.parser')
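>>> # Note: prettify is a method. Without parentheses, print shows the bound method's repr (as below), not indented HTML; call soup.prettify() to get the formatted text.
>>> print(soup.prettify)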
<bound method Tag.prettify of <html><head><title>This is a python demo page</title></head>
<body>
<p class="title"><b>The demo python introduces several python courses.</b></p>
<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
<a class="py1" href="http://www.icourse163.org/course/BIT-268001" id="link1">Basic Python</a> and <a class="py2" href="http://www.icourse163.org/course/BIT-1001870001" id="link2">Advanced Python</a>.</p>
</body></html>>
>>> 
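
The transcript above prints the prettify method object rather than calling it. As a minimal sketch of the intended call, the example below parses a shortened, inline copy of the demo page (an assumption made so that no network access is needed) and calls soup.prettify() with parentheses, which returns the document as an indented string.

```python
from bs4 import BeautifulSoup

# Shortened inline copy of the demo page, used here instead of a network request.
demo = (
    '<html><head><title>This is a python demo page</title></head>'
    '<body><p class="title"><b>The demo python introduces several '
    'python courses.</b></p></body></html>'
)

soup = BeautifulSoup(demo, "html.parser")

# prettify() returns a str with one tag or text node per line, indented by depth.
pretty = soup.prettify()
print(pretty)
```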

How to Use the BeautifulSoup Library

  • Code skeleton:
from bs4 import BeautifulSoup
soup = BeautifulSoup('<p>data</p>','html.parser')
  • The two arguments to BeautifulSoup are:
    • The first is the HTML content to parse.
    • The second is the name of the parser to use.
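
To illustrate what the parsed object is useful for, here is a small sketch that passes an HTML string and the 'html.parser' parser name to BeautifulSoup, then reads the page title and the links. The HTML is a shortened inline copy of the demo page, and the names html, title, and links are illustrative choices, not from the original post.

```python
from bs4 import BeautifulSoup

# First argument: the HTML to parse (a shortened inline copy of the demo page).
html = (
    '<html><head><title>This is a python demo page</title></head><body>'
    '<p class="course">'
    '<a href="http://www.icourse163.org/course/BIT-268001" class="py1" '
    'id="link1">Basic Python</a> and '
    '<a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" '
    'id="link2">Advanced Python</a>.'
    '</p></body></html>'
)

# Second argument: the parser name; 'html.parser' ships with Python.
soup = BeautifulSoup(html, "html.parser")

# Text inside the <title> tag.
title = soup.title.string

# (link text, href) pairs for every <a> tag, in document order.
links = [(a.get_text(), a.get("href")) for a in soup.find_all("a")]

print(title)
for text, href in links:
    print(text, "->", href)
```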

Reposted from www.cnblogs.com/031602523liu/p/9824907.html