Some BeautifulSoup Methods

Other 2018-10-08 20:33:38

1. First, install BeautifulSoup:

pip3 install BeautifulSoup4
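
To confirm that the install worked, a quick check along the following lines can help. This is only a minimal sketch: it assumes the package was installed into the same interpreter that runs the script, and the tiny `<p>hello</p>` fragment is made up for illustration.

```python
# The package is installed as "BeautifulSoup4" but imported as "bs4".
import bs4
from bs4 import BeautifulSoup

# Show which version of the library was picked up.
print(bs4.__version__)

# Parse a tiny fragment with the parser bundled with Python's standard library.
soup = BeautifulSoup("<p>hello</p>", "html.parser")
print(soup.p.text)  # → hello
```

If the import fails with ModuleNotFoundError, pip3 most likely installed the package for a different Python interpreter than the one running the script.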

2. Parse an HTML string and try some common methods:

from bs4 import BeautifulSoup
s = '''
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
<script>alert(123)</script>
'''
bs = BeautifulSoup(s, "html.parser")

# Print the parsed document; unclosed tags are closed automatically

print(bs)

# Get the text content of all tags

print(bs.text)

# Treat every tag as an element and list them all, outer tags before the tags nested inside them

print(bs.find_all())

# Find all <a> tags

print(bs.find_all("a"))

# Find all <body> tags; <body> is never closed in s, but it is closed automatically

print(bs.find_all("body"))

# Get the href value of each <a> tag

for tag in bs.find_all("a"):
    print(tag.get("href"))

# Print the name of every tag, and remove the <script> and <link> tags

for tag in bs.find_all():
    print(tag.name)
    if tag.name in ["script", "link"]:
        tag.decompose()  # remove the <script> or <link> tag from the tree

# Print the document as a string after those tags have been removed

print(str(bs))

# Print the text content after those tags have been removed

print(bs.text)
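
The steps above (finding tags, reading attributes, removing unwanted tags) can also be wrapped into small reusable functions. The following is a rough sketch rather than a definitive implementation: the helper names `extract_links` and `strip_tags` and the shortened `SAMPLE` string are hypothetical and introduced here only for illustration; the BeautifulSoup calls themselves (`find_all`, `get_text`, `decompose`) are the ones used in the walkthrough or documented by the library.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample document used in the walkthrough (illustrative only).
SAMPLE = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>
</p>
<script>alert(123)</script>
"""


def extract_links(html):
    """Return (link text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True keeps only the <a> tags that actually carry an href attribute.
    return [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a", href=True)]


def strip_tags(html, unwanted=("script", "link")):
    """Return the document as a string with the unwanted tags removed."""
    soup = BeautifulSoup(html, "html.parser")
    # Passing a list to find_all() matches any of the listed tag names.
    for tag in soup.find_all(list(unwanted)):
        tag.decompose()
    return str(soup)


links = extract_links(SAMPLE)
print(links)

cleaned = strip_tags(SAMPLE)
print("<script" in cleaned)  # → False
```

Each helper parses the HTML again instead of sharing one soup object, so removing tags in `strip_tags` cannot affect the result of `extract_links`.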

Reposted from www.cnblogs.com/fangsheng/p/9756866.html
