Basic Usage of the BeautifulSoup Library for Python Web Scraping

Programming Languages 2018-10-23 12:21:07

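# Python 2 (the urllib2 module exists only in Python 2)
import urllib
import urllib2
from bs4 import BeautifulSoup

url = 'http://www.someserver.com/cgi-bin/register.cgi'
values = {}
values['name'] = 'Michael Foord'
values['location'] = 'Northampton'
values['language'] = 'Python'

data = urllib.urlencode(values)  # URL-encode the fields into a query-string-style request body
req = urllib2.Request(url, data)  # passing data to the Request object makes this a POST request
response = urllib2.urlopen(req)  # returns a file-like object
the_page = response.read()
soup = BeautifulSoup(the_page, "html.parser")  # build the BeautifulSoup object from the page content; soup holds the page source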
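
The request above is written for Python 2. As a minimal sketch of the same steps under Python 3, where `urllib2` was split into `urllib.request` and `urllib.parse`, the block below builds the same form data and `Request` object without sending anything over the network. The helper name `build_register_request` is hypothetical and exists only for illustration.

```python
import urllib.parse
import urllib.request


def build_register_request(url, values):
    """Hypothetical helper: build a POST request from a dict of form fields."""
    # urlencode() yields a str such as 'name=Michael+Foord&...';
    # Python 3 requires the request body to be bytes, hence encode().
    data = urllib.parse.urlencode(values).encode("utf-8")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(url, data=data)


url = "http://www.someserver.com/cgi-bin/register.cgi"
values = {
    "name": "Michael Foord",
    "location": "Northampton",
    "language": "Python",
}

req = build_register_request(url, values)
print(req.get_method())   # POST
print(req.data.decode())  # name=Michael+Foord&location=Northampton&language=Python
```

Passing `req` to `urllib.request.urlopen` would send the request; that call is left out here so the sketch runs offline.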
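Once the BeautifulSoup object is built, the find() and find_all() functions let you filter a large HTML document down to just the content you want by matching on tag attributes.
for line in soup.find_all('a'):
    url_name = line.get('href')  # URL of the <a> tag
    Title = line.get_text().strip()  # text content of the <a> tag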
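
To make the filtering step concrete, here is a small sketch that runs `find()` and `find_all()` against an inline HTML string rather than a live page. The markup, the `nav` class name, and the URLs are made up for illustration; only `find`, `find_all`, `get`, and `get_text` come from the text above.

```python
from bs4 import BeautifulSoup

# Made-up sample markup standing in for the_page.
html = """
<html><body>
  <div class="nav">
    <a href="/home"> Home </a>
    <a href="/about">About</a>
  </div>
  <a>No link here</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first match (or None); attributes narrow the search.
nav = soup.find("div", class_="nav")

links = []
# find_all() returns every match; here only <a> tags inside the nav block.
for line in nav.find_all("a"):
    url_name = line.get("href")        # None if the tag has no href attribute
    title = line.get_text().strip()    # tag text with surrounding whitespace removed
    links.append((url_name, title))

print(links)  # [('/home', 'Home'), ('/about', 'About')]

# href=True keeps only <a> tags that actually carry an href attribute.
print(len(soup.find_all("a")), len(soup.find_all("a", href=True)))  # 3 2
```

Because `get('href')` returns `None` for a tag without that attribute, filtering with `href=True` up front is one way to avoid handling missing links inside the loop.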


Reposted from blog.51cto.com/weadyweady/2307779
