Python Crawler Series (3.7: Using bs4 to Scrape Guizhou Agricultural Product Data) - 代码天地


Category: Other, 2018-11-10 03:04:00

I. Steps for scraping the data

1. Target site URL
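The request URL used in the code in this post carries two query parameters, `areaid` and `page`. A minimal sketch of building that URL for a given page follows. The helper name `build_url` is hypothetical, and it is only an assumption that later pages are reached by incrementing `page`; the post itself only requests page 1.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL used in the implementation in this post.
BASE_URL = 'http://www.gznw.gov.cn/priceInfo/getPriceInfoByAreaId.jx'


def build_url(area_id, page):
    """Hypothetical helper: build the listing URL for one area and one page."""
    query = urlencode({'areaid': area_id, 'page': page})
    return BASE_URL + '?' + query


# Reproduces the URL hard-coded in the implementation.
print(build_url(22572, 1))
```

Keeping the parameters separate from the endpoint makes it easier to loop over pages later, if the site does paginate this way.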

2. Implementation

import requests
from bs4 import BeautifulSoup


class Food(object):
    def __init__(self):
        self.url = 'http://www.gznw.gov.cn/priceInfo/getPriceInfoByAreaId.jx?areaid=22572&page=1'
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36',
        }

    def get_html(self):
        """
        Fetch the page.
        :return: the HTML text, or '' if the status code is not 200
        """
        response = requests.get(url=self.url, headers=self.headers)
        if response.status_code == 200:
            return response.text
        return ''

    def down_data(self):
        """
        Extract the price rows from the page.
        :return: a list of dicts with the keys name, price, address, time
        """
        # get_html must be called; passing the bound method itself to
        # BeautifulSoup raises an error.
        soup = BeautifulSoup(self.get_html(), 'lxml')
        table = soup.find('table', attrs={'class': 'table table-hover'})
        # The table is missing when the request failed or the layout changed.
        if table is None:
            return []
        trs = table.find('tbody').find_all('tr')
        food_list = []
        for tr in trs:
            tds = tr.find_all('td')
            # Skip rows that do not have all five expected cells.
            if len(tds) < 5:
                continue
            food_dict = {
                'name': tds[0].get_text(),
                'price': tds[1].get_text(),
                'address': tds[3].get_text(),
                'time': tds[4].get_text(),
            }
            food_list.append(food_dict)
        return food_list


if __name__ == "__main__":
    foo = Food()
    print(foo.down_data())
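The parsing step can be tried without touching the network. The sketch below runs the same table lookup against a small inline HTML string. The sample rows are made up for illustration and are not data from the site, the function name `parse_rows` is hypothetical, and the built-in `html.parser` is used instead of `lxml` only so that the example has one fewer dependency.

```python
from bs4 import BeautifulSoup

# Made-up sample markup that mimics the table structure the post relies on.
# The third cell is not read by the post's code, so it is a placeholder here.
SAMPLE_HTML = """
<table class="table table-hover">
  <tbody>
    <tr><td>Cabbage</td><td>2.50</td><td>-</td><td>Market A</td><td>2018-11-09</td></tr>
    <tr><td>Potato</td><td>3.00</td><td>-</td><td>Market B</td><td>2018-11-09</td></tr>
  </tbody>
</table>
"""


def parse_rows(html):
    """Hypothetical helper: return one dict per table row with five cells."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', attrs={'class': 'table table-hover'})
    if table is None:
        return []
    rows = []
    for tr in table.find_all('tr'):
        tds = tr.find_all('td')
        if len(tds) < 5:
            continue
        rows.append({
            'name': tds[0].get_text(strip=True),
            'price': tds[1].get_text(strip=True),
            'address': tds[3].get_text(strip=True),
            'time': tds[4].get_text(strip=True),
        })
    return rows


for row in parse_rows(SAMPLE_HTML):
    print(row)
```

Separating parsing from fetching in this way lets the extraction logic be checked against saved HTML before running it on the live page.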


Reposted from blog.csdn.net/qq_40925239/article/details/83863215
