Scraping the Douban Top 250 Books with Python

Other 2018-11-08 08:41:14

Scraper code

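# Douban Books Top 250

import requests
from bs4 import BeautifulSoup

# Douban may reject the default requests User-Agent, so send a browser-like one.
HEADERS = {'User-Agent': 'Mozilla/5.0'}

# The directory E:/douban_book must already exist; open() does not create it.
with open('E:/douban_book/douban_book.txt', 'a+', encoding='utf-8') as f:
    # The list has 10 pages of 25 books; "start" is the offset of the first book on a page.
    for page in range(10):
        url = 'https://book.douban.com/top250?start={}'.format(page * 25)
        r = requests.get(url, headers=HEADERS).text
        bsObj = BeautifulSoup(r, 'html.parser')
        # Each book's details sit in a <td valign="top"> that has no width attribute.
        td_tags = bsObj.find_all('td', {'valign': 'top', 'width': None})
        for td_tag in td_tags:
            try:
                name = td_tag.find('a').get_text().replace('\n', '').replace(' ', '')
                info = td_tag.find('p', {'class': 'pl'}).get_text()
                rating_nums = td_tag.find('div', {'class': 'star clearfix'}).get_text().replace('\n', '').replace(' ', '')
                jianjie = td_tag.find('span', {'class': 'inq'}).get_text()  # one-line quote
            except AttributeError:
                # find() returned None because a field is missing; skip this book.
                continue
            f.write(name + '\n' + info + '\n' + rating_nums + '\n' + jianjie + '\n\n')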
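The loop above skips a book entirely whenever one of its fields cannot be found. As a minimal sketch, assuming two hypothetical helpers (page_urls and format_entry are illustrative names, not part of the original script), URL building and record formatting could be separated so that a missing quote does not drop the whole book:

```python
BASE_URL = 'https://book.douban.com/top250?start={}'


def page_urls(pages=10, per_page=25):
    """Return one list URL per page; 'start' is the offset of the first book."""
    return [BASE_URL.format(page * per_page) for page in range(pages)]


def format_entry(name, info, rating, quote=None):
    """Join one book's fields into the text block written to the file.

    A missing quote is simply left out instead of discarding the book.
    """
    fields = [name, info, rating]
    if quote:
        fields.append(quote)
    return '\n'.join(fields) + '\n\n'


if __name__ == '__main__':
    urls = page_urls()
    print(urls[0])
    print(urls[-1])
    # Placeholder values, not real Douban data.
    print(format_entry('Book A', 'Author / Publisher / 2000', '9.0'), end='')
```

In the scraping loop, the quote could then be looked up separately and passed as None when the span is absent.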

If you get the error "No module named 'requests'",
install the package with: pip install requests

If you get the error "No module named 'bs4'" (BeautifulSoup is imported from the bs4 package),
install it with: pip install beautifulsoup4

To see which packages are installed and confirm that the installation succeeded, list them, for example with: pip list
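Installation can also be checked from Python itself. A minimal sketch using the standard-library importlib.metadata module (Python 3.8 or later); the helper name installed_version is an assumption made for illustration:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == '__main__':
    # Distribution names as used with pip, not import names.
    for package in ('requests', 'beautifulsoup4'):
        found = installed_version(package)
        print(package, found if found else 'not installed')
```

Note that the lookup uses the name given to pip (beautifulsoup4), not the import name (bs4).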

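Result of running the script: each book's title, publication info, rating, and one-line quote are appended to E:/douban_book/douban_book.txt, with a blank line between books.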
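The script writes each book as a few lines of text followed by a blank line. A minimal sketch of reading that layout back into Python, assuming a hypothetical helper parse_entries and made-up sample text (not real output from the script):

```python
def parse_entries(text):
    """Split the scraper's output into one list of lines per book.

    Books are separated by a blank line, as written by the script.
    """
    entries = []
    for block in text.split('\n\n'):
        lines = [line.strip() for line in block.split('\n') if line.strip()]
        if lines:
            entries.append(lines)
    return entries


# Made-up sample in the same layout: title, info, rating, quote, blank line.
sample = (
    'Title A\nAuthor A / Press / 2001\n9.1\nQuote A\n\n'
    'Title B\nAuthor B / Press / 2002\n8.9\nQuote B\n\n'
)

if __name__ == '__main__':
    for book in parse_entries(sample):
        print(book[0], '|', book[2])
```

To use it on the real file, the text would be read with open(..., encoding='utf-8') and passed to parse_entries.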

Reposted from blog.csdn.net/ahaotata/article/details/83622743
