Python: Web Scraping Practice, Crawling a Novel (Beginner)

Category: Other. Posted 2019-07-29 11:08:14

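import requests
from pyquery import PyQuery as pq
from urllib.parse import urljoin

BASE_URL = 'https://www.biqugexsw.com/'
# Raw string, so the backslashes in the Windows path are kept literally.
OUTPUT_PATH = r'F:\python\小说\1.txt'

def get_content(url):
    """Download one chapter and append its text to OUTPUT_PATH."""
    response = requests.get(url)
    response.encoding = 'gbk'  # decode the page as GBK
    doc = pq(response.text)
    html = str(doc('#content.showtxt'))
    # Turn <br/> pairs into newlines, then strip the script tag, site footer,
    # non-breaking spaces and the wrapping <div>.
    text = (html.replace("&#13;<br/>&#13;<br/>", "\n")
            .replace('<br/><br/>', '\n')
            .replace('<script>chaptererror();</script><br/>　请记住本书首发域名：www.biqugexsw.com。笔趣阁小说网手机版阅读网址：m.biqugexsw.com</div>', '')
            .replace('\xa0', '')
            .replace('<div id="content" class="showtxt">', ''))
    # Append the chapter text to the output file.
    with open(OUTPUT_PATH, 'a+', encoding='utf-8') as file:
        file.write(text + '\n')

def get_mulu():
    """Read the table of contents ("mulu") and fetch every chapter."""
    index_url = 'https://www.biqugexsw.com/75_75362/'  # can be replaced with another book's page
    response = requests.get(index_url)
    response.encoding = response.apparent_encoding
    doc = pq(response.text)
    urls = doc('div.listmain a')
    for i in urls.items():
        chapter_url = urljoin(BASE_URL, i.attr.href)  # URL of each chapter
        get_content(chapter_url)
        print("Fetched successfully")

get_mulu()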
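The chained `.replace()` calls in `get_content` only work while the page markup matches those strings exactly. As a rough sketch of a more tolerant cleanup, the function below does the same job with the standard-library `re` and `html` modules. The name `clean_chapter_html` and the sample fragment are hypothetical, and the site-specific footer sentence would still need its own rule.

```python
import html
import re


def clean_chapter_html(fragment: str) -> str:
    """Turn the HTML of the chapter <div> into plain text (illustrative sketch)."""
    # Drop <script>...</script> blocks such as the chaptererror() call.
    text = re.sub(r"<script\b.*?</script>", "", fragment, flags=re.S | re.I)
    # Remove carriage-return entities left over from the page source.
    text = text.replace("&#13;", "")
    # Collapse each run of <br> tags (and the whitespace after it) into one newline.
    text = re.sub(r"(?:<br\s*/?>\s*)+", "\n", text, flags=re.I)
    # Remove whatever tags remain, e.g. the wrapping <div id="content" ...>.
    text = re.sub(r"<[^>]+>", "", text)
    # Decode entities such as &nbsp;, then drop the non-breaking spaces.
    text = html.unescape(text).replace("\xa0", "")
    return text.strip()


if __name__ == "__main__":
    sample = (
        '<div id="content" class="showtxt">&nbsp;&nbsp;First line.&#13;<br/>&#13;<br/>'
        "&nbsp;&nbsp;Second line.<br/><br/><script>chaptererror();</script><br/></div>"
    )
    print(clean_chapter_html(sample))
```

Because the function takes a string and returns a string, it can be checked against a saved copy of a chapter page without making any network requests.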

I have recently been learning web scraping, and as practice I crawled a novel from Biquge (笔趣阁).

To be improved:
　　Simulating browser access

　　Asynchronous crawling

　　Getting the book name

　　Regular expressions
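A few of these items can be sketched with the standard library alone. Everything named below is an assumption made for illustration: the `BROWSER_HEADERS` value, the guess that the book name sits in the first `<h1>` of the index page (this has to be checked against the real page), and the use of a thread pool in place of true asynchronous I/O.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional

# Hypothetical browser-like request headers (any realistic User-Agent would do).
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
}


def extract_book_name(index_html: str) -> Optional[str]:
    """Return the text of the first <h1>, assumed here to hold the book name."""
    match = re.search(r"<h1\b[^>]*>(.*?)</h1>", index_html, flags=re.S | re.I)
    if match is None:
        return None
    # Strip any nested tags inside the heading and trim whitespace.
    return re.sub(r"<[^>]+>", "", match.group(1)).strip()


def fetch_all(urls: Iterable[str], fetch: Callable[[str], str],
              max_workers: int = 4) -> List[str]:
    """Call fetch(url) for every URL on a small thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


if __name__ == "__main__":
    print(extract_book_name("<html><h1> Demo <b>Book</b> </h1></html>"))
    # A stand-in fetch function, so the example runs without network access.
    print(fetch_all(["a", "b", "c"], lambda url: url.upper(), max_workers=2))
```

With `requests`, the `fetch` argument could be something like `lambda url: requests.get(url, headers=BROWSER_HEADERS, timeout=10).text`. Since `fetch_all` returns the pages in the same order as the URLs, the chapters can still be written to the file in order afterwards.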

Reposted from www.cnblogs.com/liubingzhe/p/11262691.html