Scraping a Novel with Python (Python Source Included)

Other 2021-03-03 01:07:33

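import re                          # regular expressions, used to pull the needed fields out of the page text
from urllib.parse import urljoin   # resolves relative chapter links against the index URL

import requests                    # HTTP client, used to download the pages


def gettext(url):
    """Take a page URL and return the decoded text of that page."""
    r = requests.get(url, timeout=30)
    r.raise_for_status()
    # Use the encoding detected from the response body. The original assigned
    # the literal string 'apparent_encoding', which is not a codec name.
    r.encoding = r.apparent_encoding
    return r.text


def geturl(url):
    """Take the table-of-contents URL and return a list of chapter URLs."""
    text = gettext(url)
    chapter_info_list = re.findall(r'<li><a href="(.*?)">', text)
    # The first match is dropped, as in the original (presumably not a chapter link).
    # urljoin leaves absolute links unchanged and completes relative ones.
    return [urljoin(url, href) for href in chapter_info_list[1:]]


def getline(url):
    """Take a chapter URL and return a list: the title followed by the text lines."""
    text = gettext(url)
    # print(text, file=open("序章.txt", 'a', encoding='utf-8'))  # debug dump of the raw page
    title = re.findall(r'<h1>(.*?)</h1>', text)
    line = re.findall(r'<span class="calibre[2-9]">(.*?)</span>', text)
    return title + line


def my_print(line, my_name):
    """Write each string in line to the open file my_name, followed by a blank line."""
    for i in line:
        print(i + '\n', file=my_name)


def main():
    url = 'http://www.yuedu88.com/longzu1/'
    # Mode 'x' creates the file and raises FileExistsError if it already exists.
    with open("龙族.txt", 'x', encoding='utf-8') as my_file:
        for i in geturl(url):
            my_print(getline(i), my_file)


if __name__ == "__main__":
    main()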
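
The two regular expressions in the script can be tried without touching the network. The sketch below runs them on small in-memory HTML strings. The markup in `INDEX_HTML` and `CHAPTER_HTML` is hypothetical, written only to fit the patterns, and the helper names `parse_chapter_links` and `parse_chapter` are illustrative rather than part of the original script. Passing each link through `urljoin` is an assumption about the site: it covers the case where the index page uses relative links and leaves absolute links unchanged.

```python
import re
from urllib.parse import urljoin

# Hypothetical markup, shaped only to match the regular expressions in the script.
INDEX_HTML = (
    '<ul>'
    '<li><a href="/">Home</a></li>'
    '<li><a href="/longzu1/1.html">Chapter 1</a></li>'
    '<li><a href="2.html">Chapter 2</a></li>'
    '</ul>'
)
CHAPTER_HTML = (
    '<h1>Prologue</h1>'
    '<span class="calibre3">First paragraph.</span>'
    '<span class="calibre1">Skipped, because the class digit is outside 2-9.</span>'
    '<span class="calibre5">Second paragraph.</span>'
)


def parse_chapter_links(index_html, base_url):
    """Return chapter URLs from the index HTML, skipping the first list link."""
    hrefs = re.findall(r'<li><a href="(.*?)">', index_html)
    return [urljoin(base_url, href) for href in hrefs[1:]]


def parse_chapter(chapter_html):
    """Return the chapter title followed by its text spans."""
    title = re.findall(r'<h1>(.*?)</h1>', chapter_html)
    body = re.findall(r'<span class="calibre[2-9]">(.*?)</span>', chapter_html)
    return title + body


if __name__ == "__main__":
    for link in parse_chapter_links(INDEX_HTML, 'http://www.yuedu88.com/longzu1/'):
        print(link)
    for text in parse_chapter(CHAPTER_HTML):
        print(text)
```

Because the patterns are tied to exact tag and attribute text, any change in the site's markup makes them return empty lists. Checking the result length before writing the file is a cheap safeguard.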

2021-02-23 12:39:57

Reposted from blog.csdn.net/Infinity_07/article/details/113982240
