A simple scraper for photos of streamers (小姐姐)

Environment

Windows 10, Python

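Start with a plain request to fetch the page's HTML, from which we will extract the data we need.
import requests
url = 'https://www.douyu.com/g_yz'
# Fetch the category page and keep its HTML as text.
response = requests.get(url=url)
html = response.text
print(html)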
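Scroll down through the output until you reach the entries that contain .jpg URLs, as in the screenshot below.
[Image: screenshot of the printed HTML showing the .jpg entries]
Next, pull them out with a simple regular expression.
import re
# Group 1 captures the room title ("rn"), group 2 the cover image URL ("rs1").
title_url = re.findall(r'"rn":"(.*?)","rpos":0,"rs1":"(.*?)"', html)
for title, one_url in title_url:
    print(title + "==================" + one_url)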

The screenshot below shows the result.
[Image: screenshot of the extracted titles and URLs]
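
To see what the pattern captures, here is a minimal sketch run against a hand-written fragment. The sample string, room names, and example.com URLs are made up for illustration; they only imitate the "rn"/"rpos"/"rs1" field layout that the regex expects, and the real page may differ. The helper name extract_pairs is also hypothetical.

```python
import re

# Same pattern as in the article: group 1 is the room title ("rn"),
# group 2 is the cover image URL ("rs1").
PATTERN = re.compile(r'"rn":"(.*?)","rpos":0,"rs1":"(.*?)"')

# Hypothetical fragment, NOT real page data.
SAMPLE = (
    '{"rid":1,"rn":"room A","rpos":0,"rs1":"https://example.com/a.jpg","uid":7},'
    '{"rid":2,"rn":"room B","rpos":0,"rs1":"https://example.com/b.jpg","uid":8}'
)


def extract_pairs(html):
    """Return a list of (title, image_url) tuples found in html."""
    return PATTERN.findall(html)


if __name__ == "__main__":
    for title, one_url in extract_pairs(SAMPLE):
        print(title, "->", one_url)
```

Because `.*?` is non-greedy, each group stops at the first closing quote, so neighbouring entries do not run together. Note that the pattern matches `"rpos":0` literally, so an entry with any other rpos value would be skipped.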

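Downloading a single image looks like this:
# Open the target file in binary mode and write the raw response bytes into it.
with open('一贫如洗的直播间 5695362.jpg','wb') as f:
    resp = requests.get(url='https://rpic.douyucdn.cn/live-cover/appCovers/2020/06/21/5695362_20200621173529_big.jpg/dy2').content
    f.write(resp)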

Below is the image that was saved:
[Image: the saved cover image]
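
The download code writes whatever bytes come back, so a failed request could end up saved as an unreadable .jpg file. As a hedged sketch, one way to guard against that is to check for the JPEG signature (the bytes FF D8 FF at the start of the data) before writing. The helper names looks_like_jpeg and save_if_jpeg are hypothetical, and if the server returns another image format this check would reject it.

```python
# Every JPEG file starts with the start-of-image marker FF D8 followed by FF.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(data):
    """Return True if data begins with the JPEG signature."""
    return data[:3] == JPEG_MAGIC


def save_if_jpeg(data, path):
    """Write data to path only when it looks like a JPEG; return True on success."""
    if not looks_like_jpeg(data):
        return False
    with open(path, "wb") as f:
        f.write(data)
    return True
```

With requests, the bytes passed in as data would be the `.content` of the image response, as in the single-image example above.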

The same approach works inside a loop:
for title, one_url in title_url:
    # Use the room title as the file name and save the image bytes.
    with open(title + '.jpg', 'wb') as f:
        resp = requests.get(url=one_url).content
        f.write(resp)
    print(title + '======================saved')

Output:
[Image: screenshot of the console output]
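
The loop uses each room title directly as a file name. On Windows, characters such as \ / : * ? " < > | are not allowed in file names, so a title containing one of them would make open() fail. A minimal sketch of a sanitizer follows; the name safe_filename and the "untitled" fallback are assumptions for illustration, not part of the original script.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
_INVALID = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, fallback="untitled"):
    """Replace invalid characters with '_' and trim spaces and dots at both ends."""
    cleaned = _INVALID.sub("_", title).strip(" .")
    # An empty result (for example a title made only of dots) gets the fallback name.
    return cleaned or fallback
```

In the loop, this would mean opening `safe_filename(title) + '.jpg'` instead of `title + '.jpg'`. Two different titles can map to the same cleaned name, in which case the later file overwrites the earlier one.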

The finished result:

[Image: the finished result]
The optimized source code is as follows:

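import os
import re
import time

import requests

url = 'https://www.douyu.com/g_yz'
response = requests.get(url=url)
html = response.text

# Each match is a (room title, cover image URL) pair.
title_url = re.findall(r'"rn":"(.*?)","rpos":0,"rs1":"(.*?)"', html)

# Create the output folder if it does not exist yet, then work inside it.
os.makedirs('小姐姐', exist_ok=True)
os.chdir('小姐姐')
for title, one_url in title_url:
    with open(title + '.jpg', 'wb') as f:
        resp = requests.get(url=one_url).content
        f.write(resp)
    print(title + '======================saved')
    time.sleep(0.5)  # pause briefly between requests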
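
In the script above, a single failed request or bad file name stops the whole run. As a hedged sketch, the loop could be wrapped in a function that records failures and keeps going. The name download_all and its fetch parameter are assumptions made for illustration: fetch is any callable that takes a URL and returns bytes, which keeps the file handling separate from the network code.

```python
import time
from pathlib import Path


def download_all(pairs, out_dir, fetch, delay=0.5):
    """Save each (title, url) pair as out_dir/<title>.jpg.

    fetch is any callable that takes a URL and returns bytes.
    Returns the list of titles that could not be saved.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failed = []
    for title, url in pairs:
        try:
            data = fetch(url)
            (out / f"{title}.jpg").write_bytes(data)
        except Exception as exc:
            # Broad on purpose: one bad URL or file name should not stop the loop.
            print(f"{title}: failed ({exc})")
            failed.append(title)
        else:
            print(f"{title}: saved")
        time.sleep(delay)  # pause between requests, as in the original script
    return failed
```

With requests, the call might look like `download_all(title_url, '小姐姐', lambda u: requests.get(u, timeout=10).content)`, where the 10-second timeout is an arbitrary choice.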

Reposted from blog.csdn.net/A728848944/article/details/108009311