How to Download High-Resolution Images with a Python Crawler - 代码天地

Category: Other | 2019-04-15 10:41:26

Writing the code

Writing the spider

Extract the photo-set links from the featured-images page

detail_urls = response.xpath("//ul[@class='content']/li/a/@href").getall()
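
To see what this XPath selects without running a spider, here is a minimal sketch that applies the same path to a small sample. It swaps Scrapy's selector for the standard library's `xml.etree.ElementTree`, which only works on well-formed markup. The sample HTML and its URLs are hypothetical and do not come from the real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample of a listing page (not the real site markup).
SAMPLE = """
<html><body>
  <ul class="content">
    <li><a href="/photo/series/1.html">Set A</a></li>
    <li><a href="/photo/series/2.html">Set B</a></li>
  </ul>
  <ul class="other"><li><a href="/ignored.html">Ignored</a></li></ul>
</body></html>
"""


def extract_detail_urls(markup):
    """Return the href of every <a> under <li> inside <ul class="content">."""
    root = ET.fromstring(markup.strip())
    links = root.findall(".//ul[@class='content']/li/a")
    return [a.get("href") for a in links if a.get("href")]


detail_urls = extract_detail_urls(SAMPLE)
print(detail_urls)
```

Only links inside the `content` list are returned; the link in the second list is skipped because its `class` does not match.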

Handle the next page of the featured-images listing

next_page = response.xpath("//div[@class='pageindex']/a[last()-1]/@href").get()
if next_page:
    yield scrapy.Request(url=response.urljoin(next_page), callback=self.parse)
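
The `a[last()-1]` step picks the second-to-last link in the pager, which is presumably the "next page" link. The sketch below illustrates that selection on a hypothetical pager, again using the standard library (`xml.etree.ElementTree` and `urllib.parse.urljoin`) instead of Scrapy. The markup, the URLs, and the helper name `next_page_url` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical pager markup: the second-to-last link is "Next", the last is "Last".
PAGER = """
<body>
  <div class="pageindex">
    <a href="/pic/list_1.html">Prev</a>
    <a href="/pic/list_2.html">2</a>
    <a href="/pic/list_3.html">3</a>
    <a href="/pic/list_3.html">Next</a>
    <a href="/pic/list_9.html">Last</a>
  </div>
</body>
"""


def next_page_url(page_url, markup):
    """Return the absolute URL of the second-to-last pager link, or None."""
    root = ET.fromstring(markup.strip())
    link = root.find(".//div[@class='pageindex']/a[last()-1]")
    href = link.get("href") if link is not None else None
    if not href:
        return None  # no pager, or the link has no href: stop paginating
    return urljoin(page_url, href)


print(next_page_url("https://example.com/pic/list_2.html", PAGER))
```

Returning `None` when no link is found mirrors the `if next_page:` guard in the spider, so the crawl simply stops when there is nothing left to follow.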

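Extract the list-mode link from the photo-set page

# Requires "import re" at the top of the spider module.
list_pattern = response.xpath("//*[@id='cMode']/div/div[@class='side']/script").get()  # the <script> element that contains the list-mode URL
list_pattern = re.findall(r"/photolist/.*?\.html", list_pattern)[0]  # match the list-mode URL (literal dot, non-greedy)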
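
The regular-expression step can be tried in isolation. In the sketch below the script text is a made-up example, and the helper name `find_list_mode_path` is hypothetical. It escapes the dot before `html` and uses a non-greedy `.*?`, so the match stops at the first `.html` instead of running on to a later one, and it returns `None` rather than raising when nothing matches.

```python
import re

# Hypothetical script text; the real page's script content is not shown in the article.
SCRIPT = (
    '<script>var listUrl = "/photolist/series/123/456.html"; '
    'var other = "/x/index.html";</script>'
)

# Literal dot before "html", non-greedy body so the match ends at the first ".html".
LIST_MODE_RE = re.compile(r"/photolist/.*?\.html")


def find_list_mode_path(script_text):
    """Return the first /photolist/....html path in the text, or None."""
    if not script_text:
        return None
    match = LIST_MODE_RE.search(script_text)
    return match.group(0) if match else None


print(find_list_mode_path(SCRIPT))
```

With a greedy `.*` and an unescaped dot, the same input would match all the way to the second `.html`, which is why the stricter pattern is used here.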

Download the high-resolution images from list mode

category = response.xpath("//div[@class='mini_left']/a[last()-1]/text()").get()
image_urls = response.xpath("//ul[@id='imgList']/li/a/img/@src").getall()
image_urls = [url.replace("t_", "") for url in image_urls]  # drop "t_" from each thumbnail URL to get the full-size image
image_urls = [response.urljoin(url) for url in image_urls]  # make every URL absolute
yield CarhomehdItem(category=category, image_urls=image_urls)
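
The thumbnail-to-full-size conversion is plain string handling, so it can be sketched without Scrapy. The example below uses `urllib.parse.urljoin` in place of `response.urljoin`; the page URL, the image paths, and the helper name `to_full_size_urls` are assumptions for illustration only.

```python
from urllib.parse import urljoin


def to_full_size_urls(page_url, thumb_srcs):
    """Strip "t_" from each thumbnail src and make the result absolute."""
    urls = []
    for src in thumb_srcs:
        full = src.replace("t_", "")  # same rule as in the spider
        urls.append(urljoin(page_url, full))
    return urls


# Hypothetical inputs: one protocol-relative src and one root-relative src.
page = "https://example.com/photolist/series/1/2.html"
thumbs = ["//img.example.com/upload/t_abc.jpg", "/upload/t_def.jpg"]
print(to_full_size_urls(page, thumbs))
```

Note that `str.replace` removes every occurrence of `t_` in the URL, not only the file-name prefix, so a path such as `/best_of/t_x.jpg` would also lose the `t_` inside `best_of`. If the site's URLs can contain that sequence elsewhere, a stricter rule would be needed.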

Write an ItemPipeline to save the images

import os

from scrapy.pipelines.images import ImagesPipeline

# IMAGES_STORE is assumed to be imported from the project's settings module.


class ImagePipeline(ImagesPipeline):
    def get_media_requests(self, item, info):
        # Attach the item to every image request so that file_path() can read its category.
        request_objs = super().get_media_requests(item, info)
        for request_obj in request_objs:
            request_obj.item = item
        return request_objs

    def file_path(self, request, response=None, info=None, *, item=None):
        # Newer Scrapy releases also pass the item as a keyword argument.
        path = super().file_path(request, response, info)  # default: "full/<hash>.jpg"
        category = request.item.get("category")
        category_path = os.path.join(IMAGES_STORE, category)
        os.makedirs(category_path, exist_ok=True)  # also creates IMAGES_STORE if it is missing
        image_name = path.replace("full/", "")
        return os.path.join(category_path, image_name)
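
The path logic in `file_path` can be checked on its own. The sketch below reproduces it as a standalone function, assuming the default path has the form `full/<hash>.jpg`; the function name `category_image_path`, the category, and the file name are hypothetical, and a temporary directory stands in for `IMAGES_STORE`.

```python
import os
import tempfile


def category_image_path(image_store, category, default_path):
    """Map a default "full/<hash>.jpg" path to <image_store>/<category>/<hash>.jpg."""
    category_path = os.path.join(image_store, category)
    os.makedirs(category_path, exist_ok=True)  # safe if the folder already exists
    image_name = default_path.replace("full/", "")
    return os.path.join(category_path, image_name)


# A temporary directory stands in for IMAGES_STORE in this demonstration.
with tempfile.TemporaryDirectory() as store:
    saved = category_image_path(store, "SUV", "full/0a1b2c.jpg")
    print(os.path.relpath(saved, store))
```

Using `os.makedirs(..., exist_ok=True)` avoids the failure that `os.mkdir` raises when the parent directory does not exist yet or when two requests try to create the same category folder.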

Reposted from blog.csdn.net/qwertyuiopasdfgg/article/details/89295703
