Crawler: Scraping Meizitu (妹子图) with lxml

Category: Other | 2019-02-22 20:26:21 | Views: 0

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/MR_HJY/article/details/81878849

import requests
from lxml import etree
import os

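# Download one image and save it under the local download/ directory.
def download_img(img_url, referer):
    headers = {
        # Send the album page as the Referer (presumably needed to get past
        # the site's hotlink protection).
        'Referer': referer,
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
    }
    # Create the target directory if it does not exist yet.
    os.makedirs('download', exist_ok=True)
    filename = os.path.join('download', img_url.split('/')[-1])
    response = requests.get(img_url, headers=headers, timeout=10)
    # Fail loudly instead of saving an error page as an image file.
    response.raise_for_status()
    with open(filename, 'wb') as f:
        f.write(response.content)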

# Walk every page of one album and download the image shown on each page.
def parse_detail_page(url_href):
    response = requests.get(url_href, timeout=10)
    html_element = etree.HTML(response.text)
    # The second-to-last pagination label is taken as the last page number
    # (the final label is presumably the "next page" link).
    max_page = html_element.xpath('//div[@class="pagenavi"]/a/span/text()')[-2]
    for i in range(1, int(max_page) + 1):
        # Strip a trailing slash so the page URL never contains "//".
        page_url = url_href.rstrip('/') + '/' + str(i)
        response = requests.get(page_url, timeout=10)
        html_element = etree.HTML(response.text)
        img_url = html_element.xpath('//div[@class="main-image"]/p/a/img/@src')[0]
        download_img(img_url, url_href)


# Collect the album links on the home page, then crawl each album in turn.
url = 'http://www.mzitu.com/'
response = requests.get(url, timeout=10)
html_element = etree.HTML(response.text)
href_list = html_element.xpath('//ul[@id="pins"]/li/a/@href')
for href in href_list:
    parse_detail_page(href)
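
The script above picks the last page number by position (`[-2]`), builds page URLs by string concatenation, and takes the file name from the last `/`-separated piece of the image URL. The following is a minimal sketch of those three steps as small standalone helpers that can be checked without touching the network. The helper names (`max_page_from_labels`, `build_page_urls`, `filename_from_url`) are hypothetical and not part of the original post, and choosing the largest numeric label instead of the second-to-last one is a swapped-in alternative, not the author's method.

```python
import posixpath
from urllib.parse import urlsplit


def max_page_from_labels(labels):
    """Return the largest numeric pagination label, or 1 if none is numeric.

    Assumption: `labels` is the list of strings returned by the pagination
    XPath, which may mix page numbers with text such as "next page".
    """
    numbers = [int(text) for text in labels if text.strip().isdecimal()]
    return max(numbers, default=1)


def build_page_urls(album_url, max_page):
    """Build album_url/1 ... album_url/max_page without doubling slashes."""
    base = album_url.rstrip('/')
    return [f'{base}/{i}' for i in range(1, max_page + 1)]


def filename_from_url(img_url):
    """Return the last path segment of the URL, ignoring any query string."""
    return posixpath.basename(urlsplit(img_url).path)
```

With these helpers, the loop in the detail-page function could iterate over `build_page_urls(url_href, max_page_from_labels(labels))`, and the download function could name the file with `filename_from_url(img_url)`. Whether the site's pagination labels actually look like the ones assumed here should be checked against a live page.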

Reposted from blog.csdn.net/MR_HJY/article/details/81878849