[Study Notes] Python 3 Crawler: Baidu Images


Other | 2020-04-13 20:50:50

import os
import random
import re
import time
from urllib import parse

import requests
from fake_useragent import UserAgent

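class BaiduImgSpider(object):
    def __init__(self):
        # Search URL template; {} is filled with the percent-encoded keyword
        self.baseurl = 'https://image.baidu.com/search/index?tn=baiduimage&word={}'
        # Running counter used to name the saved files 1.jpg, 2.jpg, ...
        self.count = 1
        self.ua = UserAgent()
        # Root directory for downloads; one subdirectory is created per keyword
        self.savepath = '/home/user/work/spider/day03/'
        # Captures each thumbnail URL from the JSON embedded in the page
        self.re_str = r'{"thumbURL":"(.*?)","replaceUrl":'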

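    def get_html(self, name, orgname):
        # name is the percent-encoded keyword, orgname the raw keyword
        header = {'User-Agent': self.ua.random}
        url = self.baseurl.format(name)
        html = requests.get(url=url, headers=header, timeout=10).text
        pattern = re.compile(self.re_str, re.S)
        img_list = pattern.findall(html)
        # Save into a subdirectory named after the keyword
        path = self.savepath + orgname
        # makedirs also creates missing parent directories
        os.makedirs(path, exist_ok=True)
        for img_link in img_list:
            print(img_link)
            self.save_img(img_link, path)
            # Pause 1 or 2 seconds between downloads
            time.sleep(random.randint(1, 2))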

    def save_img(self, url, path):
        header = {'User-Agent': self.ua.random}
        # .content is the raw response body (the image bytes)
        data = requests.get(url=url, headers=header, timeout=10).content
        filename = os.path.join(path, str(self.count) + '.jpg')
        with open(filename, 'wb') as f:
            f.write(data)
        print('Download succeeded', filename)
        self.count += 1

    def run(self):
        search_name = input('Enter the name to search for> ')
        # Percent-encode the keyword so it is safe inside the query string
        word = parse.quote(search_name)
        self.get_html(word, search_name)


if __name__ == '__main__':
    spider = BaiduImgSpider()
    spider.run()
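
The extraction step in `get_html` can be tried offline, without hitting Baidu. The sketch below applies the same pattern as `re_str` to a made-up sample string; `sample_html`, `THUMB_PATTERN` and `extract_thumb_urls` are hypothetical names introduced only for illustration, and the real page layout may differ or change over time.

```python
import re

# Hypothetical fragment shaped like the JSON embedded in the results page.
# The URLs are invented for this example.
sample_html = (
    '{"thumbURL":"https://example.com/a.jpg","replaceUrl":[]},\n'
    '{"thumbURL":"https://example.com/b.jpg","replaceUrl":[]}'
)

# Same pattern as re_str in the spider, with the leading brace escaped
# to make it explicit that it is matched literally.
THUMB_PATTERN = re.compile(r'\{"thumbURL":"(.*?)","replaceUrl":', re.S)


def extract_thumb_urls(html):
    """Return every captured thumbURL value, in order of appearance."""
    return THUMB_PATTERN.findall(html)


print(extract_thumb_urls(sample_html))
```

The non-greedy `(.*?)` stops at the first `","replaceUrl":` that follows, which is why each record yields exactly one URL instead of one long match spanning several records.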

The code is posted as-is, without further explanation; it is very simple.
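
One detail worth seeing in isolation is how `run` prepares the keyword: `parse.quote` percent-encodes it as UTF-8 before it is placed in the search URL. A minimal sketch, where `BASE_URL` copies the spider's `baseurl` and `build_search_url` is a hypothetical helper name:

```python
from urllib import parse

# Same template as self.baseurl in the spider.
BASE_URL = 'https://image.baidu.com/search/index?tn=baiduimage&word={}'


def build_search_url(keyword):
    """Percent-encode the keyword (UTF-8) and insert it into the template."""
    return BASE_URL.format(parse.quote(keyword))


print(build_search_url('猫'))
```

Non-ASCII characters become `%XX` byte sequences and spaces become `%20`, so the keyword cannot break the query string.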


Reposted from www.cnblogs.com/nightnine/p/12693731.html



