Scraping a website's img images with a Python crawler

Other 2018-10-22 20:41:24


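from getHtml import getHtmlWinthIp  # presumably the author's own helper module (not shown in this post); returns a page's HTML
from bs4 import BeautifulSoup
from urllib import request  # used to download and save the images
import os  # used to create the output folder
imgsrcl = []  # collects the src URL of every image found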
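def getD(url, no):
    html = getHtmlWinthIp(url)
    soup = BeautifulSoup(html, 'html.parser')

    # Locate the parent <ul> inside the element with id="content"
    parent = soup.find(id='content').find('ul')
    # Take at most `no` <li> items from it
    lis = parent.find_all('li', limit=no)

    for each in lis:
        # each.find('img').attrs is a dict of all attributes of the <img> tag
        src = each.find('img').attrs['src']  # read src from that dict
        imgsrcl.append(src)  # append to the global list


    # os.mkdir()        # create a folder
    # os.chdir()        # change the working directory
    # os.path.exists()  # check whether a folder already exists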
    
def store():
    # Create the output folder if it does not exist yet, then switch into it
    os.makedirs('范冰冰2', exist_ok=True)
    os.chdir('范冰冰2')

    # Download every collected image as 1.jpg, 2.jpg, ...
    for i, v in enumerate(imgsrcl):
        request.urlretrieve(v, str(i + 1) + '.jpg')
def main(n):
    # Each page lists 30 photos; `start` is the offset of the page's first photo
    for start in range(0, n, 30):
        url = 'https://movie.douban.com/celebrity/1050059/photos/?type=C&start=' + str(start) + '&sortby=like&size=a&subtype=a'
        print("Crawling page " + str(start // 30 + 1))
        # Take a full page, or only what is still needed on the last page
        getD(url, min(30, n - start))

    store()


# The argument is the number of images to crawl
if __name__ == '__main__':
    main(48)
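
The page arithmetic in main() can be checked on its own, without touching the network. The following is a minimal sketch, assuming 30 photos per page as implied by the start parameter in the URL above. `page_plan` is a hypothetical helper name introduced here for illustration; it is not part of the original script.

```python
def page_plan(n, page_size=30):
    """Return (start, count) pairs that together cover exactly n images.

    `start` is the offset to put in the URL's start parameter,
    `count` is how many <li> items to take from that page.
    """
    plan = []
    for start in range(0, n, page_size):
        # Full page, or only the remainder on the last page
        count = min(page_size, n - start)
        plan.append((start, count))
    return plan


if __name__ == '__main__':
    print(page_plan(48))  # [(0, 30), (30, 18)]
    print(page_plan(60))  # [(0, 30), (30, 30)]
```

Computing the remainder from the offset keeps the total at exactly n for any n, including multiples of 30, where a modulo test against the running index can request an extra page.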


Reposted from blog.csdn.net/qq_40243365/article/details/83003257
