Python3: for "Sona", I spent 3 minutes downloading all the LoL heroes.

1. Introduction

Xiao Diaosi: Brother Yu, the Chinese New Year is coming. Up for a round of LoL?
Xiaoyu: No, I want to study...

Xiao Diaosi: Are you telling me the final joke of the Year of the Ox?
Xiaoyu: The final one? Hey~ Thanks, bro, you just reminded me!
Xiao Diaosi: ... You're thanking me for that? Wow...

The Chinese New Year is just around the corner. For the final battle of the Year of the Ox, let's do something pleasant: download the pictures of the goddess Sona and use them as a screensaver.

2. Hands-on coding

2.1 Webpage Analysis

Ideas:

  • 1. Open the official LoL website and find the URL that lists all heroes
  • 2. Check the URL of each hero to find the pattern

It's as simple as that.
First, let's open the official LoL website and find the URL that lists all heroes:

As you can see, the hero list is served from hero_list.js:

https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js?ts=2739020

Next, let's look at the URL of each hero:

Sona

https://game.gtimg.cn/images/lol/act/img/js/hero/37.js?ts=2739020

Leona

https://game.gtimg.cn/images/lol/act/img/js/hero/89.js?ts=2739021

Therefore, we can conclude that each hero's URL is built by splicing in its heroId: 37 for Sona, 89 for Leona.
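As a minimal sketch of that pattern, the snippet below builds a hero URL from a heroId. The `hero` and `heroId` keys are the ones the full script in this post reads from hero_list.js; the helper names `hero_url` and `hero_urls` and the trimmed-down sample payload are made up for illustration, and the `ts` query parameter is left out just as the full script leaves it out.

```python
import json

# URL template inferred from the two examples above (37.js and 89.js).
HERO_URL = "https://game.gtimg.cn/images/lol/act/img/js/hero/{hero_id}.js"


def hero_url(hero_id):
    """Return the detail URL for one hero, given its heroId."""
    return HERO_URL.format(hero_id=hero_id)


def hero_urls(hero_list_text):
    """Turn a hero_list.js payload into a list of per-hero detail URLs."""
    heroes = json.loads(hero_list_text)["hero"]
    return [hero_url(hero["heroId"]) for hero in heroes]


# Hypothetical, trimmed-down payload; the real file has more fields per hero.
sample = '{"hero": [{"heroId": "37"}, {"heroId": "89"}]}'
for url in hero_urls(sample):
    print(url)
```

Keeping the template in one place means that if the site changes its path, only `HERO_URL` has to be updated.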

2.2 Hands-on coding

2.2.1 Module installation

Since there are many heroes in LoL, downloading all of them in a single thread would take a long time.

Xiao Diaosi: Brother Yu, let's go with multiple threads!
Xiaoyu: Multithreading has gone home for the Chinese New Year. Today we'll try a different approach.
Xiao Diaosi: City folks really know how to have fun. So which approach is it today?
Xiaoyu: Coroutines.
Xiao Diaosi: Oh, nice, that's something fresh.
Xiaoyu: It's the New Year, so I have to try something different.

We've drifted off topic~ Seeing that the highways are toll-free, I couldn't resist going for a drive.


pip install gevent requests

Other ways to install:

"Python3: let Python install third-party libraries automatically, and say goodbye to pip!"
"Python3: I only use one line of code to import all Python libraries!"

2.2.2 Difference between process, coroutine and thread

The differences:

  • A process is the unit of resource allocation; the threads inside it are what actually execute the code
  • A thread is the unit of operating system scheduling
  • Switching between processes is expensive and less efficient than switching between threads. Processes use the most resources, threads use fewer, and coroutines use fewer still
  • Coroutines depend on a thread, and threads depend on a process: when the process dies, its threads die, and when a thread dies, its coroutines die too
  • In general, threads are used more often than multiple processes. If the threads make many network requests, they spend most of their time blocked waiting on the network, and in that case coroutines are a better fit
  • Multiple processes and multiple threads may run in parallel, depending on the number of CPU cores, but coroutines live inside one thread, so they are concurrent rather than parallel
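To make the last point concrete, here is a small sketch of several coroutines sharing one thread while their waits overlap. It deliberately uses the standard library's asyncio rather than gevent, so it runs without extra installs; `fake_download` and `run_all` are hypothetical names, and the sleep only stands in for waiting on the network.

```python
import asyncio
import threading
import time


async def fake_download(name, seconds):
    """Pretend to download something; sleeping stands in for network wait."""
    await asyncio.sleep(seconds)  # yields control to the other coroutines
    return name, threading.get_ident()


async def run_all():
    # Three coroutines scheduled on the same event loop, hence the same thread.
    return await asyncio.gather(
        fake_download("a", 0.2),
        fake_download("b", 0.2),
        fake_download("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(run_all())
elapsed = time.perf_counter() - start

thread_ids = {thread_id for _, thread_id in results}
print(f"threads used: {len(thread_ids)}, elapsed: {elapsed:.2f}s")
```

The three waits overlap, so the total time is close to the longest single wait instead of the sum of all three, even though only one thread is involved.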

2.2.3 Code Examples


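# -*- coding:utf-8 -*-
# @Time   : 2022-01-29
# @Author : carl_DJ
'''
Download the skins of every League of Legends hero.
'''

# Patch the standard library first so that blocking I/O cooperates with gevent;
# this should run before requests is imported.
from gevent import monkey
monkey.patch_all()

import datetime
import os
import re

import gevent
import requests

# Request headers
header = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
}
# Download directory (raw string, so the backslashes are kept literally)
data_path = r'D:\Project\英雄皮肤'

# Create the directory if it does not exist yet
def mkdir(path):
    os.makedirs(path, exist_ok=True)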

# Crawl the hero list, then the skins of every hero
def crawling():
    start_time = datetime.datetime.now()
    print(f'Start time: {start_time}')
    # URL of the hero list
    url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
    # Response content
    response = requests.get(url=url, headers=header)
    heros = response.json()['hero']

    task_list = []
    for index, hero in enumerate(heros, start=1):
        # Get the heroId
        heroId = hero['heroId']
        # Each hero's URL is built from its heroId
        hero_url = f'https://game.gtimg.cn/images/lol/act/img/js/hero/{heroId}.js'
        hero_resp = requests.get(url=hero_url, headers=header)
        skins = hero_resp.json()['skins']
        # Run get_pic(skins) as a coroutine so that downloads overlap
        task = gevent.spawn(get_pic, skins)
        task_list.append(task)
        # Wait for every batch of 10 tasks, and for the final partial batch
        if len(task_list) == 10 or index == len(heros):
            gevent.joinall(task_list)
            task_list = []
    end_time = datetime.datetime.now()
    print(f'End time: {end_time}')
    print(f'Total run time: {end_time - start_time}')

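# Work out the folder, file name and URL of every skin
def get_pic(skins):
    for skin in skins:
        # Folder name: hero name + hero title
        dir_name = skin['heroName'] + '_' + skin['heroTitle']
        # Picture name: the skin name with the hero title removed
        pic_name = ''.join(skin['name'].split(skin['heroTitle'])).strip()
        url = skin['mainImg']

        # Skip entries that have no main image
        if not url:
            continue
        # Remove characters that are not allowed in Windows file names
        invalid_chars = r'[\\/:*?"<>|]'
        pic_name = re.sub(invalid_chars, '', pic_name)
        # Download the picture
        download(dir_name, pic_name, url)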

# Download one picture
def download(dir_name, pic_name, url):

    print(f'Downloading {pic_name}: {url}')
    # Create the hero's folder under data_path
    dir_path = os.path.join(data_path, dir_name)
    os.makedirs(dir_path, exist_ok=True)

    # Fetch the image
    resp = requests.get(url, headers=header)
    # Write the image into the folder
    with open(os.path.join(dir_path, f'{pic_name}.png'), 'wb') as f:
        f.write(resp.content)

    print(f'{pic_name} downloaded')
    # finish_time = datetime.datetime.now()
    # print(f'Finish time: {finish_time}')


if __name__ == '__main__':
    mkdir(data_path)
    crawling()
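Two details of the script can be checked in isolation, without touching the network: cleaning a skin name so it is a valid Windows file name, and splitting work into groups of 10 so that the last, smaller group is not skipped. The sketch below is only an illustration; `safe_name` and `batches` are hypothetical helper names that do not appear in the script, and the sample skin name is made up.

```python
import re

# Characters that Windows does not allow in file names.
INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def safe_name(name):
    """Strip characters that would make an invalid Windows file name."""
    return INVALID_CHARS.sub("", name).strip()


def batches(items, size):
    """Yield consecutive slices of at most `size` items, including the last partial one."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


print(safe_name('K/DA: "ALL OUT"?'))
print([len(batch) for batch in batches(list(range(23)), 10)])
```

Slicing by position makes the final partial batch fall out naturally, which avoids having to compare a running counter against the total inside the loop.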


Results

Zoom in and see the goddess.

3. Summary

That's all for today's post.
Today we downloaded images in batches, mainly with coroutines.
This post does not say much about how to use gevent itself,
but that is just how Xiaoyu does things:
Xiaoyu will write a dedicated article on the differences between coroutines, threads, and processes, so that after reading it you will understand them properly.


Origin blog.csdn.net/wuyoudeyuer/article/details/122739041