Downloading League of Legends Character Skins with Coroutines
1. Introduction
Xiao Diaosi: Brother Yu, Chinese New Year is almost here, and I still can't tear myself away from LoL!
Xiaoyu: Not me, I want to study...
Xiao Diaosi: Are you telling me a joke to close out the Year of the Ox?
Xiaoyu: Close it out?? Hey~ thanks, bro, you just reminded me!
Xiao Diaosi: ...So I even get thanked for this, meow...
Chinese New Year is just around the corner. For the final battle of the Year of the Ox, we have a pleasant and happy task: downloading pictures of the goddesses to use as wallpaper.
2. Code in Action
2.1 Webpage Analysis
Ideas:
- 1. Open the official LoL website and find the URL that returns the list of all heroes
- 2. Check each hero's URL and work out the pattern
It's as simple as that.
First, let's open the official LoL website and check the URL addresses of all the heroes.
As you can see, the hero list URL is hero_list.js:
https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js?ts=2739020
2. View the URL address of each hero
Sona
https://game.gtimg.cn/images/lol/act/img/js/hero/37.js?ts=2739020
Leona
https://game.gtimg.cn/images/lol/act/img/js/hero/89.js?ts=2739021
Therefore, we can conclude that each hero's URL is built by splicing in its heroId.
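Based on that pattern, a hero's detail URL can be built directly from its heroId. A minimal sketch (the helper name `hero_js_url` is ours, and the `ts` query parameter is dropped here on the assumption that it is only a cache-busting timestamp):

```python
# Build the per-hero JS URL from a heroId, following the pattern observed
# above. hero_js_url is a hypothetical helper for illustration; the ts
# query parameter is assumed to be a cache-buster and is omitted.
HERO_JS = 'https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js'

def hero_js_url(hero_id):
    """Return the detail URL for a given heroId."""
    return HERO_JS.format(hero_id)

print(hero_js_url(37))  # Sona
print(hero_js_url(89))  # Leona
```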
2.2 Code in Action
2.2.1 Module installation
Since LoL has so many heroes, downloading them all in a single thread would take us quite a while.
Xiao Diaosi: Brother Yu, let's go with multiple threads!
Xiaoyu: Multithreading has gone home for Chinese New Year. Today we'll do it a different way.
Xiao Diaosi: You city folk really know how to play. Which way is that??
Xiaoyu: Coroutines.
Xiao Diaosi: Oh hey, that works; something fresh.
Xiaoyu: It's New Year's, so a change of taste is in order.
Okay, we've drifted too far off topic; back to business.
Module installation
pip install gevent
Other ways to install:
"Python3: Have Python Install Third-Party Libraries Automatically and Say Goodbye to pip!!"
"Python3: Import All Python Libraries with a Single Line of Code!!"
2.2.2 Difference between process, coroutine and thread
The differences:
- A process is the unit of resource allocation; a thread is what actually executes the code.
- A thread is the unit of operating system scheduling.
- Switching between processes is expensive and less efficient than switching between threads; processes use the most resources, threads use fewer, and coroutines fewer still.
- A coroutine depends on a thread, and a thread depends on a process: if the process dies its threads die, and if a thread dies its coroutines die with it.
- Usually multiple processes are unnecessary and threads are used instead; but when a thread makes many network requests that may block, coroutines are the better fit.
- Multi-process and multi-threaded code may run in parallel, depending on the number of CPU cores, but coroutines all live in one thread, so they are concurrent rather than parallel.
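To see the "concurrent in one thread" point in practice, here is a minimal gevent sketch (our own toy example, not part of the downloader): three greenlets each sleep 0.1 s, yet the whole batch finishes in roughly 0.1 s, because the patched sleep yields control instead of blocking.

```python
import gevent
from gevent import monkey
monkey.patch_all()  # patch blocking stdlib calls so they yield to other greenlets

import time

def work(name, delay):
    time.sleep(delay)  # patched sleep: hands control to other greenlets
    return name

# spawn three greenlets; they interleave inside a single thread
start = time.time()
jobs = [gevent.spawn(work, f'job{i}', 0.1) for i in range(3)]
gevent.joinall(jobs)
elapsed = time.time() - start

print([job.value for job in jobs])
print(f'elapsed: about {elapsed:.2f}s')  # roughly 0.1s, not 0.3s
```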
2.2.3 Code Example
# -*- coding:utf-8 -*-
# @Time   : 2022-01-29
# @Author : carl_DJ
import gevent
from gevent import monkey
# patch blocking stdlib calls so network I/O yields to other greenlets;
# this must happen before requests is imported
monkey.patch_all()

import requests
import os
import re
import datetime

'''
Download the skins of every League of Legends champion
'''

# request header
header = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
}
# download path (raw string so the backslashes are not treated as escapes)
data_path = r'D:\Project\hero_skins'

# create the path if it does not exist yet
def mkdir(path):
    if not os.path.exists(path):
        os.mkdir(path)

# crawl the hero list and spawn a coroutine per hero
def crawling():
    start_time = datetime.datetime.now()
    print(f'Start time: {start_time}')
    # URL of the hero list
    url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
    # response content
    response = requests.get(url=url, headers=header)
    heros = response.json()['hero']
    index = 0
    task_list = []
    for hero in heros:
        index += 1
        # get the heroId
        heroId = hero['heroId']
        # each hero_url is built from the corresponding heroId
        hero_url = f'https://game.gtimg.cn/images/lol/act/img/js/hero/{heroId}.js'
        hero_resp = requests.get(url=hero_url, headers=header)
        skins = hero_resp.json()['skins']
        # spawn get_pic(skins) as a coroutine so the downloads run concurrently
        task = gevent.spawn(get_pic, skins)
        task_list.append(task)
        # run the coroutines in batches of 10, plus a final partial batch
        if len(task_list) == 10 or index == len(heros):
            gevent.joinall(task_list)
            task_list = []
    end_time = datetime.datetime.now()
    print(f'Finish time: {end_time}')
    print(f'Total elapsed: {end_time - start_time}')

# fetch the skin images of one hero
def get_pic(skins):
    for skin in skins:
        # folder name: heroName_heroTitle
        dir_name = skin['heroName'] + '_' + skin['heroTitle']
        # picture name: the skin name with the hero title stripped out
        pic_name = ''.join(skin['name'].split(skin['heroTitle'])).strip()
        url = skin['mainImg']
        if not url:
            continue
        # remove characters that Windows forbids in file names
        invalid_chars = r'[\\/:*?"<>|]'
        pic_name = re.sub(invalid_chars, '', pic_name)
        # download the image
        download(dir_name, pic_name, url)

# perform one download
def download(dir_name, pic_name, url):
    print(f'Downloading {pic_name}, {url}')
    # create the per-hero folder
    dir_path = os.path.join(data_path, dir_name)
    if not os.path.exists(dir_path):
        os.mkdir(dir_path)
    # fetch the image
    resp = requests.get(url, headers=header)
    # write the image into the folder
    with open(os.path.join(dir_path, f'{pic_name}.png'), 'wb') as f:
        f.write(resp.content)
    print(f'{pic_name} finished downloading')
    # finish_time = datetime.datetime.now()
    # print(f'Finished at: {finish_time}')

if __name__ == '__main__':
    mkdir(data_path)
    crawling()
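One detail in get_pic worth testing on its own is the picture-name cleanup: the hero title is cut out of the skin name, and characters that Windows forbids in file names are stripped. A standalone sketch of that logic (`clean_pic_name` is our own name for it, and the sample skin name is just an illustration):

```python
import re

# characters that Windows forbids in file names
INVALID_CHARS = r'[\\/:*?"<>|]'

def clean_pic_name(skin_name, hero_title):
    # cut the hero title out of the skin name, then drop forbidden characters
    name = ''.join(skin_name.split(hero_title)).strip()
    return re.sub(INVALID_CHARS, '', name)

print(clean_pic_name('K/DA Ahri', 'Ahri'))  # -> KDA
```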
Run results:
Zoom in and admire the goddesses.
3. Summary
That's all for today's sharing.
Today we batch-downloaded images, mainly by using coroutines.
This post doesn't say much about how gevent itself works,
but that's Xiaoyu's usual routine:
Xiaoyu will write a dedicated article on the differences between coroutines, threads, and processes, so that after reading it you'll understand them properly.