Python practice project: crawl every skin of every Honor of Kings hero in about 20 lines

Introduction

   Everyone has played Honor of Kings (王者荣耀), and anyone who has not has surely heard of it: it is one of the hottest mobile MOBA games around. Ahem, but I digress. Today's goal is to crawl every skin of every hero in Honor of Kings, using only about 20 lines of Python code.
   The complete source code appears near the end of the article and can be copied and pasted as is.

Preparation

   Crawling the skins is not hard in itself; the difficulty lies in the analysis. First we need the URLs of the skin images, so without further ado, let's go straight to the official Honor of Kings website, pvp.qq.com.

   Open the hero data page, pick any hero, then press F12 to open the developer tools and find the address of the hero's original skin image.

   Next, switch to another skin of the same hero. The image address barely changes; only the number at the end is different. Here are the two skin image addresses side by side:

http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/523/523-bigskin-1.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/523/523-bigskin-2.jpg

   We can guess that all skin images of one hero share the same address and differ only in the trailing serial number. To confirm this, we can look at the complete skin set of a hero with more skins. Here I picked Sun Shangxiang; these are the addresses of all of her skin images:

http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-1.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-2.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-3.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-4.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-5.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-6.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-7.jpg
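
The seven addresses above differ only in the trailing number, so the pattern can be captured in a small helper. This is a minimal sketch: the names SKIN_URL_TEMPLATE, build_skin_url, hero_number and skin_index are made up for illustration, and the template is simply the observed address with the two varying parts replaced by placeholders.

```python
# Observed address pattern; {hero} is the number in the path (e.g. 523 or 111)
# and {index} is the trailing skin number.
SKIN_URL_TEMPLATE = (
    "http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/"
    "{hero}/{hero}-bigskin-{index}.jpg"
)


def build_skin_url(hero_number, skin_index):
    """Return the image address of one skin; skin numbering starts at 1."""
    if skin_index < 1:
        raise ValueError("skin numbering starts at 1")
    return SKIN_URL_TEMPLATE.format(hero=hero_number, index=skin_index)


# Reproduce the seven addresses listed above.
for index in range(1, 8):
    print(build_skin_url(111, index))
```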

   So we conclude that a hero's skin images are numbered in ascending order starting from 1. Next, how do we tell different heroes apart? You will notice that however the skin image changes, the address in the browser's address bar stays the same, so let's put the page URLs of two different heroes side by side:

https://pvp.qq.com/web201605/herodetail/523.shtml
https://pvp.qq.com/web201605/herodetail/111.shtml

   At first glance there seems to be no pattern, but one thing is clear: the final number determines which hero is shown. Let's call it the hero number. Unfortunately, the hero numbers themselves seem to follow no pattern. Not to worry; we go back to the official website for more clues.

   On the hero data page, open the F12 developer tools and inspect the network requests. There I found several files.

   Click Network, then XHR, and you will see these files. Their names make their purpose clear: they store the hero list information. Open one to take a look.
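   Sure enough, this is where the hero information is stored, including each hero's name, hero number, and other details. We can check how accurate it is: for example, Xiao Qiao's ename, that is, her hero number, is 106, so by our earlier reasoning her detail page should be: https://pvp.qq.com/web201605/herodetail/106.shtml
Trying it confirms that this is the case.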
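
The check above can be sketched in a few lines. The two records below are a hypothetical excerpt in the shape of herolist.json, limited to the ename and cname fields; the helper names find_ename and detail_url are made up for illustration.

```python
# Hypothetical excerpt of the hero list; the real file has many more
# records and more fields per record.
sample_herolist = [
    {"ename": 106, "cname": "小乔"},
    {"ename": 111, "cname": "孙尚香"},
]


def find_ename(herolist, cname):
    """Return the hero number for a hero name, or None if it is not listed."""
    for record in herolist:
        if record["cname"] == cname:
            return record["ename"]
    return None


def detail_url(ename):
    """Build the hero detail page address from a hero number."""
    return "https://pvp.qq.com/web201605/herodetail/{}.shtml".format(ename)


xiao_qiao_number = find_ename(sample_herolist, "小乔")
print(detail_url(xiao_qiao_number))
```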

   That completes the preparation. In fact, at this point the project is already half done; what remains is the code.

Code implementation

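   First we create a Python file and import the os and requests modules.
Following the steps above, we first need the hero list, that is, the herolist.json file. Its address is https://pvp.qq.com/web201605/js/herolist.json, which can be found in the developer tools.
So we fetch the hero list JSON from this address, parse it, and extract the useful information: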

url = 'https://pvp.qq.com/web201605/js/herolist.json'
herolist = requests.get(url)  # fetch the hero list JSON file

herolist_json = herolist.json()  # parse the response body as JSON
hero_name = list(map(lambda x: x['cname'], herolist_json))  # extract the hero names
hero_number = list(map(lambda x: x['ename'], herolist_json))  # extract the hero numbers
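
As a variation on the extraction above, the names and numbers can be kept together in one dictionary so they cannot drift out of step. This is only a sketch under assumptions: fetch_herolist and index_heroes are hypothetical names, the 10-second timeout is an arbitrary choice, and the sample records imitate the shape of herolist.json rather than quoting it.

```python
HEROLIST_URL = "https://pvp.qq.com/web201605/js/herolist.json"


def fetch_herolist(url=HEROLIST_URL, timeout=10):
    """Download and parse the hero list (not called in this example)."""
    # Imported here so the parsing helper below can be tried offline,
    # even where the third-party requests package is not installed.
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def index_heroes(records):
    """Map hero number (ename) to hero name (cname), skipping incomplete records."""
    return {
        record["ename"]: record["cname"]
        for record in records
        if "ename" in record and "cname" in record
    }


# Hypothetical records in the shape of herolist.json.
sample = [
    {"ename": 106, "cname": "小乔"},
    {"ename": 111, "cname": "孙尚香"},
    {"title": "record without the expected keys"},
]
heroes = index_heroes(sample)
print(sorted(heroes))
```

With real data, index_heroes(fetch_herolist()) would take the place of the sample list.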

   That gives us the hero names and numbers; you can print them to check.
With the hero numbers in hand, the rest is simple. We only need to assemble the URL, where hero_number stands for the number of one hero:
'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/' + hero_number + '/' + hero_number + '-bigskin-1.jpg'
This gives us a skin image for every hero. There is one problem, though: the number of skins varies. Some heroes have only two skins and others have six or seven, so we do not know the highest image number in advance. I used a rather crude approach here: let a variable run from 1 to 10, request each address in turn, and skip any address that does not return an image. This assumes that no hero has more than 10 skins, in which case it collects every image. Here is the code:

# Download the images
def downloadPic():
    i = 0
    for j in hero_number:
        # Create a folder for this hero
        os.mkdir("C:\\Users\\Administrator\\Desktop\\wzry\\" + hero_name[i])
        # Change into the new folder
        os.chdir("C:\\Users\\Administrator\\Desktop\\wzry\\" + hero_name[i])
        i += 1
        for k in range(1, 11):  # skin numbers start at 1
            # Build the image URL
            onehero_link = 'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/' + str(j) + '/' + str(
                j) + '-bigskin-' + str(k) + '.jpg'
            im = requests.get(onehero_link)  # request the URL
            if im.status_code == 200:
                with open(str(k) + '.jpg', 'wb') as f:  # write the image to a file
                    f.write(im.content)
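
The loop above requests every candidate number for every hero. If the skin numbers really are consecutive, as the addresses observed earlier suggest, the search could stop at the first missing image instead. The sketch below rests on that assumption; collect_skins, fetch and fake_fetch are hypothetical names, and fake_fetch merely stands in for a real HTTP request so the example runs offline.

```python
def collect_skins(hero_number, fetch, max_skins=10):
    """Return {skin_index: image_bytes}, stopping at the first missing index.

    fetch(url) must return the image bytes, or None if the image is missing.
    """
    skins = {}
    for index in range(1, max_skins + 1):
        url = (
            "http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/"
            "{0}/{0}-bigskin-{1}.jpg".format(hero_number, index)
        )
        content = fetch(url)
        if content is None:
            break  # assumes numbering has no gaps
        skins[index] = content
    return skins


def fake_fetch(url):
    """Stand-in for a real request: pretends only skins 1 and 2 exist."""
    if url.endswith("-bigskin-1.jpg") or url.endswith("-bigskin-2.jpg"):
        return b"image bytes"
    return None


found = collect_skins(523, fake_fetch)
print(sorted(found))
```

If the numbering turns out to have gaps, the exhaustive loop is the safer choice.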

   The implementation is simple, and the code comments explain each step. With the function in place, we only need to call it and the images will be downloaded. The complete code of the program is as follows:

import os
import requests

url = 'https://pvp.qq.com/web201605/js/herolist.json'
herolist = requests.get(url)  # fetch the hero list JSON file

herolist_json = herolist.json()  # parse the response body as JSON
hero_name = list(map(lambda x: x['cname'], herolist_json))  # extract the hero names
hero_number = list(map(lambda x: x['ename'], herolist_json))  # extract the hero numbers


# Download the images
def downloadPic():
    i = 0
    for j in hero_number:
        # Create a folder for this hero
        os.mkdir("C:\\Users\\Administrator\\Desktop\\wzry\\" + hero_name[i])
        # Change into the new folder
        os.chdir("C:\\Users\\Administrator\\Desktop\\wzry\\" + hero_name[i])
        i += 1
        for k in range(1, 11):  # skin numbers start at 1
            # Build the image URL
            onehero_link = 'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/' + str(j) + '/' + str(
                j) + '-bigskin-' + str(k) + '.jpg'
            im = requests.get(onehero_link)  # request the URL
            if im.status_code == 200:
                with open(str(k) + '.jpg', 'wb') as f:  # write the image to a file
                    f.write(im.content)


downloadPic()

   Without the comments, about 20 lines of code crawl every skin of every Honor of Kings hero. Simple, isn't it? To test the program, first create a folder named wzry on your desktop, because the path is hard-coded in my code; if you want to save the images somewhere else, change the path in the code. Once the folder exists, run the program, wait a little while, and all the images will be downloaded.
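
One way to avoid the hard-coded path, and the need to create the folder by hand, is to build the target path with pathlib and let it create missing folders. This is a sketch, not the approach used in the program above; save_skin and base_dir are hypothetical names, and a temporary directory stands in for the desktop folder so the example leaves no files behind.

```python
import tempfile
from pathlib import Path


def save_skin(base_dir, hero_name, skin_index, content):
    """Write one image to base_dir/hero_name/<skin_index>.jpg and return its path."""
    hero_dir = Path(base_dir) / hero_name
    # Creates base_dir and the hero folder if needed; no error if they exist.
    hero_dir.mkdir(parents=True, exist_ok=True)
    target = hero_dir / "{}.jpg".format(skin_index)
    target.write_bytes(content)
    return target


with tempfile.TemporaryDirectory() as tmp:
    saved = save_skin(tmp, "小乔", 1, b"placeholder bytes")
    print(saved.name, saved.parent.name == "小乔")
```

Because each file is written through its full path, the working directory never has to change, so os.chdir is not needed.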

   To parse the JSON data, we can also use the jsonpath module, which makes it quicker to get at the information we want. The parsing looks like this:

import jsonpath  # third-party module

hero_name = jsonpath.jsonpath(herolist_json, "$..cname")
hero_number = jsonpath.jsonpath(herolist_json, "$..ename")

   The function takes the parsed JSON data and a path expression. $..cname means: starting from the root, find the value of every cname key at any depth; the matching values are returned in a list.
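
To make the "any depth" behaviour concrete, here is a plain-Python sketch of the same idea. It is not how the jsonpath module is implemented; find_all is a hypothetical helper and the nested sample data is invented for illustration.

```python
def find_all(node, key):
    """Collect the values stored under key anywhere inside nested dicts and lists."""
    matches = []
    if isinstance(node, dict):
        for current_key, value in node.items():
            if current_key == key:
                matches.append(value)
            # Keep descending: the key may appear again further down.
            matches.extend(find_all(value, key))
    elif isinstance(node, list):
        for item in node:
            matches.extend(find_all(item, key))
    return matches


# Invented sample: one record at the top level, one nested a level deeper.
sample = [
    {"ename": 106, "cname": "小乔"},
    {"group": {"ename": 111, "cname": "孙尚香"}},
]
print(find_all(sample, "ename"))
```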

Conclusion

   Web crawlers are fun: the results are intuitive and visually striking, and writing one is rewarding. Powerful as crawlers are, though, do not crawl private information indiscriminately.

   Finally, if you have suggestions for improving the program in this article, feel free to leave a message in the comments section.

Origin www.cnblogs.com/BigBears/p/11966362.html