Python Crawler from Beginner to Giving Up 08 | Python Crawler in Practice: Downloading All League of Legends Skins

This blog post is only my personal hobby record, published for readers' reference. If it infringes on anything, please let me know and I will delete it.
This article is entirely my own work, with no plagiarism of or borrowing from other articles. Stick to original work!!

Foreword

Hello there. This is an article in the series "Python Crawler from Beginner to Giving Up". I am SunriseCai.


This article describes how to use a crawler to download all the hero skins of League of Legends.

League of Legends hero gallery: https://lol.qq.com/data/info-heros.shtml

1. Approach

Take a look at the League of Legends website. It has the following page levels:

  • Home page (level-one page)
  • Skin page (level-two page)
  • Images (level-three page)

As the three levels above show, the pages are once again nested like Russian dolls!!!

  1. Request the home page (level one) to get the links to all heroes (level two)
  2. Request each hero link (level two) to get the image links (level three)
  3. Request each image link (level three) and save the image.
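
The three steps above can be sketched as a small skeleton. This is only an outline under assumptions: every name in it (crawl, fetch_text, fetch_bytes, parse_hero_links, parse_image_links, save_image) is hypothetical and comes neither from the site nor from any library. The parsing and fetching functions are passed in by the caller, because how to extract the links depends on the page analysis that follows.

```python
def crawl(home_url, fetch_text, fetch_bytes,
          parse_hero_links, parse_image_links, save_image):
    """Walk the three page levels: home page -> hero pages -> images.

    All callables are supplied by the caller (hypothetical helpers):
      fetch_text(url)         -> page body as str
      fetch_bytes(url)        -> raw image bytes
      parse_hero_links(text)  -> iterable of level-two URLs
      parse_image_links(text) -> iterable of level-three URLs
      save_image(url, data)   -> stores one image
    Returns the number of images saved.
    """
    saved = 0
    home_page = fetch_text(home_url)                    # level one
    for hero_url in parse_hero_links(home_page):        # links to level two
        hero_page = fetch_text(hero_url)
        for image_url in parse_image_links(hero_page):  # links to level three
            save_image(image_url, fetch_bytes(image_url))
            saved += 1
    return saved
```

Keeping the network access and the parsing outside the loop makes it possible to try the flow with fake pages first and plug in the real request code later.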

So, the next step is to implement the image download in code.

2. Requests + Page Analysis

As mentioned above, the home page requested in this article is https://lol.qq.com/data/info-heros.shtml .

2.1 Requesting the Home Page

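Open the site's home page in a browser and press F12 to enter developer mode. Looking at the page structure, we find that the links to the level-two pages sit right inside the <li> tags. Perfect!!! The next step is to request the page.
Code for requesting the home page: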

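import requests

url = 'https://lol.qq.com/data/info-heros.shtml'
# Send a browser-like User-Agent instead of the default one used by requests
headers = {
    'User-Agent': 'Mozilla/5.0'
}
# timeout keeps the script from hanging if the server does not respond
res = requests.get(url, headers=headers, timeout=10)
if res.status_code == 200:
    print(res.text)
else:
    print('Request failed with status code', res.status_code)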

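After running the code above, we find that the output does not contain the <li> content seen in the browser's developer tools. What is going on? The <li> content is very likely loaded asynchronously via XHR, so let's capture the traffic and take a look!!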
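
Before turning to the network panel, the observation above can be double-checked in code rather than by reading the printed HTML. The following is a minimal sketch using only the standard library; the names LiLinkCollector and links_inside_li are made up for this illustration. It collects the href of every <a> found inside an <li>, which is where the developer tools showed the level-two links.

```python
from html.parser import HTMLParser


class LiLinkCollector(HTMLParser):
    """Collect href values of <a> tags that appear inside <li> tags."""

    def __init__(self):
        super().__init__()
        self.li_depth = 0  # number of <li> tags currently open
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_depth += 1
        elif tag == "a" and self.li_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "li" and self.li_depth > 0:
            self.li_depth -= 1


def links_inside_li(html):
    """Return all link targets found inside <li> elements of the given HTML."""
    parser = LiLinkCollector()
    parser.feed(html)
    return parser.links
```

Calling links_inside_li(res.text) on the response from the code above gives a list to compare with what the browser shows. The static HTML may still contain unrelated <li> links such as navigation entries, so an empty list is not required for the conclusion; what matters is whether the hero links are among them.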

  • Requesting the home page again, we find a file named hero_list.js under XHR, which, as the name suggests, is the hero list

  • The URL of hero_list.js is: https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js

  • Clicking on it shows that this is exactly the content we need!!!
    Next, request hero_list.js

The request code is simple: just change the url in the code above to https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js .
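
As a rough sketch of that change, the request can be wrapped in a function and the response inspected before relying on any field names. Two things here are assumptions and not from the article: that the body of hero_list.js parses as JSON, and the helper names fetch_hero_list_text and summarize_payload, which are invented for illustration. The sketch deliberately does not assume any particular keys in the data.

```python
import json

HERO_LIST_URL = "https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js"


def fetch_hero_list_text(url=HERO_LIST_URL):
    """Download hero_list.js and return its body as text."""
    # Imported here so summarize_payload stays usable without the dependency.
    import requests

    res = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    res.raise_for_status()  # raise an exception on 4xx/5xx responses
    return res.text


def summarize_payload(text):
    """Describe the top-level shape of the payload.

    Assumption: the body is JSON. Returns None if it is not.
    For a JSON object, returns {key: type name of the value}.
    For anything else, returns the type name of the parsed value.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    return type(data).__name__
```

Running summarize_payload(fetch_hero_list_text()) shows which top-level keys exist and which of them hold lists, and that is a safer starting point than guessing the structure.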

To be continued...


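Admittedly, this article turned out passable. I suggest copying and pasting the code and running it yourself, to relive the charm of Nanpai Sanshu.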


Finally, a summary of this chapter:

  1. Introduced the crawling approach for the Tomb novel website
  2. Explained in detail how to use a crawler to download the entire online novel
  3. It is very detailed; if you have any questions, please leave a comment below.

sunrisecai

  • Thank you for reading patiently. Follow me so you don't get lost.
  • To make it easier for us beginners to learn from one another, you are welcome to join the QQ group: 648 696 280

The next article is titled "Python Crawler from Beginner to Giving Up 09 | Python Crawler in Practice: Downloading from an Image Site (to be determined)".

Origin blog.csdn.net/weixin_45081575/article/details/104085897