Python crawler tutorial for beginners (13): crawling high-quality ultra-HD wallpapers

Preface

The text and images in this article come from the Internet and are for learning and discussion only, with no commercial use. If you have any concerns, please contact us and we will address them.

Case-study tutorial videos on Python crawlers, data analysis, and website development are free to watch online:

https://space.bilibili.com/523606542

Previous articles in this series

Python crawler tutorial for beginners (1): crawling Douban movie rankings

Python crawler tutorial for beginners (2): crawling novels

Python crawler tutorial for beginners (3): crawling Lianjia second-hand housing data

Python crawler tutorial for beginners (4): crawling 51job.com job listings

Python crawler tutorial for beginners (5): crawling Bilibili video danmaku (bullet comments)

Python crawler tutorial for beginners (6): making word clouds

Python crawler tutorial for beginners (7): crawling Tencent Video danmaku

Python crawler tutorial for beginners (8): crawling forum articles and saving them as PDF

Python crawler tutorial for beginners (9): a multi-threaded crawler case study

Python crawler tutorial for beginners (10): crawling "The Other Shore" (Bi'an) 4K ultra-HD wallpapers

Python crawler tutorial for beginners (11): crawling the latest Honor of Kings skins

Python crawler tutorial for beginners (12): crawling the latest League of Legends skins

Basic development environment

  • Python 3.6
  • PyCharm

Modules used

import requests
import re
import os

Install Python and add it to the PATH environment variable, then use pip to install the required modules. Only requests needs installing; re and os are part of the standard library.

1. Define the requirements

The goal is to crawl the HD wallpapers listed on wallpaper.wispx.cn.

2. Web page data analysis

On a wallpaper's detail page, clicking the button to download the original image downloads the wallpaper automatically.

So obtaining the link behind that button is all that is needed to download the wallpaper image.

Back on the list page, the wallpapers load by infinite scroll (a "waterfall" layout): new data only appears as you scroll down. Open the browser's developer tools before scrolling, and the request that fetches the newly loaded data will show up as you scroll down the page.

Comparing this response with the page shows that it contains the addresses needed to download the wallpaper images.

Note that this request is a POST request, not a GET request.

The only form data that needs to be submitted is the page number of the page being requested.
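As a rough illustration of the page-number parameter, the sketch below prints the form body that would be submitted for the first few pages. It uses only the standard library so it runs without touching the network. `build_form_body` is a hypothetical helper name, and the sketch assumes `page` is the only field the endpoint needs, as observed in the developer tools.

```python
from urllib.parse import urlencode


def build_form_body(page):
    """Return the URL-encoded form body for one page of results.

    Hypothetical helper: assumes the endpoint only needs a `page` field.
    """
    return urlencode({'page': page})


# Form bodies for the first three pages
bodies = [build_form_body(page) for page in range(1, 4)]
print(bodies)
```

requests applies the same form encoding when a dict is passed as the `data` argument of `requests.post`.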

3. Code implementation

3.1 Get the image IDs

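if __name__ == '__main__':
    for page in range(1, 11):
        # Category page; "%E5%8A%A8%E6%BC%AB" is the URL-encoded form of "动漫" (anime)
        url = 'https://wallpaper.wispx.cn/cat/%E5%8A%A8%E6%BC%AB'
        headers = {
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
            'x-requested-with': 'XMLHttpRequest',
        }
        # The page number is submitted as POST form data
        data = {
            'page': page
        }
        # Pass data= so the page number is actually sent with the request
        response = requests.post(url=url, headers=headers, data=data)
        # Capture whatever sits between "detail" and "target=" for each wallpaper
        result = re.findall('detail(.*?)target=', response.text)
        for index in result:
            # Strip the backslashes and the trailing quote and space to leave the ID part
            image_id = index.replace('\\', '').replace('" ', '')
            page_url = f'https://wallpaper.wispx.cn/detail{image_id}'
            # Download this wallpaper; main() is defined in the next step
            main(page_url)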
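To make the string cleanup easier to follow, here is a small offline sketch of the same extraction. The `sample` text is an assumption about what the raw response looks like (HTML embedded in JSON, so quotes and slashes are backslash-escaped); the real response may differ. `extract_page_urls` is a hypothetical helper name.

```python
import re

# Assumed sample of the raw response text, not captured from the real site
sample = r'<a href=\"\/detail\/1206\" target=\"_blank\"><a href=\"\/detail\/1207\" target=\"_blank\">'


def extract_page_urls(text):
    """Turn each detail link found in the text into a full detail-page URL."""
    urls = []
    for match in re.findall(r'detail(.*?)target=', text):
        # Remove the escaping backslashes, then the trailing quote and space
        image_id = match.replace('\\', '').replace('" ', '')
        urls.append(f'https://wallpaper.wispx.cn/detail{image_id}')
    return urls


page_urls = extract_page_urls(sample)
print(page_urls)
```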

3.2 Get the wallpaper URL and save the image

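def main(page_url):
    # Fetch the detail page, then pull out the original-image link and the title
    html_data = get_response(page_url).text
    image_url = re.findall('<a class="mdui-ripple mdui-ripple-white" href="(.*?)">', html_data)[0]
    image_title = re.findall('<title>(.*?)</title>', html_data)[0].split(' - ')[0]
    # Download the image itself as raw bytes
    image_content = get_response(image_url).content
    # Create the output folder if it does not exist yet
    path = 'images'
    os.makedirs(path, exist_ok=True)
    # os.path.join keeps the save path portable across operating systems
    with open(os.path.join(path, image_title + '.jpg'), mode='wb') as f:
        f.write(image_content)
        print('Saving:', image_title)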
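One thing the code above does not handle is a page title containing characters that are not allowed in file names. The sketch below shows one possible way to guard against that; `safe_image_path` is a hypothetical helper, and the sample `<title>` is made up for illustration rather than taken from the site.

```python
import os
import re


def safe_image_path(title, folder='images'):
    """Build a save path from a page title (hypothetical helper)."""
    # Replace characters that Windows forbids in file names
    cleaned = re.sub(r'[\\/:*?"<>|]', '_', title).strip()
    return os.path.join(folder, cleaned + '.jpg')


# Made-up sample page source, using the same title pattern as main()
html_data = '<title>Sky: Light / Dark - Example Site</title>'
image_title = re.findall('<title>(.*?)</title>', html_data)[0].split(' - ')[0]
path = safe_image_path(image_title)
print(path)
```

In `main`, the result of such a helper could be passed to `open` in place of the path built directly from the title.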

Note:

The request headers must include a referer to get past the site's hotlink protection (anti-leech check); otherwise the image will not download.

def get_response(html_url):
    # The referer header is what gets the request past the hotlink protection
    header = {
        'referer': 'https://wallpaper.wispx.cn/detail/1206',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
    }
    resp = requests.get(url=html_url, headers=header)
    return resp
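The function above sends the same fixed referer for every download. If the site ever checks the referer against the specific wallpaper (an assumption, since the original does not say), a possible variation is to build the headers from the detail page being processed. `build_headers` is a hypothetical helper name.

```python
def build_headers(page_url):
    """Return request headers whose referer is the wallpaper's own detail page."""
    return {
        'referer': page_url,
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
    }


headers = build_headers('https://wallpaper.wispx.cn/detail/1206')
print(headers['referer'])
```

The returned dict can be passed as the `headers` argument of `requests.get`, in the same way as `header` in `get_response`.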

4. Results

When the script runs, each wallpaper is saved as a .jpg file in the images folder.


Origin blog.csdn.net/m0_48405781/article/details/113619996