Use Python to save thousands of emoji images with one click and win every emoji battle in your friend circle within minutes

These days, young people can hardly call themselves young if they chat without a few emojis. Emojis have become an indispensable part of everyday conversation.

Toss a few emojis at someone you just met and you are friends within minutes. A sulking girlfriend cheers up after a couple of emojis. They also defuse awkward moments, and when I don't have time to type, I just send a couple of emojis.

Life is short, I use Python.

1. Preparation

Preparation matters. First we need to know what we are going to do, which tools to use, and how to do it; then we work through it step by step and keep things steady.

Development environment configuration

Python 3.6
PyCharm

Open your browser and search for the name of the software you want to install

Python

Use the official website. If a search result has an ad label under its name, don't click it. Trust the label: it really is an advertisement.

Just click the Python 3.10.2 link to download the latest version directly; there is no need to click Download.

PyCharm


Just click Download.
Either the Professional Edition or the Community Edition is fine.

The installation steps are too long to cover one by one here; you can scan the code at the bottom of the article for a video walkthrough.

Module installation configuration

requests
parsel
re (built into Python, no installation needed)

Press Win+R, type cmd, and press Enter. In the command prompt, type pip install followed by the name of the module to install (for example, pip install requests) and press Enter.
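If you are not sure whether the installation worked, a quick check like the one below can tell you. This is only a minimal sketch: the helper name missing_modules is made up for illustration, and it only checks that the modules can be found, not that they are the right version.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in this Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two third-party modules this article relies on
required = ["requests", "parsel"]
missing = missing_modules(required)

if missing:
    print("Install with: pip install " + " ".join(missing))
else:
    print("All required modules are installed.")
```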

2. Code

Goal: the fabiaoqing site. The address is abbreviated here, so complete its front and back parts yourself, including in the code further down; after that there should be no problem.

import module

import requests 
import parsel 
import re
import time

request url

url = f'fabiaoqing/biaoqing/lists/page/{page}.html'
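The line above uses a page variable, which comes from a loop over page numbers. Here is a small sketch of how the page addresses could be generated. The source abbreviates the site address, so BASE_URL below is a hypothetical placeholder that you need to replace, and build_page_urls is a helper name invented for this example.

```python
# Hypothetical placeholder: the article abbreviates the real site address,
# so replace this with the completed address before crawling.
BASE_URL = "https://example.com"


def build_page_urls(base_url, first_page, last_page):
    """Build the listing-page URLs for pages first_page..last_page (inclusive)."""
    base = base_url.rstrip("/")
    return [
        f"{base}/biaoqing/lists/page/{page}.html"
        for page in range(first_page, last_page + 1)
    ]


for url in build_page_urls(BASE_URL, 1, 3):
    print(url)
```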

request header

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
}

Fetch the page source

response = requests.get(url=url, headers=headers)

Parse the data

selector = parsel.Selector(response.text)  # convert response.text into a Selector object

First, extract all the div tags that wrap the emoji images

divs = selector.css('#container div.tagbqppdiv')  # select elements with a CSS selector

For each div in divs (inside a for loop), extract the image URL from the tag

img_url = div.css('img::attr(data-original)').get()

extract title

title = div.css('img::attr(title)').get()

Get the file extension of the image

name = img_url.split('.')[-1]
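Splitting on '.' works for plain image addresses, but it picks up the wrong piece if the URL carries a query string such as ?v=1.2. If you run into that, something like the following sketch may be more robust. The helper name get_extension and the default of "jpg" are assumptions made for this example, not part of the original script.

```python
import os
from urllib.parse import urlparse


def get_extension(img_url, default="jpg"):
    """Return the lowercase file extension of an image URL, without the dot."""
    path = urlparse(img_url).path          # drops the query string and fragment
    extension = os.path.splitext(path)[1]  # for example ".gif", or "" if none
    return extension.lstrip(".").lower() or default


print(get_extension("https://example.com/img/cat.GIF?v=1.2"))
print(get_extension("https://example.com/img/no_extension"))
```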

Clean up the title

new_title = change_title(title)

Send a request to the emoji image to get its binary data

img_content = requests.get(url=img_url, headers=headers).content

save data

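def save(title, img_url, name):
    # get_response is defined in the complete script further down
    img_content = get_response(img_url).content
    try:
        # The img folder must already exist, otherwise open() fails
        with open('img\\' + title + '.' + name, mode='wb') as f:
            # Write the image's binary data
            f.write(img_content)
            print('Saving:', title)
    except OSError as error:
        # Report the failure instead of silently ignoring it
        print('Failed to save:', title, error)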
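One thing to watch out for: the save function writes into an img folder, and if that folder does not exist the write fails. A portable way to handle this is sketched below with pathlib. The helper name save_bytes is invented for this example, and the demo writes into a temporary folder so that it does not touch your real files.

```python
import tempfile
from pathlib import Path


def save_bytes(folder, title, extension, data):
    """Write binary data to folder/title.extension, creating the folder if needed."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    target = target_dir / f"{title}.{extension}"
    target.write_bytes(data)
    return target


# Demo in a temporary folder, with a few fake bytes instead of a real image
with tempfile.TemporaryDirectory() as tmp:
    saved = save_bytes(Path(tmp) / "img", "smile", "jpg", b"\xff\xd8\xff")
    saved_name = saved.name
    saved_data = saved.read_bytes()

print(saved_name, len(saved_data))
```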

Replace special characters in title

The titles come from the page, so we don't know them in advance, and they may contain characters that Windows does not allow in file names. We use a regular expression to replace those characters with underscores.

def change_title(title):
    mode = re.compile(r'[\\\/\:\*\?\"\<\>\|]')
    new_title = re.sub(mode, "_", title)
    return new_title
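Here is a quick look at what that replacement does to a few titles. The function is the same as above; the sample titles are made up for illustration.

```python
import re


def change_title(title):
    """Replace characters that are not allowed in Windows file names with '_'."""
    mode = re.compile(r'[\\\/\:\*\?\"\<\>\|]')
    new_title = re.sub(mode, "_", title)
    return new_title


# Invented sample titles
for sample in ['what?', 'a/b:c', 'plain title']:
    print(repr(sample), '->', repr(change_title(sample)))
```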

Record the elapsed time

# time_1 = time.time() must be recorded before the crawl starts
time_2 = time.time()

use_time = int(time_2) - int(time_1)
print(f'Total time: {use_time} seconds')
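If you want a finer reading than whole seconds, time.perf_counter is a common alternative for measuring elapsed time. This is just a sketch, and the sum below is a stand-in workload rather than the real crawl.

```python
import time

start = time.perf_counter()

# Stand-in workload; in the real script this would be the crawl
total = sum(i * i for i in range(100_000))

elapsed = time.perf_counter() - start
print(f'Total time: {elapsed:.3f} seconds')
```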

Folks, that was the single-threaded version. The multithreaded version follows, and I'll go straight to the code.

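import os
import re
import time
import concurrent.futures

import requests
import parsel


def change_title(title):
    """Replace characters that are not allowed in Windows file names."""
    mode = re.compile(r'[\\\/\:\*\?\"\<\>\|]')
    new_title = re.sub(mode, "_", title)
    return new_title


def get_response(html_url):
    """Send a GET request with a browser-like User-Agent."""
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
    }
    response = requests.get(url=html_url, headers=headers)
    return response


def save(title, img_url, name):
    """Download one image and write it into the img folder."""
    img_content = get_response(img_url).content
    try:
        with open(os.path.join('img', title + '.' + name), mode='wb') as f:
            # Write the image's binary data
            f.write(img_content)
            print('Saving:', title)
    except OSError as error:
        # Report the failure instead of silently ignoring it
        print('Failed to save:', title, error)


def main(html_url):
    """Parse one listing page and save every emoji image on it."""
    html_data = get_response(html_url).text
    selector = parsel.Selector(html_data)
    divs = selector.css('#container div.tagbqppdiv')
    for div in divs:
        img_url = div.css('img::attr(data-original)').get()
        title = div.css('img::attr(title)').get()
        if not img_url or not title:
            continue  # skip entries without an image URL or a title
        name = img_url.split('.')[-1]
        new_title = change_title(title)
        save(new_title, img_url, name)


if __name__ == '__main__':
    os.makedirs('img', exist_ok=True)  # make sure the output folder exists
    time_1 = time.time()
    exe = concurrent.futures.ThreadPoolExecutor(max_workers=10)
    for page in range(1, 201):
        # The site address is abbreviated here; complete it before running
        url = f'fabiaoqing/biaoqing/lists/page/{page}.html'
        exe.submit(main, url)
    exe.shutdown()  # wait for all submitted pages to finish
    time_2 = time.time()
    use_time = int(time_2) - int(time_1)
    print(f'Total time: {use_time} seconds')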
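One caveat with the script above: it submits each page to the pool but never looks at the returned futures, so an exception raised inside main goes unnoticed. A possible way to surface failures is sketched below using as_completed. The process_page function is a stand-in that fakes the work (and fails on page 3 on purpose), so the example runs without any network access; run_all is a helper name invented for this sketch.

```python
import concurrent.futures


def process_page(page):
    """Stand-in for main(url): pretend to process a page and return an item count."""
    if page == 3:
        raise ValueError(f"page {page} failed")  # simulated failure
    return page * 10


def run_all(pages, max_workers=10):
    """Run process_page for every page in a thread pool and collect the outcomes."""
    results, errors = {}, {}
    # The with block waits for all tasks to finish before it exits
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as exe:
        future_to_page = {exe.submit(process_page, page): page for page in pages}
        for future in concurrent.futures.as_completed(future_to_page):
            page = future_to_page[future]
            try:
                results[page] = future.result()  # re-raises the worker's exception
            except Exception as error:
                errors[page] = str(error)
    return results, errors


results, errors = run_all(range(1, 6))
print(sorted(results.items()))
print(errors)
```

With this pattern, a failed page shows up in errors instead of vanishing, and the remaining pages still get processed.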

Folks, that is more than 1,000 images in 18 seconds. It finishes almost too quickly.

If you found this useful, please like and bookmark it. Love you all! As you can see, the code runs fast. Maybe a little too fast~


Origin blog.csdn.net/fei347795790/article/details/123386617