XPath: a simple crawler example for downloading the original Onmyoji wallpapers
I. Introduction
Many people have played Onmyoji. Whatever else you think of the game, its original artwork is finely detailed. In my spare time I found that a few simple lines of code are enough to crawl the images. Isn't that nice?
II. Libraries used
import requests
from lxml import etree
from fake_useragent import UserAgent
import os
If you don't know how to install these libraries, take a look at an article I wrote earlier. It links to several domestic mirror sources you can download from.
III. Implementation
1. Analyze the web page
First open the official website (link: official website portal), then click "original wallpaper" under "audiovisual center".
After entering the original wallpaper page, pick a wallpaper and inspect it with the browser's developer tools.
I found that each resolution has its own link, and the image I inspected offers six resolutions. Is this the same for every image?
It turns out it is not!
As shown above, some pictures have only four resolutions, and a given resolution does not sit at the same position for every picture. So how can the link to the original image be extracted?
A: Use XPath to select the node by its text content
a = lists[i].xpath('./div/div/a[contains(text(), "1920x1080")]')[0]
This selects the a node whose text contains "1920x1080", wherever it sits among the resolution links.
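To see why matching on text works regardless of position, here is a minimal sketch run against a made-up HTML fragment. The markup, the example.com URLs, and the names SAMPLE_HTML, items, and hrefs are assumptions for illustration only; the real page structure may differ.

```python
from lxml import etree

# Hypothetical fragment: two wallpapers whose resolution links
# differ in number and in position.
SAMPLE_HTML = """
<div id="wallpapers">
  <div class="item"><div><div>
    <a href="https://example.com/1_1366x768.jpg">1366x768</a>
    <a href="https://example.com/1_1920x1080.jpg">1920x1080</a>
    <a href="https://example.com/1_2048x1536.jpg">2048x1536</a>
  </div></div></div>
  <div class="item"><div><div>
    <a href="https://example.com/2_1920x1080.jpg">1920x1080</a>
    <a href="https://example.com/2_640x960.jpg">640x960</a>
  </div></div></div>
</div>
"""

html = etree.HTML(SAMPLE_HTML)
items = html.xpath('//div[@id="wallpapers"]/div')

hrefs = []
for item in items:
    # Same relative expression as in the article: match <a> by its text.
    a = item.xpath('./div/div/a[contains(text(), "1920x1080")]')[0]
    hrefs.append(a.xpath('./@href')[0])

print(hrefs)
```

In the first item the matching link is second, in the second item it is first, yet the same expression finds both.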
Q: What is lists[i]?
A: It is the node of a single wallpaper. You will see where it comes from after reading the complete code.
2. Complete code
import requests
from lxml import etree
from fake_useragent import UserAgent
import os

path = 'D:/阴阳师'
if not os.path.exists(path):
    os.mkdir(path)

# Generate a random User-Agent request header
# (arguments kept from the original; newer fake_useragent releases may not accept them)
ua = UserAgent(verify_ssl=False, path='fake_useragent.json')
headers = {'User-Agent': ua.random}

url = 'https://yys.163.com/media/picture.html'  # page listing the original wallpapers
response = requests.get(url=url, headers=headers).text
html = etree.HTML(response)
lists = html.xpath('/html/body/div[2]/div[3]/div[1]/div[3]/div[2]/div')
num = 1
for i in range(len(lists)):
    # Select node a by its text content
    a = lists[i].xpath('./div/div/a[contains(text(), "1920x1080")]')[0]
    image_url = a.xpath('./@href')[0]  # link to the original wallpaper
    image_data = requests.get(url=image_url, headers=headers).content
    image_name = '{}.jpg'.format(num)  # name each image by its sequence number
    save_path = path + '/' + image_name  # where the image is saved
    with open(save_path, 'wb') as f:  # the with block closes the file on exit
        f.write(image_data)
    print(image_name, '=======================>download succeeded!!!')
    num += 1
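The `[0]` index in the loop raises an IndexError if a wallpaper has no "1920x1080" link. Whether that ever happens on the real page is not something this article confirms, so the following is only a sketch of one possible fallback: prefer 1920x1080, otherwise take the largest resolution on offer. The helper name pick_link and the sample file names are hypothetical.

```python
import re


def pick_link(links, preferred="1920x1080"):
    """Return the href labelled `preferred`, else the largest resolution found.

    `links` is a list of (label, href) pairs, for example the text and
    href of each <a> under one wallpaper. Returns None when no label
    looks like WIDTHxHEIGHT.
    """
    best = None  # (pixel_count, href) of the largest resolution seen so far
    for label, href in links:
        if preferred in label:
            return href
        match = re.search(r"(\d+)\s*[xX]\s*(\d+)", label)
        if match:
            pixels = int(match.group(1)) * int(match.group(2))
            if best is None or pixels > best[0]:
                best = (pixels, href)
    return best[1] if best else None


# Hypothetical wallpaper without a 1920x1080 link
links = [("640x960", "a.jpg"), ("1366x768", "b.jpg")]
print(pick_link(links))  # b.jpg
```

With lxml, the pairs could be collected as `[(a.text or '', a.get('href')) for a in lists[i].xpath('./div/div/a')]` before calling the helper.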
Running the script gives the following result:
IV. Compositing a video
Compositing the images into a video lets you browse the crawled artwork at your leisure, which is very pleasant.
The code is as follows:
import cv2

# Path where the output video is saved
video_dir = 'D:/yinyangshi/result.mp4'
# Frame rate: at 0.2 fps each image stays on screen for 5 seconds
fps = 0.2
# Frame size
img_size = (1920, 1080)

fourcc = cv2.VideoWriter_fourcc('M', 'P', '4', 'V')  # OpenCV 3.0 warns for mp4, but the file still plays
videoWriter = cv2.VideoWriter(video_dir, fourcc, fps, img_size)
for i in range(1, 397):
    img_path = 'D:/yinyangshi/tupian/' + '{}.jpg'.format(i)
    frame = cv2.imread(img_path)
    if frame is None:  # imread returns None when the file cannot be read
        continue
    frame = cv2.resize(frame, img_size)  # every frame must match the video size
    videoWriter.write(frame)  # append the frame to the video
    print(f'======== image {i} added to the video, in order ========')
videoWriter.release()  # release resources
Note: when compositing the video, neither the path the images are read from nor the path the video is written to may contain Chinese characters!
Bilibili link: https://www.bilibili.com/video/BV1Kp4y1W7yB
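The loop above hard-codes `range(1, 397)`, which fits the number of images in the author's folder. As a sketch of an alternative, the numbered files could be discovered from the folder and sorted numerically, so the loop adapts to however many images were downloaded. The helper name numbered_images is hypothetical, and it assumes files are named like 1.jpg, 2.jpg, and so on.

```python
import os
import re


def numbered_images(folder, ext=".jpg"):
    """Return paths of files named <number><ext> in `folder`, in numeric order."""
    pattern = re.compile(r"^(\d+)" + re.escape(ext) + r"$", re.IGNORECASE)
    found = []
    for name in os.listdir(folder):
        match = pattern.match(name)
        if match:
            found.append((int(match.group(1)), os.path.join(folder, name)))
    found.sort()  # numeric order, so 2.jpg comes before 10.jpg
    return [path for _, path in found]
```

The video loop could then iterate with `for img_path in numbered_images('D:/yinyangshi/tupian/'):` instead of building each path from a counter.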
Collection of original paintings by Onmyoji