Python collects photos, runs beauty-score detection, and ranks them~

foreword

Hello everyone, this is the Demon King~

Preparation

Before writing the code, you need to apply for API access on the developer platform.
If you don't know how to do that, you can private message me~

Development environment:

  • Python 3.8
  • Pycharm 2021.2
  • the face-detection API interface that will be used

Module use:

  • requests >>> pip install requests
  • tqdm >>> pip install tqdm
  • os
  • base64

How to install python third-party modules:

  1. Press win + R, type cmd and click OK, then enter the install command pip install <module name> (for example pip install requests) and press Enter
  2. Click Terminal inside PyCharm and enter the same install command
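
After installing, you can quickly confirm that the two third-party modules used in this article import correctly; this is just a minimal check and assumes nothing beyond the modules listed above:

import requests
import tqdm

print(requests.__version__, tqdm.__version__)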

How to configure the python interpreter in pycharm?

  1. Select File >>> Settings >>> Project >>> Python Interpreter
  2. Click the gear icon and select Add
  3. Add the path of your Python installation

How does pycharm install plugins?

  1. Select File >>> Settings >>> Plugins
  2. Click Marketplace and enter the name of the plugin you want to install, for example: type "translation" for the translation plugin or "Chinese" for the Chinese language pack
  3. Select the corresponding plugin and click Install
  4. After the installation succeeds, a prompt to restart PyCharm will pop up; click OK, and the plugin takes effect after the restart

Divided into two stages

The first stage: collect the photo data

The basic workflow of the crawler:

1. Data source analysis

Where can the streamer photos and their url addresses be obtained? >>> Packet-capture analysis with the browser's developer tools on the web page

2. Code implementation steps: send request >>> get data >>> parse data >>> save data

  1. Send the request, i.e. request the list page
  2. Get the data: the response data returned by the server
  3. Parse the data and extract the content we want: the streamer name and the cover image url address.
    json data can be processed directly;
    re regular-expression extraction (see the small sketch after this list);
    extracting content by tag node/attribute:
    xpath
    css selector
  4. Save the data: save the image content to a local folder
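
The list page in this article returns json, so it can be parsed directly, but for the re option mentioned above here is a tiny, purely illustrative sketch; the HTML snippet and the class name "cover" are made up for the example and are not from the actual site:

import re

html = '<img class="cover" src="https://example.com/a.jpg">'
# pull every src attribute out of matching img tags
img_urls = re.findall(r'<img class="cover" src="(.*?)"', html)
print(img_urls)  # ['https://example.com/a.jpg']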

The second stage: beauty-score detection

For the photos we saved, we run beauty-score detection and scoring.

code

For certain reasons the code shown here is incomplete (some request URLs are left blank); if you need the full version, you can private message me and I will send it to you~

# Import the data-request module
import requests  # pip install requests (an imported module that is not used shows up grey)
# Import the pretty-print (formatted output) module
import pprint
# Import the os file-handling module
import os
import base64
from tqdm import tqdm

# Send the request
# Determine the request url address
for page in range(1, 11):
    url = f''
    # headers: request headers, used to disguise the Python code so it is not identified as a crawler program...
    # headers is a dictionary
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'
    }
    # Send a request to the url address with the requests module
    response = requests.get(url=url, headers=headers)
    # 2. Get the data: get the data content returned by the server
    # <Response [200]> a response object; status code 200 means the request succeeded
    # Can I get the json data directly? *** The request parameters contain a callback;
    # to get json directly, the callback has to be removed
    # print(response.json())
    # pprint.pprint(response.json())
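    # A purely illustrative aside (not part of the original script): if the callback
    # parameter is kept in the request, the server replies with JSONP text such as
    # callback123({...}); instead of plain json, and the wrapper has to be stripped
    # before parsing, roughly like this:
    #   import json, re
    #   raw = response.text
    #   data = json.loads(re.search(r'\((.*)\)\s*;?\s*$', raw, re.S).group(1))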
    # 3. Parse the data, extract the content we want: the streamer name and the cover image url address
    # json extraction: use the key on the left of the colon to take the value on the right
    data_list = response.json()['data']['datas']
    for index in data_list:
        # pprint.pprint(index)
        name = index['nick']
        img_url = index['screenshot']
        # 4. Save the data: saving the image content also requires sending a request to get the data
        # response.text     gets the response body as text
        # response.json()   gets the response body as a json dictionary
        # response.content  gets the response body as binary data
        img_content = requests.get(url=img_url, headers=headers).content
        # 'img_1\\' folder path, name is the file name, '.jpg' is the suffix >>> full file name
        # mode='wb' writes in binary mode
        # as renames the file object to f
        filename = 'img_1\\'
        if not os.path.exists(filename):
            os.mkdir(filename)

        with open(filename + name + '.jpg', mode='wb') as f:
            f.write(img_content)  # write the data
            print('Saving:', name)


def get_beauty(img_base64):
    host = ''
    data = {
        'grant_type': 'client_credentials',
        'client_id': 'vXONiwhiVGlBaI2nRRIYLgz5',
        'client_secret': 'ouZMTMuCGLi7pbeg734ftNxn9h3qN7R4'
    }
    response = requests.get(url=host, params=data)
    token = response.json()['access_token']
    # print(token)
    '''
    Face detection and attribute analysis
    '''
    request_url = f''
    params = {
        "image": img_base64,  # the image has to be passed as base64
        "image_type": "BASE64",
        "face_field": "beauty"
    }
    headers = {'content-type': 'application/json'}
    response = requests.post(request_url, data=params, headers=headers)
    try:
        beauty = response.json()['result']['face_list'][0]['beauty']
        return beauty
    except Exception:
        return 'recognition failed'


# f = open('img\\DX丶软软.jpg', mode='rb')  # read the content of one image
# convert it to base64
# img_base64 = base64.b64encode(f.read())


# 1. Get all the images
lis = []
files = os.listdir('img_1\\')
print('Recognising faces and running beauty-score detection, please wait.....')
for file in tqdm(files):
    img_file = 'img_1\\' + file
    img_name = file.split('.')[0]
    # print(img_file)
    with open(img_file, mode='rb') as f:  # read the content of one image
        img_base64 = base64.b64encode(f.read())
    beauty = get_beauty(img_base64)
    if beauty != 'recognition failed':
        dit = {
            'streamer': img_name,
            'beauty': beauty,
        }
        lis.append(dit)  # append the dictionary to the list
    # print(f'{img_name} beauty score is {beauty}')


lis.sort(key=lambda x: x['beauty'], reverse=True)
num = 1
# beauty-score ranking of the top 10 photos
for index in lis:
    print(f'Beauty rank {num}: {index["streamer"]}, beauty score {index["beauty"]}')
    num += 1
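
If you want to keep the ranking instead of only printing it, here is a minimal sketch using the standard csv module; the file name beauty_rank.csv is my own choice and not something from the original article:

import csv

# write the sorted list of dictionaries to a csv file, one row per streamer
with open('beauty_rank.csv', mode='w', encoding='utf-8-sig', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['streamer', 'beauty'])
    writer.writeheader()
    writer.writerows(lis)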


video tutorial

Python crawler + artificial intelligence: collect Huya streamer cover images and run beauty-score detection

epilogue

Well, this article of mine ends here!

If you have more suggestions or questions, feel free to comment or private message me! Let's work hard together (ง •_•)ง

If you like it, follow the blogger, or like and comment on my article!!!

Origin blog.csdn.net/python56123/article/details/124041503