Implementing face recognition in Python (face_recognition)

1. Overview

1. Introduction

This project describes itself as the world's simplest face recognition library. You can use Python or command-line tools to extract, recognize, and manipulate faces.
The face recognition in this project is based on the deep learning model in dlib, a leading open-source C++ library. Tested on the Labeled Faces in the Wild dataset, it achieves an accuracy of up to 99.38%, although accuracy on children and Asian faces still needs improvement.
Labeled Faces in the Wild is a face dataset produced by the University of Massachusetts Amherst. The dataset contains more than 13,000 facial images collected from the Internet.

GitHub and documentation URLs:

https://github.com/ageitgey/face_recognition/blob/master/README_Simplified_Chinese.md
https://face-recognition.readthedocs.io/en/latest/face_recognition.html

2. Face recognition steps

1) Face detection

To recognize faces, you first need to find the locations of all the faces in an image or video frame and crop out the face regions.

Histogram of Oriented Gradients (HOG) features can be used to detect face locations. The image is first converted to grayscale, because color plays little part in locating a face, and then the gradient of every pixel in the image is computed.

By transforming the image into its HOG representation, we can extract the image's features and obtain the position of the face.
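
As a minimal sketch (the filename test.jpg is an assumption for illustration), HOG-based detection is what face_recognition.face_locations uses by default:

import face_recognition

# Load the image to scan (the filename is a placeholder)
image = face_recognition.load_image_file("test.jpg")

# Detect faces with the HOG detector; each location is (top, right, bottom, left)
face_locations = face_recognition.face_locations(image, model="hog")

for top, right, bottom, left in face_locations:
    # Crop the face region out of the image array
    face_image = image[top:bottom, left:right]
    print(f"Face found at top={top}, right={right}, bottom={bottom}, left={left}")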

2) Face alignment

A face in an image may be tilted, or captured in profile. To make faces easier to encode, they need to be aligned to the same standard pose.


The first step in face alignment is to estimate the face's feature points (landmarks). dlib provides dedicated functions and models that can locate 68 landmarks on a human face.


Once the landmarks are found, geometric transformations of the image (affine transformation, rotation, scaling) can move each landmark to a canonical position (eyes, mouth, etc. in the same place), aligning the face.
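
A short sketch of landmark estimation through face_recognition, which wraps dlib's 68-point model (test.jpg is again an assumed input):

import face_recognition

image = face_recognition.load_image_file("test.jpg")

# Each element is a dict mapping a facial feature to a list of (x, y) points
face_landmarks_list = face_recognition.face_landmarks(image)

for face_landmarks in face_landmarks_list:
    for feature, points in face_landmarks.items():
        print(f"{feature}: {len(points)} points")  # e.g. left_eye, nose_tip, chin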

3) Face encoding


Train a neural network to generate a 128-dimensional encoding from an input face image.
The general training procedure is: feed two different photos of the same person and a photo of a third person into the network, and iterate so that the encodings of the two photos of the same person become close together while the encodings of photos of different people move far apart. In other words, the intra-class distance shrinks and the inter-class distance grows. For the specific algorithm, see FaceNet [3].
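
A minimal sketch of generating such an encoding with face_recognition (the filename test.jpg is an assumption):

import face_recognition

image = face_recognition.load_image_file("test.jpg")

# Each detected face yields one 128-dimensional vector
encodings = face_recognition.face_encodings(image)

if encodings:
    print(encodings[0].shape)  # (128,)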

4) Identification

Everyone's face is entered into a face database in advance: each one is encoded into a 128-dimensional vector by the neural network described above and stored. At recognition time, the query face is encoded into a 128-dimensional vector and compared against the entries in the database.

There are many ways to compare. You can simply pick the face with the smallest Euclidean distance that falls within a threshold, or train a downstream SVM or k-NN classifier that directly outputs the person's identity.
For how to build the k-NN classifier, see the k-NN example (examples/face_recognition_knn.py) in the face_recognition repository.
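
A sketch of the simplest comparison method, nearest Euclidean distance under a threshold; the enrolled filenames below (alice.jpg, bob.jpg, query.jpg) are assumptions for illustration:

import numpy as np
import face_recognition

# Assumed toy database: one stored encoding per enrolled person
known_names = ["alice", "bob"]
known_encodings = [
    face_recognition.face_encodings(face_recognition.load_image_file("alice.jpg"))[0],
    face_recognition.face_encodings(face_recognition.load_image_file("bob.jpg"))[0],
]

# Encode the query face and compute its distance to every stored encoding
query = face_recognition.load_image_file("query.jpg")
query_encoding = face_recognition.face_encodings(query)[0]
distances = face_recognition.face_distance(known_encodings, query_encoding)

best = int(np.argmin(distances))
# 0.6 is the library's default tolerance; smaller values are stricter
name = known_names[best] if distances[best] <= 0.6 else "unknown"
print(name, distances[best])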

The complete code for implementing face recognition in Python can be found in Using OpenCV, Python and Deep Learning for Face Recognition.

2. Implementation in Python

1. Installation

1) Install dlib for Python 3.10 on Windows

Steins-Gate-Divergence-Meter-Clock-VisitorCounter/dlib-19.22.99-cp310-cp310-win_amd64.whl at main · longsongline/Steins-Gate-Divergence-Meter-Clock-VisitorCounter · GitHub

2) Install face_recognition library

pip3 install face_recognition
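
Note that face_recognition depends on dlib, so the wheel above must be installed first. A quick sanity check that the installation worked:

python3 -c "import face_recognition; print(face_recognition.__version__)"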

2. Code example

# coding=utf-8
import cv2
from PIL import Image, ImageDraw, ImageFont
import numpy as np
import face_recognition

# Load the known face image
known_image = face_recognition.load_image_file("know_img.jpg")

# Extract the encoding of the known face
known_face_encoding = face_recognition.face_encodings(known_image)[0]

# Initialize the camera
video_capture = cv2.VideoCapture(0)

def cv2AddChineseText(frame, name, position, fill):
    """Draw text (including Chinese characters) on an OpenCV frame via PIL."""
    font = ImageFont.truetype('simsun.ttc', 30)
    img_pil = Image.fromarray(frame)
    draw = ImageDraw.Draw(img_pil)
    draw.text(position, name, font=font, fill=fill)
    return np.array(img_pil)

while True:
    # Read a frame from the camera
    ret, frame = video_capture.read()
    if not ret:
        break

    # Convert the frame from BGR (OpenCV) to RGB (face_recognition)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Detect the faces in the frame
    face_locations = face_recognition.face_locations(rgb_frame)
    face_encodings = face_recognition.face_encodings(rgb_frame, face_locations)

    # Mark the position of each face in the frame
    for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):
        # Check whether the detected face matches the known face
        matches = face_recognition.compare_faces([known_face_encoding], face_encoding, tolerance=0.38)

        # If it matches, label the face as known
        name = "unknown"
        if True in matches:
            name = "known"

        # Draw the face location and name on the frame
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)
        # cv2.putText(frame, name, (left + 6, bottom - 6), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 255), 1)
        frame = cv2AddChineseText(frame, name, (left, top - 38), (0, 0, 255))

    # Show the frame
    cv2.imshow('Video', frame)

    # Press q to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera
video_capture.release()

# Close all windows
cv2.destroyAllWindows()

Origin blog.csdn.net/m0_68949064/article/details/130243639