Keras Deep Learning Application 1: Face Recognition Based on Convolutional Neural Network (CNN) (Part 2)


The previous post, Keras Deep Learning Application 1: Face Recognition Based on Convolutional Neural Network (CNN) (Part 1), covered the theoretical basis of CNNs in detail. This post walks through the code for implementing face recognition with Keras.

Code download

GitHub source code download address:
https://github.com/Kyrie-leon/CNN-FaceRec-keras

3. Face recognition dataset

The MegaFace face database is a million-scale face database released and maintained by the University of Washington, and is currently one of the most authoritative face recognition benchmarks in the world. The database is large and its images are highly varied; some subjects span a wide age range, which increases the difficulty of recognition. Using the full database requires submitting an application first; the dataset can be downloaded after obtaining a download account and password. This post selects a subset of MegaFace faces as the experimental dataset.
MegaFace face dataset official website
Download link for the face training set images: https://pan.baidu.com/s/1aCjxQnKrnGY6hG-Mifdm2g (extraction code: w5wu)
Download link for the face test set images: https://pan.baidu.com/s/1rQEUk0toOSDYoG-frK4wBg (extraction code: 73uy)

3.1 Dataset division

This experiment selects 6127 images of 40 people from the MegaFace face database as the experimental dataset: 4742 images in the training set, 1185 images in the validation set, and 200 images in the test set.

import os

Name_label = []          # list of person-name labels
path = './data/face/'    # dataset root directory, one subdirectory per person
dir = os.listdir(path)   # list all person subdirectories
label = 0                # numeric label counter

# Write one "filename;label" line per image to the training index file
with open('./data/train.txt', 'w') as f:
    for name in dir:
        Name_label.append(name)
        print(Name_label[label])
        after_generate = os.listdir(os.path.join(path, name))
        for image in after_generate:
            if image.endswith(".png"):
                f.write(image + ";" + str(label) + "\n")
        label += 1

The code above writes the experimental dataset into train.txt as one `image_name;label` line per image (for example, `xxx_001.png;0`). Reading batches through this index file, rather than loading all images at once, reduces memory usage and improves training speed.

import numpy as np

# Open the dataset index file
with open(r"./data/train.txt", "r") as f:
    lines = f.readlines()

# Shuffle the dataset with a fixed seed for reproducibility
np.random.seed(10101)
np.random.shuffle(lines)
np.random.seed(None)

# 80% of the samples are used for training, 20% for validation
num_val = int(len(lines) * 0.2)
num_train = len(lines) - num_val

The code above shuffles and splits the dataset. Fixing the random seed makes the split reproducible and prevents images of the same person from being overly concentrated in one split, which would affect the reliability of the experiment.

3.2 Data augmentation

Blurring, rotating, and adding noise to the original training images creates a variety of complex "environments" and expands the amount of data beyond the original images, which improves recognition accuracy. Simulating poor imaging conditions and interference also makes the convolutional neural network more adaptable to face recognition under different harsh conditions. The code is as follows:

import numpy as np
from PIL import Image
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb

def rand(a=0, b=1):
    # Uniform random float in [a, b)
    return np.random.rand()*(b-a) + a

def get_random_data(image, input_shape, random=True, jitter=.1, hue=.1, sat=1.2, val=1.2, proc_img=True):
    h, w = input_shape

    new_ar = w/h * rand(1-jitter,1+jitter)/rand(1-jitter,1+jitter)
    scale = rand(.7, 1.3)
    if new_ar < 1:
        nh = int(scale*h)
        nw = int(nh*new_ar)
    else:
        nw = int(scale*w)
        nh = int(nw/new_ar)
    image = image.resize((nw,nh), Image.BICUBIC)


    # place image
    dx = int(rand(0, w-nw))
    dy = int(rand(0, h-nh))
    new_image = Image.new('RGB', (w,h), (0,0,0))
    new_image.paste(image, (dx, dy))
    image = new_image



    # flip image or not
    flip = rand()<.5
    if flip:
        image = image.transpose(Image.FLIP_LEFT_RIGHT)

    # distort image
    hue = rand(-hue, hue)
    sat = rand(1, sat) if rand()<.5 else 1/rand(1, sat)
    val = rand(1, val) if rand()<.5 else 1/rand(1, val)
    x = rgb_to_hsv(np.array(image)/255.)
    x[..., 0] += hue
    x[..., 0][x[..., 0]>1] -= 1
    x[..., 0][x[..., 0]<0] += 1
    x[..., 1] *= sat
    x[..., 2] *= val
    x[x>1] = 1
    x[x<0] = 0
    image_data = hsv_to_rgb(x)*255 # numpy array, values in 0-255

    return image_data
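To preview the augmentation on a single image, a snippet like the following can be used (the sample file name here is a hypothetical placeholder):

img = Image.open('./data/face/sample.png').convert('RGB')  # hypothetical sample image
aug = get_random_data(img, (200, 200))                     # float array, values in 0-255
Image.fromarray(np.uint8(aug)).save('aug_sample.png')      # save for visual inspection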

Example results:
[figure: face images after random augmentation]

4. Building the CNN network model

4.1 Build a CNN model

The convolutional neural network model built in this article is shown in the figure below. The model has 10 layers in total: 1 input layer, 3 convolutional layers, 3 max pooling layers, 1 Flatten layer, 1 fully connected layer, and 1 Softmax layer.
[figure: structure of the CNN model]

First, the input layer takes a 200×200×3 image matrix. The first convolutional layer uses 3×3 convolution kernels with the default sliding stride of 1 pixel; after this convolution, the data becomes a 198×198×64 matrix. A max pooling layer of size 2×2 with the default stride of 2 then reduces the features to a 99×99×64 matrix.
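These shapes follow the standard output-size formula for "valid" convolution and pooling, output = (input - kernel) / stride + 1. The small helper below (illustrative only, not part of the original code) verifies the numbers:

def out_size(size, kernel, stride=1):
    # Output size of a "valid" convolution or pooling window
    return (size - kernel) // stride + 1

print(out_size(200, 3))     # 198: first convolution
print(out_size(198, 2, 2))  # 99:  first max pooling
print(out_size(99, 3))      # 97:  second convolution
print(out_size(97, 2, 2))   # 48:  second max pooling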

Next, the 99×99×64 output of the first max pooling layer becomes the input of the second convolutional layer. After convolution with the second layer's 3×3 kernels at stride 1, a 97×97×32 matrix is obtained, which a 2×2 max pooling layer with stride 2 reduces to a 48×48×32 feature matrix. This feature matrix is then passed to a Dropout layer, which suppresses overfitting of the model; its drop probability is set to 25%.

In the last convolutional and max pooling layers, the convolution and pooling process is similar to the previous two blocks. The 48×48×32 feature matrix is convolved with 3×3 kernels to produce a 46×46×32 matrix, which is passed into a 2×2 max pooling layer; the features are finally reduced to a 23×23×32 feature matrix, which again passes through a Dropout layer with probability 25%.

After the convolution and pooling stages above, the output is still a multi-dimensional tensor, but the model needs to classify the result, so it must be reduced to one dimension. The Flatten layer is used here: it flattens the multi-dimensional input into one dimension and provides the transition from the convolutional layers to the fully connected layer. In this step, the 23×23×32 feature matrix is "flattened" into a one-dimensional vector of 23×23×32 = 16,928 feature values, which are passed to the fully connected layer.

The fully connected layer designed in this paper contains 128 neurons; the 16,928 feature values are fully connected to these 128 neurons. After ReLU activation, 128 feature values are produced, which then pass through a Dropout layer with probability 50% before becoming the input of the Softmax layer.

Finally, the 128 feature values of the fully connected layer are fully connected to the classification labels, and face classification and recognition are performed by the Softmax layer. The implementation code is as follows:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

def MmNet(input_shape, output_shape):
    model = Sequential()  # build the model
    # First block: 64 3x3 filters + 2x2 max pooling
    model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    # Second block: 32 3x3 filters + 2x2 max pooling + dropout
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    # Third block: 32 3x3 filters + 2x2 max pooling + dropout
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    # Flatten, then fully connected layer
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    # Softmax output layer
    model.add(Dense(output_shape, activation='softmax'))
    print("----------- Model summary ----------\n")  # view the model summary
    model.summary()

    return model
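For the 200×200×3 inputs and 40 identities used in this experiment, the model would be instantiated as follows (a sketch; the variable name is arbitrary):

model = MmNet(input_shape=(200, 200, 3), output_shape=40)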

4.2 Model training

Before training starts, the face images in each subdirectory are traversed, normalized to 200×200 pixels, converted into matrices, and read into memory. The dataset is divided into a training set and a validation set, and the classification labels for face recognition are attached to each.

Next, the built neural network model is trained. This article selects the Adam method as the optimizer with a learning rate of 1e-4 (as set in the code below), and training is accelerated on a GPU. The processed data is fed into the network according to the set labels and trained iteratively; in each iteration the training result is evaluated, and the trained model is saved locally so that training can resume from a checkpoint if interrupted. The implementation code is as follows:

from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping, TensorBoard
from keras.optimizers import Adam

# 3. Train the model
def train(model, batch_size):
    # Save a checkpoint every 3 epochs
    checkpoint_period1 = ModelCheckpoint(
        log_dir + 'ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5',
        monitor='acc',
        save_weights_only=False,
        period=3
    )
    # Reduce the learning rate when val_acc has not improved for 3 epochs
    reduce_lr = ReduceLROnPlateau(
        monitor='val_acc',
        patience=3,
        verbose=1
    )
    # Stop training when val_loss has stopped improving: the model is basically trained
    early_stopping = EarlyStopping(
        monitor='val_loss',
        min_delta=0,
        patience=10,
        verbose=1
    )
    # Categorical cross-entropy loss with the Adam optimizer
    model.compile(loss='categorical_crossentropy',
                  optimizer=Adam(lr=1e-4),
                  metrics=['accuracy'])
    # TensorBoard visualization
    tb_Tensorboard = TensorBoard(log_dir="./model", histogram_freq=1, write_grads=True)

    # Start training
    history = model.fit_generator(generate_arrays_from_file(lines[:num_train], batch_size, True),
            steps_per_epoch=max(1, num_train//batch_size),
            validation_data=generate_arrays_from_file(lines[num_train:], batch_size, False),
            validation_steps=max(1, num_val//batch_size),
            verbose=1,
            epochs=10,
            initial_epoch=0,
            callbacks=[early_stopping, checkpoint_period1, reduce_lr, tb_Tensorboard])
    return history, model
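Note that train() relies on a batch generator, generate_arrays_from_file(lines, batch_size, train_flag), which is not listed in this post. A minimal sketch of such a generator is shown below, assuming 40 classes, the 200×200 input size, and a placeholder image directory that should be adjusted to the actual dataset layout:

import numpy as np
from PIL import Image
from keras.utils import to_categorical

def generate_arrays_from_file(lines, batch_size, train, num_classes=40,
                              image_dir='./data/face_images/'):
    # Endlessly yield (images, one-hot labels) batches, as fit_generator expects.
    # image_dir is a placeholder path, not from the original post.
    i = 0
    n = len(lines)
    while True:
        X, y = [], []
        for _ in range(batch_size):
            name, label = lines[i].strip().split(';')
            img = Image.open(image_dir + name).convert('RGB')
            if train:
                # apply the random augmentation from section 3.2
                arr = get_random_data(img, (200, 200))
            else:
                arr = np.array(img.resize((200, 200), Image.BICUBIC), dtype=np.float64)
            X.append(arr / 255.0)
            y.append(int(label))
            i = (i + 1) % n
        yield np.array(X), to_categorical(y, num_classes=num_classes)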

4.3 Load the face to be recognized

The code in this section loads the trained network model, loads the face image to be recognized, performs recognition, prints the result, and displays the face photo in a window whose title is the image's path name. The specific implementation is as follows:

import os
import glob
import h5py
import keras
import numpy as np
from tkinter import *
from tkinter import ttk
from PIL import Image,ImageTk
from keras.models import load_model
from keras.preprocessing.image import load_img, img_to_array
from keras.applications.imagenet_utils import preprocess_input
from keras.models import Model

from Name import *

img_rows = 300 # height
img_cols = 300 # width

def letterbox_image(image, size):
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)

    image = image.resize((nw,nh), Image.BICUBIC)
    new_image = Image.new('RGB', size, (0,0,0))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))

    return new_image
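# Illustrative example (not from the original post): letterboxing a 400x300
# image into a (300, 300) canvas scales it to 300x225 and pastes it centered
# on a black background:
#   boxed = letterbox_image(Image.open('example.png'), (300, 300))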
def test_accuracy(lines, model):
    t = 0
    n = len(lines)
    # Iterate over the test set
    for i in range(n):
        name = lines[i].split(';')[0]                       # face image name, e.g. xxx_xxx.png
        label_name = (lines[i].split(';')[1]).strip('\n')   # numeric face label, 0-39
        label_name = int(label_name)
        file_name = str(Name.get(label_name))               # corresponding person name
        # Read the image from file
        img = Image.open('./data/test/' + name)
        img = np.array(letterbox_image(img, [img_rows, img_cols]), dtype=np.float64)
        img = preprocess_input(np.array(img).reshape(-1, img_cols, img_rows, 3))
        pre_name = model.predict_classes(img)  # predicted label value
        print(int(pre_name), label_name)
        if int(pre_name) == label_name:
            t += 1
    print(t / n)  # test set accuracy
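# Example call (illustrative), assuming test.txt uses the same "name;label"
# format as train.txt:
#   with open('./data/test.txt', 'r') as f:
#       lines = f.readlines()
#   test_accuracy(lines, model)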

def main():
    # Load the trained model from its h5 weights file
    model = load_model('./logs/easy1.h5')

    with open('./data/test.txt', 'r') as f:
        lines = f.readlines()
    for img_location in glob.glob('./data/test/*.png'):  # restrict the test image path and format
        img = load_img(img_location)
        img = img_to_array(img)
        # Image preprocessing
        img = preprocess_input(np.array(img).reshape(-1, img_cols, img_rows, 3))
        img_name = os.path.splitext(os.path.basename(img_location))[0]  # file name without extension
        pre_name = model.predict_classes(img)  # predicted label value
        print(pre_name)

        pre = model.predict(img)

        for i in pre_name:
            for j in pre:
                name = Name.get(i)
                acc = np.max(j) * 100  # confidence of the predicted class
                print("\nPicture name is [%s]\nPredicted as [%s] with [%f%c]\n" % (img_name, name, acc, '%'))
                # Display the face photo in a window titled with the image name
                MainFrame = Tk()
                MainFrame.title(img_name)
                MainFrame.geometry('300x300')
                img = Image.open(img_location)
                img = ImageTk.PhotoImage(img)
                label_img = ttk.Label(MainFrame, image=img)
                label_img.pack()
                MainFrame.mainloop()
 
if __name__ == '__main__':
    main()


Origin: blog.csdn.net/qq_40076022/article/details/109350604