Facial Expression Recognition with a CNN

Author: Wei Zuchang

I. Background

On January 29, 2020, the Ministry of Education stated in an interview that prevention and control of the novel coronavirus pneumonia outbreak was of paramount importance. Education authorities at all levels, following the unified deployment of the Ministry of Education and the local party committees and governments, worked to resolutely prevent the epidemic from spreading in schools, and postponing the start of the school term was one of the important measures. At the same time, education departments did a great deal of work during the epidemic to keep teaching going under the principle of "classes suspended, learning continues." Online teaching thus emerged immediately.

However, as teaching moved online, teachers could no longer observe students' learning state as promptly as in a physical classroom, and students may not study as attentively as they would in class. Since teaching is now online, we can borrow some deep learning techniques to help teachers observe students' faces and address this problem. Starting from a CNN, this paper implements a deep model for facial expression recognition.

II. The dataset

The FER2013 facial expression dataset consists of 35,886 facial expression images: 28,708 training images (Training), plus 3,589 images each in the public validation set (PublicTest) and the private validation set (PrivateTest). Every image is a fixed-size 48 × 48 grayscale image. There are seven expressions in total, corresponding to labels 0-6; the label-to-expression mapping is: 0 angry; 1 disgust; 2 fear; 3 happy; 4 sad; 5 surprised; 6 neutral.
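The label-to-name mapping above can be kept as a small lookup table. A minimal sketch (the dictionary and helper names here are mine, not from the original code):

```python
# FER2013 label-to-emotion mapping, as listed above.
EMOTION_LABELS = {
    0: "angry",
    1: "disgust",
    2: "fear",
    3: "happy",
    4: "sad",
    5: "surprised",
    6: "neutral",
}

def label_name(label):
    """Return the English emotion name for a FER2013 label (0-6)."""
    return EMOTION_LABELS[int(label)]
```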

At the same time, we apply flipping, rotation, scaling, cropping, translation, and noise addition to the images. Readers may wonder at this point: don't we already have data? Why perform these operations? Let me explain why we do so many of them.

In fact, this is called data augmentation. Deep learning generally requires a sufficient number of samples: the more samples there are, the better the trained model performs and the stronger its generalization ability. In practice, however, when the number of samples is insufficient or their quality is not good enough, we need to augment the sample data to improve sample quality. The benefits of data augmentation can be summarized as follows:

  • Increase the amount of training data and improve the model's generalization ability
  • Add noise to the data and improve the model's robustness

Figure 1: data augmentation operations
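As a rough illustration of a few of these operations (flip, translation, additive noise), here is a minimal NumPy sketch. The function name and parameter values are my assumptions; in practice, rotation, zoom, and cropping would typically come from a library such as Keras' ImageDataGenerator:

```python
import numpy as np

def augment(img, rng):
    """Randomly flip, shift, and add noise to one normalized
    48x48 grayscale image (pixel values in [0, 1])."""
    out = img
    if rng.random() < 0.5:
        out = np.fliplr(out)                      # horizontal flip
    shift = int(rng.integers(-4, 5))              # translate by up to 4 px
    out = np.roll(out, shift, axis=1)
    out = out + rng.normal(0.0, 0.01, out.shape)  # additive Gaussian noise
    return np.clip(out, 0.0, 1.0)                 # keep values in [0, 1]
```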

III. Model structure

We first read the FER2013 data. Since it is a csv file containing the grayscale images, reading it differs somewhat from conventional training data. First, we read the csv file:

import numpy as np
import keras

def read_data_np(path):
    with open(path) as f:
        content = f.readlines()
    lines = np.array(content)
    num_of_instances = lines.size
    print("number of instances: ", num_of_instances)
    print("instance length: ", len(lines[1].split(",")[1].split(" ")))
    return lines, num_of_instances

After reading the file, we preprocess the data: reshape each grayscale image to 48 × 48, and split the full dataset into four parts: x_train, y_train, x_test, y_test:

def reshape_dataset(paths, num_classes):
    x_train, y_train, x_test, y_test = [], [], [], []

    lines, num_of_instances = read_data_np(paths)

    # ------------------------------
    # transfer train and test set data
    for i in range(1, num_of_instances):
        try:
            emotion, img, usage = lines[i].split(",")
    
            val = img.split(" ")

            pixels = np.array(val, 'float32')

            emotion = keras.utils.to_categorical(emotion, num_classes)

            if 'Training' in usage:
                y_train.append(emotion)
                x_train.append(pixels)
            elif 'PublicTest' in usage:
                y_test.append(emotion)
                x_test.append(pixels)
        except Exception:
            pass  # skip the header line and any malformed rows

    # ------------------------------
    # data transformation for train and test sets
    x_train = np.array(x_train, 'float32')
    y_train = np.array(y_train, 'float32')
    x_test = np.array(x_test, 'float32')
    y_test = np.array(y_test, 'float32')

    x_train /= 255  # normalize inputs between [0, 1]
    x_test /= 255

    x_train = x_train.reshape(x_train.shape[0], 48, 48, 1)
    x_train = x_train.astype('float32')
    x_test = x_test.reshape(x_test.shape[0], 48, 48, 1)
    x_test = x_test.astype('float32')

    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    y_train = y_train.reshape(y_train.shape[0], 7)
    y_train = y_train.astype('int16')
    y_test = y_test.reshape(y_test.shape[0], 7)
    y_test = y_test.astype('int16')

    print('--------x_train.shape:', x_train.shape)
    print('--------y_train.shape:', y_train.shape)


    print(len(x_train), 'train x size')
    print(len(y_train), 'train y size')
    print(len(x_test), 'test x size')
    print(len(y_test), 'test y size')

    return x_train, y_train, x_test, y_test

The model mainly uses three convolutional layers and two fully connected layers, and finally outputs the likelihood of each class through a softmax. The main model code is as follows:

from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, AveragePooling2D,
                          Flatten, Dense, Dropout)

def build_model(num_classes):
  # construct CNN structure
  model = Sequential()
  # 1st convolution layer
  model.add(Conv2D(64, (5, 5), activation='relu', input_shape=(48, 48, 1)))
  model.add(MaxPooling2D(pool_size=(5, 5), strides=(2, 2)))

  # 2nd convolution layer
  model.add(Conv2D(64, (3, 3), activation='relu'))
  # model.add(Conv2D(64, (3, 3), activation='relu'))
  model.add(AveragePooling2D(pool_size=(3, 3), strides=(2, 2)))

  # 3rd convolution layer
  model.add(Conv2D(128, (3, 3), activation='relu'))
  # model.add(Conv2D(128, (3, 3), activation='relu'))
  model.add(AveragePooling2D(pool_size=(3, 3), strides=(2, 2)))

  model.add(Flatten())

  # fully connected neural networks
  model.add(Dense(1024, activation='relu'))
  model.add(Dropout(0.2))
  model.add(Dense(1024, activation='relu'))
  model.add(Dropout(0.2))

  model.add(Dense(num_classes, activation='softmax'))

  return model

IV. Training results

In the experiment we set steps_per_epoch = 256 and epochs = 5. With data augmentation, the results are shown below:

Figure 2: training results with data augmentation

Without data augmentation, the results are shown below:

Figure 3: training results without data augmentation

We can clearly see that the run without augmentation indeed reached a higher accuracy. This is expected: augmentation improves the model's generalization ability, which inevitably lowers its accuracy on the training set. We can, however, look at a few extra images to observe the difference between the two models. The results are as follows (data augmentation on the left, no augmentation on the right):


Figure 4: comparison of the two models' predictions

From the figure we can see that on the "happy" image, the model trained with data augmentation gives a more accurate judgment than the one without; and when judging the Mona Lisa, the augmented model's output is more in line with the common perception of the Mona Lisa's mysterious expression.
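Comparing the two models on an arbitrary picture as above requires a small inference helper. A sketch (the helper name is mine; it assumes a trained model from build_model and a raw 0-255 grayscale image of size 48 × 48):

```python
import numpy as np

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprised", "neutral"]

def predict_emotion(model, gray48):
    """Run one 48x48 grayscale image (pixel values 0-255) through
    the trained model and return the predicted emotion name."""
    x = gray48.astype("float32").reshape(1, 48, 48, 1) / 255.0
    probs = model.predict(x)[0]       # softmax output, shape (7,)
    return EMOTIONS[int(np.argmax(probs))]
```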

V. Summary

In this paper we used only a simple CNN, so the accuracy is not very high. But we can leverage existing technology to assist online teaching and help teachers analyze students' attention in class. At the same time, if students know there is a pair of invisible eyes watching them, they may treat online learning with the same attitude as classroom instruction, improving the efficiency of online learning.

Project address: https://momodel.cn/workspace/5e6752dd8efa61a905fef694?type=app (opening in Google Chrome on a desktop computer is recommended)

References

  1. Deep learning: why do we need data augmentation?
  2. Reference code: https://github.com/naughtybabyfirst/facial

About us
Mo (website: https://momodel.cn) is an online AI modeling platform that supports Python and helps you quickly develop, train, and deploy models.

Mo is currently running introductory machine learning courses and paper-sharing activities; you are welcome to follow our official account for the latest information!



Source: blog.csdn.net/weixin_44015907/article/details/105069022