CNN in Practice: Facial Expression Recognition

Thanks to the original article: http://bjbsair.com/2020-03-27/tech-info/7032/
I. Background

On January 29, 2020, a Ministry of Education official said in an interview that preventing and controlling the novel coronavirus pneumonia was the top priority at the time. Education departments at all levels were following the unified deployment of the Ministry of Education and of local party committees and governments to resolutely prevent the epidemic from spreading in schools, and delaying the start of the school term was one of the key measures. At the same time, education departments were working to support schools so that "classes are suspended, but teaching and learning continue" during the epidemic. Online teaching followed immediately.

However, as online teaching rolls out, teachers cannot follow how students are learning as promptly as they can in a classroom, and students may not study as seriously as they would in class. Since the teaching is online, we can use deep learning to help teachers observe students' facial expressions and address this problem. Starting from a CNN, this article implements a deep learning model for facial expression recognition.

II. Dataset

The Fer2013 facial expression dataset consists of 35,886 facial expression images: 28,708 training images (Training), plus 3,589 each for public validation (PublicTest) and private validation (PrivateTest). Each image is a fixed-size 48 × 48 grayscale image. There are seven expressions, labeled 0 to 6: 0 angry; 1 disgust; 2 fear; 3 happy; 4 sad; 5 surprise; 6 neutral.
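
To make the label mapping concrete, here is a minimal sketch in Python. The names EMOTIONS, SPLIT_SIZES, and label_name are illustrative and are not part of the original code; the numbers are the ones quoted above.

```python
# Label index -> expression name, as listed in the text (names are illustrative).
EMOTIONS = {
    0: "angry",
    1: "disgust",
    2: "fear",
    3: "happy",
    4: "sad",
    5: "surprise",
    6: "neutral",
}

# Split sizes quoted in the text, keyed by the split name.
SPLIT_SIZES = {"Training": 28708, "PublicTest": 3589, "PrivateTest": 3589}


def label_name(index):
    """Return the expression name for a label index in 0..6."""
    return EMOTIONS[int(index)]


total = sum(SPLIT_SIZES.values())
print(total)            # 35886
print(label_name("3"))  # happy
```

The lookup accepts the label as a string because the CSV stores it as text.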

We also need to apply flipping, rotation, scaling, cropping, translation, and added noise to the images. Some readers may wonder at this point: we already have the data, so why do all this? Let me explain why these operations are needed.

This is called data augmentation. In deep learning, the number of samples generally needs to be large: the more samples there are, the better the trained model performs and the stronger its generalization ability. In practice, however, there are often too few samples, or their quality is not good enough, so the samples need to be augmented to make up for this. The benefits of data augmentation can be summarized as follows:

  • Increases the amount of training data, improving the model's generalization ability
  • Adds noisy data, improving the model's robustness
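
As a rough illustration of these operations, the sketch below applies a random horizontal flip, a random translation, and Gaussian noise to one 48 × 48 image using NumPy. It is not the augmentation code used in the article (which is not shown); the function name augment and the parameters max_shift and noise_std are assumptions chosen for the example, and rotation and scaling are left out to keep it short.

```python
import numpy as np


def augment(image, rng, max_shift=4, noise_std=0.02):
    """Randomly flip, shift, and add noise to one (48, 48) image in [0, 1]."""
    out = image.copy()
    # Horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, ::-1]
    # Random translation: zero-pad, then crop a shifted window of the original size.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    padded = np.pad(out, max_shift, mode="constant")
    h, w = image.shape
    top, left = max_shift + dy, max_shift + dx
    out = padded[top:top + h, left:left + w]
    # Additive Gaussian noise, clipped back to [0, 1].
    out = out + rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0.0, 1.0).astype("float32")


rng = np.random.default_rng(0)
image = rng.random((48, 48), dtype=np.float32)  # stand-in for a real face image
batch = np.stack([augment(image, rng) for _ in range(8)])
print(batch.shape)  # (8, 48, 48)
```

Each call produces a slightly different variant of the same image, which is how one labeled sample can stand in for several during training.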

Figure 1: data augmentation operations

III. Model structure

We first read the Fer2013 data. Because it is a CSV file that stores grayscale images as pixel values, reading it differs somewhat from reading conventional training data. First, we read the CSV file:

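import numpy as np

def read_data_np(path):
    # Read every line of the CSV file; line 0 is the header row.
    with open(path) as f:
        content = f.readlines()
    lines = np.array(content)
    num_of_instances = lines.size
    print("number of instances: ", num_of_instances)
    # Number of pixel values in the first data row (48 * 48 = 2304 expected).
    print("instance length: ", len(lines[1].split(",")[1].split(" ")))
    return lines, num_of_instances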

After reading the file, we preprocess the data: each row is reshaped into a 48 × 48 grayscale image, and the full dataset is split into four parts, x_train, y_train, x_test, and y_test:

import keras

def reshape_dataset(paths, num_classes):
    x_train, y_train, x_test, y_test = [], [], [], []

    lines, num_of_instances = read_data_np(paths)

    # ------------------------------
    # transfer train and test set data (line 0 is the header, so start at 1)
    for i in range(1, num_of_instances):
        try:
            emotion, img, usage = lines[i].split(",")

            val = img.split(" ")

            pixels = np.array(val, 'float32')

            # one-hot encode the integer label
            emotion = keras.utils.to_categorical(int(emotion), num_classes)

            if 'Training' in usage:
                y_train.append(emotion)
                x_train.append(pixels)
            elif 'PublicTest' in usage:
                y_test.append(emotion)
                x_test.append(pixels)
        except ValueError:
            # skip malformed rows (wrong field count or non-numeric values)
            continue
  
    # ------------------------------  
    # data transformation for train and test sets  
    x_train = np.array(x_train, 'float32')  
    y_train = np.array(y_train, 'float32')  
    x_test = np.array(x_test, 'float32')  
    y_test = np.array(y_test, 'float32')  
  
    x_train /= 255  # normalize inputs between [0, 1]  
    x_test /= 255  
  
    x_train = x_train.reshape(x_train.shape[0], 48, 48, 1)  
    x_train = x_train.astype('float32')  
    x_test = x_test.reshape(x_test.shape[0], 48, 48, 1)  
    x_test = x_test.astype('float32')  
  
    print(x_train.shape[0], 'train samples')  
    print(x_test.shape[0], 'test samples')  
  
    y_train = y_train.reshape(y_train.shape[0], 7)  
    y_train = y_train.astype('int16')  
    y_test = y_test.reshape(y_test.shape[0], 7)  
    y_test = y_test.astype('int16')  
  
    print('--------x_train.shape:', x_train.shape)  
    print('--------y_train.shape:', y_train.shape)  
  
  
    print(len(x_train), 'train x size')  
    print(len(y_train), 'train y size')  
    print(len(x_test), 'test x size')  
    print(len(y_test), 'test y size')  
  
    return x_train, y_train, x_test, y_test
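
To see what this preprocessing does to a single row, here is a small self-contained sketch that uses only NumPy. The helper parse_row and the synthetic row are made up for illustration; the one-hot step is written by hand and is meant to give the same result as to_categorical for a single label.

```python
import numpy as np


def parse_row(line, num_classes=7, size=48):
    """Turn one 'emotion,pixels,usage' CSV line into (image, one_hot, usage)."""
    emotion, pixels, usage = line.strip().split(",")
    # Pixel values are space-separated integers in 0..255; scale them to [0, 1].
    image = np.array(pixels.split(" "), dtype="float32") / 255.0
    image = image.reshape(size, size, 1)  # height x width x channels
    # Hand-written one-hot encoding of the label.
    one_hot = np.zeros(num_classes, dtype="float32")
    one_hot[int(emotion)] = 1.0
    return image, one_hot, usage


# Synthetic row: label 3 (happy), 48 * 48 pixels all equal to 255, Training split.
row = "3," + " ".join(["255"] * (48 * 48)) + ",Training\n"
image, one_hot, usage = parse_row(row)
print(image.shape, one_hot, usage)
```

The trailing channel dimension of 1 is what lets the grayscale image match the input_shape=(48, 48, 1) expected by the first convolution layer.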

The model mainly uses three convolutional layers and two fully connected layers, and a final softmax layer outputs the probability of each class. The main model code is as follows:

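from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, AveragePooling2D, Flatten, Dense, Dropout

def build_model(num_classes):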
  # construct CNN structure  
  model = Sequential()  
  # 1st convolution layer  
  model.add(Conv2D(64, (5, 5), activation='relu', input_shape=(48, 48, 1)))  
  model.add(MaxPooling2D(pool_size=(5, 5), strides=(2, 2)))  
  
  # 2nd convolution layer  
  model.add(Conv2D(64, (3, 3), activation='relu'))  
  # model.add(Conv2D(64, (3, 3), activation='relu'))  
  model.add(AveragePooling2D(pool_size=(3, 3), strides=(2, 2)))  
  
  # 3rd convolution layer  
  model.add(Conv2D(128, (3, 3), activation='relu'))  
  # model.add(Conv2D(128, (3, 3), activation='relu'))  
  model.add(AveragePooling2D(pool_size=(3, 3), strides=(2, 2)))  
  
  model.add(Flatten())  
  
  # fully connected neural networks  
  model.add(Dense(1024, activation='relu'))  
  model.add(Dropout(0.2))  
  model.add(Dense(1024, activation='relu'))  
  model.add(Dropout(0.2))  
  
  model.add(Dense(num_classes, activation='softmax'))  
  
  return model
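
It can help to trace how the 48 × 48 input shrinks through these layers. The sketch below does that arithmetic in plain Python, assuming every convolution and pooling layer uses no padding (the Keras default of padding='valid', which the model code does not override). The helper out_size and the layers table are illustrative names only.

```python
def out_size(n, kernel, stride=1):
    """Output length of a window operation with no padding ('valid')."""
    return (n - kernel) // stride + 1


# (layer name, kernel size, stride, output channels or None to keep channels)
layers = [
    ("conv 5x5", 5, 1, 64),
    ("max pool 5x5 / 2", 5, 2, None),
    ("conv 3x3", 3, 1, 64),
    ("avg pool 3x3 / 2", 3, 2, None),
    ("conv 3x3", 3, 1, 128),
    ("avg pool 3x3 / 2", 3, 2, None),
]

size, channels = 48, 1
for name, kernel, stride, out_channels in layers:
    size = out_size(size, kernel, stride)
    channels = out_channels or channels
    print(f"{name:18s} -> {size} x {size} x {channels}")

flat = size * size * channels
print("flatten ->", flat)  # 512
```

Under that assumption the spatial size goes 48, 44, 20, 18, 8, 6, 2, so Flatten hands 2 × 2 × 128 = 512 values to the first Dense(1024) layer.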

IV. Training results

The experiment uses steps_per_epoch = 256 and epoch = 5. With data augmentation, the results are as follows:

Figure 2: training results with data augmentation

Without data augmentation, the results are as follows:

Figure 3: training results without data augmentation

We can clearly see that the model trained without augmentation does reach a higher accuracy. This is expected: improving the model's generalization ability through augmentation usually lowers its accuracy on the training set. We can, however, use a few extra images to observe the difference between the two models. The results are as follows (left: with data augmentation; right: without):

Figure 4: effect comparison

From these results we can see that, on the happy image, the model trained with data augmentation judges more accurately than the one trained without it. On the Mona Lisa, the augmented model's prediction is also more in line with the common impression that her expression is mysterious.
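
For readers who want to inspect predictions like these themselves, the sketch below shows how a softmax output can be turned into a label and a confidence value. It is plain Python, the scores are made up for illustration and are not output from either trained model, and the names LABELS, softmax, and top_prediction are assumptions for this example.

```python
import math

LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(probs):
    """Return (label, probability) for the most likely class."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best], probs[best]


# Made-up scores for illustration only; not output from the trained models.
probs = softmax([0.1, -1.2, 0.3, 2.5, 0.0, 0.4, 1.1])
label, p = top_prediction(probs)
print(label, round(p, 3))
```

Looking at the full probability vector rather than only the top label is useful for ambiguous faces, where two classes may receive similar probabilities.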

V. Summary

In this article we used only a simple CNN, and its accuracy is not especially high. Even so, existing techniques like this can support online teaching by helping teachers analyze how students are following the lesson. And if students know that an invisible pair of eyes is watching them, they may also take online classes more seriously, which would improve the efficiency of online teaching.

Origin blog.csdn.net/zxjoke/article/details/105139665