Facial Emotion Recognition Using CNNs

Facial expressions are an important way for humans to communicate with each other.

In artificial intelligence research, deep learning has become a powerful tool for improving human-computer interaction, and the automatic analysis of facial expressions gives machines a way to estimate the emotional state of individuals or groups.

This research aims to develop a system capable of predicting and classifying facial emotions using convolutional neural network (CNN) algorithms and feature extraction techniques.

The process consists of three main stages: data preprocessing, facial feature extraction, and facial emotion classification. Using a convolutional neural network (CNN), the system classifies facial expressions with an accuracy of 62.66%.

The performance of the algorithm is evaluated on the FER2013 database, a publicly available dataset containing 35,887 grayscale facial images of 48x48 pixels, each labeled with one of seven emotions.

Now let's start with coding.

!pip install scikit-plot

This command installs the scikit-plot package using pip. scikit-plot is a Python package that provides a set of useful tools for visualizing the performance of machine learning models.

Specifically, scikit-plot provides a variety of functions to generate common plots used in model evaluation, such as ROC curves, precision-recall curves, confusion matrices, etc.

After executing the command "!pip install scikit-plot" in the Python environment, you should be able to import and use scikit-plot functions in your code.
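As a quick illustration, here is a minimal sketch (not part of the original notebook) of how scikit-plot is typically used once true and predicted labels are available; the label arrays below are hypothetical placeholders.

import matplotlib.pyplot as plt
import scikitplot as skplt

y_true_demo = [0, 1, 2, 2, 1, 0]   # hypothetical ground-truth labels
y_pred_demo = [0, 2, 2, 2, 1, 0]   # hypothetical predicted labels

# Draw a confusion matrix for the demo labels
skplt.metrics.plot_confusion_matrix(y_true_demo, y_pred_demo)
plt.show()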

import os
import random
import warnings

import numpy as np
import pandas as pd
import seaborn as sns
import scikitplot
import plotly.express as px
import matplotlib.pyplot as plt

import tensorflow as tf
import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam, RMSprop, SGD, Adamax
from tensorflow.keras.utils import to_categorical

from keras.models import Model
from keras.callbacks import ModelCheckpoint, EarlyStopping
from keras import regularizers
from keras.regularizers import l1, l2
from keras.preprocessing.image import ImageDataGenerator, load_img
from keras.utils.vis_utils import plot_model
from keras.layers import (Conv2D, MaxPool2D, MaxPooling2D, Flatten, Dense,
                          Dropout, BatchNormalization, Activation, Input)

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

warnings.simplefilter("ignore")

The code imports various Python libraries and modules commonly used in machine learning and deep learning tasks.

These libraries include pandas, numpy, scikit-plot, random, seaborn, keras, os, matplotlib, tensorflow, and scikit-learn.

Each import statement imports a specific set of tools or functions needed to perform machine learning or deep learning tasks, such as data manipulation, data visualization, model building, and performance evaluation.

Overall, this code prepares the necessary tools and modules needed to perform various machine learning and deep learning tasks such as data preprocessing, model training, and model evaluation.

Download the code from here: http://onepagecode.s3-website-us-east-1.amazonaws.com/

load dataset

data = pd.read_csv("../input/fer2013/fer2013.csv")
data.shape

This code uses the pandas read_csv() function to read the CSV file "fer2013.csv" from the "../input/fer2013/" directory and assigns the resulting DataFrame to a variable named data.

The shape attribute of the DataFrame is then accessed to retrieve its dimensions, which are returned as a tuple of the form (rows, columns). This line outputs the number of rows and columns in data.

data.isnull().sum()

This code returns the number of missing values in each column of the data DataFrame.

The isnull() method of a DataFrame returns a boolean DataFrame indicating whether each element of the original DataFrame is missing. The sum() method is then applied to this boolean DataFrame, which counts the missing values in each column.

This is a quick way to check whether a DataFrame contains any missing values. If there are missing values, you may need to impute or remove them before using the data for modeling.
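FER2013 itself normally contains no missing values, but as a hedged sketch, this is how missing rows could be dropped before modeling if any were found:

# Only needed if the check above reports missing values (assumption: dropping rows is acceptable)
if data.isnull().sum().sum() > 0:
    data = data.dropna().reset_index(drop=True)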

data.head()

This code returns the first five rows of the data DataFrame.

The head() method of a DataFrame returns its first n rows (n=5 by default). This is a useful way to quickly inspect data in a DataFrame, especially when working with large datasets.

The output displays the first five rows of data, including the column names, depending on the structure of the DataFrame.

Output: the first five rows of the dataframe.

data preprocessing

CLASS_LABELS  = ['Anger', 'Disgust', 'Fear', 'Happy', 'Sadness', 'Surprise', 'Neutral']  # FER2013 label order 0-6
fig = px.bar(x = CLASS_LABELS,
             y = [list(data['emotion']).count(i) for i in np.unique(data['emotion'])] , 
             color = np.unique(data['emotion']) ,
             color_continuous_scale="Emrld") 
fig.update_xaxes(title="Emotions")
fig.update_yaxes(title = "Number of Images")
fig.update_layout(showlegend = True,
    title = {
        'text': 'Train Data Distribution ',
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'})
fig.show()

This code uses the Plotly Express library to create a bar chart showing the distribution of emotions in the data DataFrame.

First, CLASS_LABELS defines the list of class labels corresponding to the different emotions in the dataset.

Then px.bar() is called, with the x-axis representing the class labels and the y-axis representing the number of images for each emotion. The color argument is set to the different emotion classes, and color_continuous_scale is set to "Emrld", a predefined color scale in Plotly Express.

Next, several update methods modify the plot's layout and appearance: update_xaxes() and update_yaxes() set the x-axis and y-axis titles, and update_layout() sets the plot title and its position.

Finally, show() is called on the figure object to display the plot.

The output is a bar chart showing the number of images for each emotion in data, with each emotion color-coded according to the specified color scale.

Shuffle the data randomly

data = data.sample(frac=1)

The sample() method of a DataFrame randomly samples a fraction of its rows; the frac argument specifies the portion of rows to return (here frac=1, meaning all rows are returned). With frac=1, sample() effectively shuffles the rows of the DataFrame.

This is a common operation in machine learning and deep learning tasks, and it is important to randomly shuffle the data to prevent any bias that might be introduced if the data has any inherent order or structure.

One-Hot Encoding

labels = to_categorical(data[['emotion']], num_classes=7)

The output is a numpy array of shape (n_samples, n_classes), where:

  • n_samples is the number of samples in the DataFrame

  • n_classes is the number of unique classes in the data (7 in this case)

  • Each row of the array is the one-hot encoded label of a single sample in data.
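As a small illustration (with made-up labels), this is what to_categorical() produces for three samples:

from tensorflow.keras.utils import to_categorical

example = to_categorical([0, 3, 6], num_classes=7)
print(example)
# [[1. 0. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 1. 0. 0. 0.]
#  [0. 0. 0. 0. 0. 0. 1.]]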

train_pixels = data["pixels"].astype(str).str.split(" ").tolist()
train_pixels = np.uint8(train_pixels)

This code preprocesses the pixel values stored in the pixels column of the data DataFrame.

First, astype(str) converts the pixels column to strings, which allows split() to be called on each row.

Next, split(" ") is applied to each row to break the pixel string into a list of value strings, and tolist() converts the result into a plain Python list.

Finally, np.uint8() converts the pixel values from strings to unsigned 8-bit integers, the data type typically used for image pixel values.

The output is a numpy array of shape (n_samples, n_pixels), where n_samples is the number of samples in the DataFrame and n_pixels is the number of pixels per image (48 x 48 = 2304). Each row of the array represents the pixel values of a single image.
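To make the conversion concrete, here is a sketch using a shortened, hypothetical pixels string; a real row contains 2304 space-separated values that reshape to 48x48:

row = "70 80 82 72"                      # hypothetical shortened example of one 'pixels' string
values = np.uint8(row.split(" "))        # array([70, 80, 82, 72], dtype=uint8)
# A full row would be reshaped with: np.uint8(row.split(" ")).reshape(48, 48)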

standardization

pixels = train_pixels.reshape((35887*2304,1))

This code reshapes the train_pixels numpy array from a 2-D array of shape (n_samples, n_pixels) into a 2-D array of shape (n_samples * n_pixels, 1).

The reshape() method of a numpy array changes its shape. In this case, the train_pixels array is flattened into a single column.

The resulting pixels array has shape (35887 * 2304, 1): each row holds one pixel value of one image. Flattening to a single column lets StandardScaler treat every pixel value as one observation of a single feature in the next step.

scaler = StandardScaler()
pixels = scaler.fit_transform(pixels)

This code standardizes the pixels numpy array using scikit-learn's StandardScaler().

StandardScaler() scales each feature (here, the single column of pixel values) to have a mean of 0 and a standard deviation of 1. This is a common preprocessing step in machine learning and deep learning to ensure that features contribute on a comparable scale.

The fit_transform() method of the StandardScaler object computes the mean and standard deviation of the data and scales it accordingly. The resulting scaled data is assigned back to pixels.

The output is a numpy array of the same shape as the input, but with each pixel value standardized.

Reshape Data (48, 48)

pixels = pixels.reshape((35887, 48, 48, 1))  # reshape the standardized values back into 48x48 single-channel images

This code reshapes the standardized pixel array from a 2-D array of shape (n_samples * n_pixels, 1) into a 4-D array of shape (n_samples, n_rows, n_columns, n_channels).

The reshape() method of a numpy array changes its shape; here the data is reshaped into a 4-D array with a single channel.

The resulting pixels array has shape (n_samples, n_rows, n_columns, n_channels) = (35887, 48, 48, 1), where n_samples is the number of samples, n_rows and n_columns are the height and width of each image, and n_channels is the number of color channels.

Since the dataset is grayscale, n_channels is set to 1. Each element of the array represents one pixel value of one grayscale image.

Train Test Validation Split

Now we have 35,887 images, each of 48x48 pixels. We split the data into training, validation, and test sets, holding out 10% for testing and then 10% of the remaining data for validation.

X_train, X_test, y_train, y_test = train_test_split(pixels, labels, test_size=0.1, shuffle=False)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.1, shuffle=False)

The code uses scikit-learn's train_test_split() function to split preprocessed image data pixels and one-hot-encoded label labels into train, validation, and test sets.

The function train_test_split() randomly splits the data into training and test subsets according to the test_size parameter, and test_size specifies the part of the data that should be used for testing. In this case, test_size=0.1, which means 10% of the data will be used for testing.

The shuffle parameter is set to False to preserve the original order of samples in the DataFrame.

The resulting X_train, X_val, and X_test arrays contain the pixel values for the training, validation, and test sets, respectively. The y_train, y_val, and y_test arrays contain the corresponding one-hot encoded labels.

train_test_split() is then applied again to split the training set into training and validation subsets with test_size=0.1. Overall, this leaves roughly 81% of the data for training, 9% for validation, and 10% for testing.

print(X_train.shape)
print(X_test.shape)
print(X_val.shape)

After splitting the data into train, validation, and test sets, these lines of code print the shape of the X_train, X_test, and X_val arrays.

The shape attribute of a numpy array returns a tuple of the array's dimensions. In this case, the shape of the X_train, X_test, and X_val arrays will depend on the number of samples in each set and the dimensions of each sample.

The output shows each array's shape in the format (n_samples, n_rows, n_columns, n_channels), where n_samples is the number of samples in the set, n_rows and n_columns are the height and width of each image, and n_channels is the number of color channels per image.

The following plotting code displays the first seven training images along with their labels.

plt.figure(figsize=(15, 23))
label_dict = {0 : 'Angry', 1 : 'Disgust', 2 : 'Fear', 3 : 'Happiness', 4 : 'Sad', 5 : 'Surprise', 6 : 'Neutral'}
for i in range(7):
    img = np.squeeze(X_train[i])        # drop the channel dimension for plotting
    plt.subplot(1, 7, i + 1)
    plt.imshow(img)
    index = np.argmax(y_train[i])       # recover the integer label from the one-hot vector
    plt.title(label_dict[index])
    plt.axis('off')
plt.show()

This code uses matplotlib's plt.subplot() function to lay out a 1x7 row of subplots showing images from the training set.

The squeeze() method of numpy arrays removes any single-dimensional entries from the array's shape, effectively turning each (48, 48, 1) image into a (48, 48) array that imshow() can display.

For each subplot, imshow() displays the corresponding image and title() displays the corresponding label.

axis('off') turns off the axes for each subplot.

The output is a visualization of the first 7 images in the training set, along with their corresponding labels.

Data Augmentation Using Image Data Generator

We can use data augmentation to generate more varied data for training and validation, which helps prevent overfitting. Here augmentation is applied to both the training and validation sets, with the aim of making the model more general and robust.

datagen = ImageDataGenerator(  width_shift_range = 0.1,
                               height_shift_range = 0.1,
                               horizontal_flip = True,
                               zoom_range = 0.2)
valgen = ImageDataGenerator(   width_shift_range = 0.1,
                               height_shift_range = 0.1,
                               horizontal_flip = True,
                               zoom_range = 0.2)

This code creates two ImageDataGenerator objects, datagen and valgen, which will be used for data augmentation during training and validation.

The ImageDataGenerator class is a Keras preprocessing utility that can perform various types of image augmentation such as shifting, flipping, rotating, and scaling in real time.

The datagen object includes a number of enhancements:

  • width_shift_range and height_shift_range randomly shift the image horizontally and vertically by up to 10% of the image width and height, respectively.

  • horizontal_flip randomly flips the image horizontally.

  • zoom_range randomly zooms the image in or out by up to 20%.

The valgen object uses the same augmentation settings as datagen, but is applied to the validation set during training.

By applying data augmentation during training, the model is exposed to a larger and more diverse training dataset, which helps prevent overfitting and improves the model's ability to generalize to new data.
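To see what these settings actually do, here is an illustrative sketch (not in the original code) that plots a few augmented versions of one training image:

sample = X_train[0:1]                         # one image, shape (1, 48, 48, 1)
aug_iter = datagen.flow(sample, batch_size=1)
plt.figure(figsize=(10, 2))
for i in range(5):
    aug_img = next(aug_iter)[0]               # one augmented image, shape (48, 48, 1)
    plt.subplot(1, 5, i + 1)
    plt.imshow(np.squeeze(aug_img), cmap='gray')
    plt.axis('off')
plt.show()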

datagen.fit(X_train)
valgen.fit(X_val)

These lines of code fit the ImageDataGenerator objects datagen and valgen to the training and validation data, respectively.

The fit() method of an ImageDataGenerator computes any internal statistics needed for certain augmentation options, such as the mean and variance of the pixel values used by featurewise centering or standardization. With the transformations chosen here it is not strictly required, but it is called on datagen and valgen with the training and validation sets to keep the workflow consistent.

After the ImageDataGenerator objects are fitted to the data, they can be used to apply data augmentation in real time during training and validation.

train_generator = datagen.flow(X_train, y_train, batch_size=64)
val_generator = valgen.flow(X_val, y_val, batch_size=64)

These lines of code create two ImageDataGenerator iterators, train_generator and val_generator, that can be used to generate a batch of augmented data during training and validation.

The flow() method of the ImageDataGenerator object receives input data and a numpy array of labels, and dynamically generates a batch of augmented data.

Here, flow() is called on datagen with the training data X_train and y_train and a batch size of 64 to create train_generator. val_generator is created in the same way on valgen with the validation data X_val and y_val, also with a batch size of 64.

During training, train_generator (iterator) will be used to dynamically generate a batch of augmented data for each training epoch. Similarly, the val_generator iterator will be used to generate a batch of augmented data for each validation epoch.
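As a quick sanity check (illustrative, not part of the original notebook), one batch can be pulled from the generator to confirm the shapes it yields:

batch_images, batch_labels = next(train_generator)
print(batch_images.shape)   # expected: (64, 48, 48, 1)
print(batch_labels.shape)   # expected: (64, 7)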

code download

http://onepagecode.s3-website-us-east-1.amazonaws.com/

design model

Convolutional Neural Network (CNN) Models

A CNN model has many layers with different units such as convolutional layers, max pooling layers, batch normalization and dropout layers to regularize the model.

def cnn_model():
  model= tf.keras.models.Sequential()
  model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu', input_shape=(48, 48,1)))
  model.add(Conv2D(64,(3,3), padding='same', activation='relu' ))
  model.add(BatchNormalization())
  model.add(MaxPool2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))
  model.add(Conv2D(128,(5,5), padding='same', activation='relu'))
  model.add(BatchNormalization())
  model.add(MaxPool2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))
      
  model.add(Conv2D(512,(3,3), padding='same', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
  model.add(BatchNormalization())
  model.add(MaxPool2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))
  model.add(Conv2D(512,(3,3), padding='same', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
  model.add(BatchNormalization())
  model.add(MaxPool2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))
  model.add(Conv2D(512,(3,3), padding='same', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
  model.add(BatchNormalization())
  model.add(MaxPool2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))
  model.add(Flatten()) 
  model.add(Dense(256,activation = 'relu'))
  model.add(BatchNormalization())
  model.add(Dropout(0.25))
      
  model.add(Dense(512,activation = 'relu'))
  model.add(BatchNormalization())
  model.add(Dropout(0.25))
  model.add(Dense(7, activation='softmax'))
  model.compile(
    optimizer = Adam(lr=0.0001), 
    loss='categorical_crossentropy', 
    metrics=['accuracy'])
  return model

The code defines a Convolutional Neural Network (CNN) model using the Keras Sequential API.

The CNN architecture consists of several convolutional layers with batch normalization, max pooling and dropout regularization, followed by several fully connected (dense) layers with batch normalization and dropout. The last layer uses a softmax activation function to output a probability distribution over the 7 possible emotion categories.

The Conv2D layer creates a convolution kernel that is convolved with the layer input to produce an output tensor.

The BatchNormalization layer applies a transformation that keeps the mean activation close to 0 and the activation standard deviation close to 1.

The MaxPooling2D layer downsamples the input along the spatial dimension.

The dropout layer randomly drops some units during training to prevent overfitting.

The Flatten layer converts the final 3-D feature maps into a 1-D vector before it is passed to the fully connected (Dense) layers.

The model's compile() method specifies the optimizer, loss function, and evaluation metric to use during training. In this case, the optimizer is Adam, the learning rate is 0.0001, the loss function is categorical cross-entropy, and the evaluation metric is accuracy.

This function returns the compiled model object.

model = cnn_model()

This line of code creates a new instance of the CNN model by calling the cnn_model() function defined earlier.

A model object represents a neural network model that can be trained on data to predict emotion labels for facial images.

We then compile our model using the Adam optimizer with a learning rate of 0.0001, choosing accuracy as the metric and categorical cross-entropy as the loss.

model.compile(
    optimizer = Adam(lr=0.0001), 
    loss='categorical_crossentropy', 
    metrics=['accuracy'])

This line of code compiles the CNN model by specifying the optimizer, loss function, and evaluation metric to use during training.

The compile() method of a Keras model configures the learning process before training. In this case, the optimizer is Adam with a learning rate of 0.0001, the loss function is categorical cross-entropy, and the evaluation metric is accuracy.

The optimizer is responsible for updating the model parameters during training, Adam is a popular optimization algorithm that adjusts the learning rate according to the gradient of the loss function.

The loss function is used to calculate the difference between the predicted label and the actual label, and categorical cross-entropy is a standard loss function for multi-class classification problems. Accuracy metrics are used to evaluate a model's performance during training and validation.
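For intuition, here is a small sketch (with made-up numbers) of what categorical cross-entropy computes for a single sample:

# Hypothetical one-hot label and predicted probability vector for one sample
y_true_onehot = np.array([0., 0., 0., 1., 0., 0., 0.])          # true class is index 3
y_prob = np.array([0.05, 0.05, 0.10, 0.60, 0.05, 0.05, 0.10])   # model's predicted probabilities
loss_value = -np.sum(y_true_onehot * np.log(y_prob))
print(loss_value)   # -log(0.60) ≈ 0.51; the loss shrinks as the true class gets higher probability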

model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_6 (Conv2D)            (None, 48, 48, 32)        320       
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 48, 48, 64)        18496     
_________________________________________________________________
batch_normalization_7 (Batch (None, 48, 48, 64)        256       
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 24, 24, 64)        0         
_________________________________________________________________
dropout_7 (Dropout)          (None, 24, 24, 64)        0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 24, 24, 128)       204928    
_________________________________________________________________
batch_normalization_8 (Batch (None, 24, 24, 128)       512       
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 12, 12, 128)       0         
_________________________________________________________________
dropout_8 (Dropout)          (None, 12, 12, 128)       0         
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 12, 12, 512)       590336    
_________________________________________________________________
batch_normalization_9 (Batch (None, 12, 12, 512)       2048      
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 6, 6, 512)         0         
_________________________________________________________________
dropout_9 (Dropout)          (None, 6, 6, 512)         0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 6, 6, 512)         2359808   
_________________________________________________________________
batch_normalization_10 (Batc (None, 6, 6, 512)         2048      
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 3, 3, 512)         0         
_________________________________________________________________
dropout_10 (Dropout)         (None, 3, 3, 512)         0         
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 3, 3, 512)         2359808   
_________________________________________________________________
batch_normalization_11 (Batc (None, 3, 3, 512)         2048      
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 1, 1, 512)         0         
_________________________________________________________________
dropout_11 (Dropout)         (None, 1, 1, 512)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 256)               131328    
_________________________________________________________________
batch_normalization_12 (Batc (None, 256)               1024      
_________________________________________________________________
dropout_12 (Dropout)         (None, 256)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 512)               131584    
_________________________________________________________________
batch_normalization_13 (Batc (None, 512)               2048      
_________________________________________________________________
dropout_13 (Dropout)         (None, 512)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 7)                 3591      
=================================================================
Total params: 5,810,183
Trainable params: 5,805,191
Non-trainable params: 4,992
_________________________________________________________________

This line of code prints a summary of the CNN model architecture.

The summary() method of a Keras model prints a summary of the model architecture, including the parameters in each layer, the output shape of each layer, and the total number of parameters in the model.

The summary includes information about each layer in the model, including layer type, output shape, number of parameters, and activation function (if applicable). The summary also includes information about the total number of trainable parameters in the model, which is useful for understanding model complexity and potential for overfitting.
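As a sanity check on the summary, the parameter counts of the first two convolutional layers can be reproduced by hand:

# Conv2D(32, 3x3) on a 1-channel input: (3*3*1) weights per filter * 32 filters + 32 biases
print((3 * 3 * 1) * 32 + 32)      # 320
# Conv2D(64, 3x3) on 32 input channels: (3*3*32) weights per filter * 64 filters + 64 biases
print((3 * 3 * 32) * 64 + 64)     # 18496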

Early stopping

We add checkpointing and early-stopping callbacks to prevent overfitting.

checkpointer = [EarlyStopping(monitor = 'val_accuracy', verbose = 1, 
                              restore_best_weights=True,mode="max",patience = 5),
                ModelCheckpoint('best_model.h5',monitor="val_accuracy",verbose=1,
                                save_best_only=True,mode="max")]

This code defines a list of Keras callbacks that will be used during CNN model training.

Callbacks are functions that can be applied at various stages in the training process, such as at the end of each epoch or when validation accuracy reaches a certain threshold. They can be used to perform operations such as saving the best model weights, stopping early to prevent overfitting, or reducing the learning rate when the model is not improving.

In this case, checkpointerthe list contains two callbacks:

  1. EarlyStopping: This callback monitors the validation accuracy and stops training if it does not improve for a given number of epochs (set by the patience parameter). The restore_best_weights parameter is set to True to restore the weights of the best model after training stops.

  2. ModelCheckpoint: This callback saves the weights of the best model seen during training to a file named best_model.h5. The save_best_only parameter is set to True so that only the weights that achieve the highest validation accuracy are saved.
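As a side note (a sketch, assuming best_model.h5 was written during training), the checkpointed weights can later be reloaded like this:

from keras.models import load_model

best_model = load_model('best_model.h5')   # restores the model saved by ModelCheckpoint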

history = model.fit(train_generator,
                    epochs=30,
                    batch_size=64,   
                    verbose=1,
                    callbacks=[checkpointer],
                    validation_data=val_generator)

This code trains a CNN model on the training data using the fit() method.

The fit() method of a Keras model trains it on the input data for a specified number of epochs. In this case, the model is trained for 30 epochs on batches of 64 images (the batch size is set in the generators).

The train_generator and val_generator objects generate batches of augmented images for training and validation, respectively. The callbacks parameter is set to the checkpointer list defined earlier, which specifies the early stopping and model checkpoint callbacks to use during training.

The history object returned by fit() contains information about the training process, including the training and validation loss and accuracy for each epoch. This information can be used to visualize the model's performance over time and to make decisions about further training or tuning.

visualize the results

plt.plot(history.history["loss"],'r', label="Training Loss")
plt.plot(history.history["val_loss"],'b', label="Validation Loss")
plt.legend()

This code plots the training and validation loss of a CNN model during training.

The history object returned by fit() contains information about the training process, including the training and validation loss and accuracy for each epoch. This information can be used to visualize the model's performance over time.

The plt.plot() function is used to plot the training loss in red and the validation loss in blue, and the label parameter specifies the legend label for each row. The legend() function is called to display the legend on the plot.

This code allows us to see how well the model is learning and whether it is overfitting to the training data. If the validation loss starts to increase while the training loss continues to decrease, it indicates overfitting, which means the model is memorizing the training data and not generalizing well to new data.

plt.plot(history.history["accuracy"],'r',label="Training Accuracy")
plt.plot(history.history["val_accuracy"],'b',label="Validation Accuracy")
plt.legend()

This code plots the training and validation accuracy of the CNN model during training.

The history object returned by fit() contains information about the training process, including the training and validation loss and accuracy for each epoch. This information can be used to visualize the model's performance over time.

The plt.plot() function is used to plot the training accuracy in red and the validation accuracy in blue, and the label parameter specifies the legend label for each row. The legend() function is called to display the legend on the plot.

This code allows us to see how well the model is learning and whether it is overfitting to the training data. If validation accuracy starts to drop while training accuracy continues to increase, it indicates overfitting, which means the model is memorizing the training data rather than generalizing well to new data.

loss = model.evaluate(X_test,y_test) 
print("Test Acc: " + str(loss[1]))

This code evaluates the performance of a trained CNN model on a test set.

The evaluate() method of a Keras model computes the loss and the metrics (specified during model compilation) on a given test set.

The X_test and y_test arrays contain the test images and their corresponding labels. model.evaluate() computes the model's loss and accuracy on the test set and returns them as a list of values.

The test accuracy is printed from the list returned by model.evaluate(). It gives us an idea of how well the model performs on new, unseen data.

preds = model.predict(X_test)
y_pred = np.argmax(preds , axis = 1 )

This code uses the trained CNN model to generate predictions for the test set.

The predict() method of a Keras model generates predictions for a given input dataset. In this case, the X_test array contains the test images we want to make predictions on.

The preds array contains the predicted probabilities of each class for each test image, with each row of the array corresponding to a test image and each column to a class.

np.argmax() extracts, for each test image, the index of the class with the highest predicted probability. This gives us the predicted class labels for the test set.

The predicted class labels can then be compared with the true class labels in y_test to evaluate the model's performance on the test set.
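Since accuracy_score was imported earlier, it can also serve as a quick cross-check of the accuracy reported by evaluate() (an optional sketch, not in the original flow):

from sklearn.metrics import accuracy_score

print(accuracy_score(np.argmax(y_test, axis=1), y_pred))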

label_dict = {0 : 'Angry', 1 : 'Disgust', 2 : 'Fear', 3 : 'Happiness', 4 : 'Sad', 5 : 'Surprise', 6 : 'Neutral'}
figure = plt.figure(figsize=(20, 8))
for i, index in enumerate(np.random.choice(X_test.shape[0], size=24, replace=False)):
    ax = figure.add_subplot(4, 6, i + 1, xticks=[], yticks=[])
    ax.imshow(np.squeeze(X_test[index]))
    predict_index = label_dict[(y_pred[index])]
    true_index = label_dict[np.argmax(y_test,axis=1)[index]]
    
    ax.set_title("{} ({})".format((predict_index), 
                                  (true_index)),
                                  color=("green" if predict_index == true_index else "red"))

This code generates a random subset of test images and a visualization of their true and predicted labels.

The label_dict dictionary maps integer class labels to their corresponding string labels.

The code then generates a figure with 24 subplots (4 rows and 6 columns), each showing a random test image along with its predicted and true labels. np.random.choice() selects 24 indices from the X_test array at random, without replacement.

For each subplot, use imshow() to display the test image, and use set_title() to display the predicted and true labels in the title of the subplot. If the predicted label matches the true label, it is highlighted in green, otherwise it is highlighted in red.

CLASS_LABELS  = ['Anger', 'Disgust', 'Fear', 'Happy', 'Sadness', 'Surprise', 'Neutral']  # FER2013 label order 0-6
cm_data = confusion_matrix(np.argmax(y_test, axis = 1 ), y_pred)
cm = pd.DataFrame(cm_data, columns=CLASS_LABELS, index = CLASS_LABELS)
cm.index.name = 'Actual'
cm.columns.name = 'Predicted'
plt.figure(figsize = (15,10))
plt.title('Confusion Matrix', fontsize = 20)
sns.set(font_scale=1.2)
ax = sns.heatmap(cm, cbar=False, cmap="Blues", annot=True, annot_kws={"size": 16}, fmt='g')

This code generates a heatmap of the confusion matrix predicted by the model on the test set.

CLASS_LABELSThe list contains the names of seven emotion classes.

The confusion_matrix() function from scikit-learn's metrics module computes the confusion matrix of the model's predictions on the test set. np.argmax() converts the one-hot encoded true labels into integer labels so they can be compared with y_pred.

The resulting confusion matrix is stored in a pandas DataFrame cm with the class names as row and column labels. The DataFrame is then displayed as a heatmap using seaborn's heatmap() function. The heatmap is annotated with the values of the confusion matrix, and the font size is increased using sns.set(font_scale=1.2).

from sklearn.metrics import classification_report
print(classification_report(np.argmax(y_test, axis = 1 ),y_pred,digits=3))

The output of classification_report() is interpreted as follows:

  • Precision: The proportion of predicted positive cases that are actually positive. Mathematically, it is TP/(TP+FP), where TP is the number of true positives and FP is the number of false positives. A high precision score indicates that the model is accurate when it predicts the positive class.

  • Recall: The proportion of actual positive cases that the model correctly predicted as positive. Mathematically, it is TP/(TP+FN), where FN is the number of false negatives. A high recall score indicates that the model was able to correctly identify positive cases.

  • F1 score: The harmonic mean of precision and recall. It takes both precision and recall into account and provides a single score that balances the two. Mathematically, it is 2 * (precision * recall) / (precision + recall).

  • Support: The number of times the class actually occurs in the test data.

The following describes the columns of the classification report (one row per class):

  • Precision: the first column shows the precision score for each class.

  • Recall: the second column shows the recall score for each class.

  • F1 score: the third column shows the F1 score for each class.

  • Support: the last column shows the number of occurrences of each class in the test data.

Note that the macro and weighted averages are also at the bottom of the report.
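To connect the report back to the confusion matrix, here is a sketch showing how precision and recall for a single class (class 0, 'Anger') can be derived from the cm_data array computed above:

tp = cm_data[0, 0]                    # class-0 samples predicted as class 0
fp = cm_data[:, 0].sum() - tp         # other classes incorrectly predicted as class 0
fn = cm_data[0, :].sum() - tp         # class-0 samples predicted as something else
precision_0 = tp / (tp + fp)
recall_0 = tp / (tp + fn)
print(precision_0, recall_0)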

Fine-tuning the model

model = cnn_model()
model.compile(optimizer=tf.keras.optimizers.SGD(0.001),
                loss='categorical_crossentropy',
                metrics = ['accuracy'])

The optimizer for this model has been changed from Adam to SGD with a learning rate of 0.001. The loss function remains the same, i.e. categorical cross-entropy, and accuracy is still used as the metric.

history = model.fit(train_generator,
                    epochs=30,
                    batch_size=64,   
                    verbose=1,
                    callbacks=[checkpointer],
                    validation_data=val_generator)

Using train_generator and val_generator as training and validation data respectively, the model is trained again for 30 epochs with a batch size of 64.

The checkpointer callback is also used to save the best model based on validation accuracy.

loss = model.evaluate(X_test,y_test) 
print("Test Acc: " + str(loss[1]))

Print the test accuracy after fine-tuning the model using the SGD optimizer with a learning rate of 0.001.

plt.plot(history.history["loss"],'r', label="Training Loss")
plt.plot(history.history["val_loss"],'b', label="Validation Loss")
plt.legend()

The graph shows the training loss (red) and validation loss (blue) over the training epochs. The x-axis represents the epoch number and the y-axis represents the loss. It helps determine whether the model is overfitting or underfitting.

If the training loss keeps decreasing while the validation loss increases or stagnates, the model is overfitting. If both the training and validation losses are high, the model is underfitting.

From the graph, the training and validation losses seem to be decreasing, which means the model is learning from the data.

Changing the number of epochs

model.compile(
    optimizer = Adam(lr=0.0001), 
    loss='categorical_crossentropy', 
    metrics=['accuracy'])
checkpointer = [EarlyStopping(monitor = 'val_accuracy', verbose = 1, 
                              restore_best_weights=True,mode="max",patience = 10),
                              ModelCheckpoint('best_model.h5',monitor="val_accuracy",verbose=1,
                              save_best_only=True,mode="max")]
history = model.fit(train_generator,
                    epochs=50,
                    batch_size=64,   
                    verbose=1,
                    callbacks=[checkpointer],
                    validation_data=val_generator)

The updated code trains the model again for 50 epochs, with the early-stopping callback now set to a patience of 10 epochs. The best model, judged by the highest validation accuracy, is saved as "best_model.h5".

The model will be compiled using the Adam optimizer with a learning rate of 0.0001 and categorical cross-entropy loss and accuracy as metrics. The training and validation generators were previously defined using data augmentation techniques.

loss = model.evaluate(X_test,y_test) 
print("Test Acc: " + str(loss[1]))
# Recompute predictions with the retrained model before building the confusion matrix
preds = model.predict(X_test)
y_pred = np.argmax(preds, axis=1)

CLASS_LABELS  = ['Anger', 'Disgust', 'Fear', 'Happy', 'Sadness', 'Surprise', 'Neutral']  # FER2013 label order 0-6
cm_data = confusion_matrix(np.argmax(y_test, axis = 1 ), y_pred)
cm = pd.DataFrame(cm_data, columns=CLASS_LABELS, index = CLASS_LABELS)
cm.index.name = 'Actual'
cm.columns.name = 'Predicted'
plt.figure(figsize = (20,10))
plt.title('Confusion Matrix', fontsize = 20)
sns.set(font_scale=1.2)
ax = sns.heatmap(cm, cbar=False, cmap="Blues", annot=True, annot_kws={"size": 16}, fmt='g')

☆ END ☆
