Using a convolutional neural network to build an image classification model to detect pneumonia

In this article, I will outline how to use convolutional neural networks to build reliable image classification models to detect the presence of pneumonia from chest x-ray images.

Pneumonia is a common infection that inflames the air sacs in the lungs, causing symptoms such as difficulty breathing and fever. Although pneumonia is not difficult to treat, prompt diagnosis is essential. Without proper treatment, pneumonia can be fatal, especially in children and the elderly. Chest x-rays are an affordable method of diagnosing pneumonia. The development of a model that can reliably classify pneumonia based on X-ray images can reduce the burden on doctors in areas with high demand.

Data

Kermany and his colleagues at the University of California, San Diego used deep learning on chest X-ray and optical coherence tomography images to identify diseases. We use the chest X-ray images provided in their study as our data set.

https://data.mendeley.com/datasets/rscbjbr9sj/3

Data structure

The structure of the data folder should look like the following.

DATA
│ 
├── train
│    ├── NORMAL
│    └── PNEUMONIA
│
├── test
│    ├── NORMAL
│    └── PNEUMONIA
│
└── validation
     ├── NORMAL
     └── PNEUMONIA

After deleting the image files that were not properly encoded, we have 5639 files in our data set. We use 15% of these images as the validation set and another 15% as the test set. Our final training set includes 1076 normal cases and 2873 pneumonia cases.
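The article does not show how the split was made, so the following is only a minimal sketch of one way to do a 70/15/15 split. The helper `split_files`, the seed, and the file names are hypothetical, and the exact counts can differ slightly from the ones reported above depending on rounding and on whether the split is done per class.

```python
import random

def split_files(files, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle file names and split them into train/validation/test lists."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_frac)
    n_test = round(len(shuffled) * test_frac)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test

# Hypothetical file names standing in for the real images
files = [f"img_{i:04d}.jpeg" for i in range(5639)]
train, val, test = split_files(files)
print(len(train), len(val), len(test))
```

Each list would then be copied into the matching folder (train, validation, test) under its class subfolder.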

Data exploration

Our exploratory data visualization shows that inflammation of the lungs often hinders the visibility of the heart and chest cavity, causing greater variability around the lungs.

Baseline model

As our baseline model, we will build a simple convolutional neural network. It receives the images after they have been resized to a square and all pixel values have been normalized to the range 0 to 1. The complete steps are shown below.

from tensorflow.keras.preprocessing import image
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.callbacks import EarlyStopping

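# paths to the image folders (assumed to follow the DATA structure shown above)
train_dir = 'DATA/train'
val_dir = 'DATA/validation'
test_dir = 'DATA/test'

# create generators that rescale and resize the images in a directory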
train_g = image.ImageDataGenerator(rescale = 1/255).flow_from_directory(train_dir,
                                                                  target_size = (256,256), 
                                                                  color_mode='grayscale',
                                                                  class_mode='binary')
val_g = image.ImageDataGenerator(rescale = 1/255).flow_from_directory(val_dir,
                                                                target_size = (256,256), 
                                                                color_mode='grayscale',
                                                                class_mode='binary')

# setting up the architecture
model = models.Sequential()
model.add(layers.Conv2D(filters = 32, kernel_size = 3, 
                        activation = 'relu', padding = 'same', 
                        input_shape=(256, 256, 1)))
model.add(layers.MaxPooling2D(pool_size = (2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation = 'relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# compiling the model
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),  # `lr` is deprecated
              metrics=['accuracy', 'Recall'])

# set up an early stopping callback to avoid overfitting
# stop if the validation loss has not improved for 5 epochs
cp = EarlyStopping(patience = 5, restore_best_weights=True)

# fitting the model
history = model.fit(train_g, # fit train generator
                    epochs=100, # it will be stopped before 100 epochs (early stopping)
                    validation_data = val_g, # use the assigned generator as a validation set
                    callbacks = [cp], # use cp as callback
                    verbose = 2 # report each epoch without progress bar
                   )
# evaluating the model
model.evaluate(val_g) # evaluate the best weights on the validation set

Now I will explain each step in detail.

Rescale data

keras.preprocessing.image.ImageDataGenerator() takes images and creates augmented data based on its parameters. Here we only ask it to rescale all pixel values to the range 0 to 1, without specifying any other augmentation parameters. The generator is used together with flow_from_directory, which reads the images from the directory in the specified format and yields the rescaled data.
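As a small illustration of what rescaling by 1/255 does to 8-bit pixel intensities, here is a plain-Python sketch. The helper `rescale` and the sample patch are made up for this example and are not part of Keras.

```python
def rescale(pixels, factor=1 / 255):
    """Multiply every pixel by `factor`, which is what rescale=1/255 asks for."""
    return [[value * factor for value in row] for row in pixels]

# A made-up 2 x 3 grayscale patch with 8-bit intensities (0 to 255)
patch = [[0, 51, 102],
         [153, 204, 255]]
scaled = rescale(patch)
for row in scaled:
    print([round(value, 2) for value in row])
```

Black (0) stays 0 and white (255) maps to 1, so every value ends up in the range the network expects.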

Build model architecture

keras.models.Sequential() starts a sequential model. This model will process the added layers in order.

Conv2D is a convolutional layer, which receives inputs and runs them through a specified number of filters. The kernel size refers to the size of each filter. In this example, every contiguous 3 * 3 group of pixels in each of our 256 * 256 * 1 images (the 1 is the number of channels: RGB images have 3 channels, grayscale images have 1) is passed through 32 filters, producing 32 feature maps of size 256 * 256 * 1 each.

padding = 'same' pads the border of the image with zeros so that the 3 * 3 window can also be centered on the edge pixels. Without it, the window would fit in only 254 positions along each side, and the feature maps would shrink from 256 * 256 to 254 * 254.

activation = 'relu' means that we set the activation function to ReLU. Simply put, we tell this layer to convert all negative values to 0.
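To make the effect of padding concrete, here is a short sketch that computes the output size of a convolution along one dimension, plus the number of trainable parameters in the first layer. The function names are my own; the formulas are the standard ones for 'valid' and 'same' padding.

```python
import math

def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")

def conv_param_count(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

print(conv_output_size(256, 3, padding="valid"))  # 254
print(conv_output_size(256, 3, padding="same"))   # 256
print(conv_param_count(3, 1, 32))                 # 320
```

With a stride of 1, 'same' padding keeps the 256 * 256 size, while 'valid' (no padding) would lose one pixel on each side.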
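ReLU itself is a one-line function. A minimal sketch with made-up input values:

```python
def relu(x):
    """Rectified linear unit: negatives become 0, positives pass through."""
    return max(0.0, x)

activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
print([relu(a) for a in activations])  # [0.0, 0.0, 0.0, 1.5, 3.0]
```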

Then, we feed the outputs of the convolutional layer into the pooling layer. MaxPooling2D downsamples the convolution output by keeping only the maximum value of each 2 * 2 block. Now we have 32 feature maps with a size of 128 * 128 * 1.

Now we need to reduce these 4-dimensional outputs (batch, height, width, feature maps) to a single number per image, which tells us whether the image is classified as pneumonia or normal. We first flatten the feature maps into a single dimension, and then pass the result through progressively smaller dense layers. The last layer uses a sigmoid function as its activation, because we want the model to output the probability that the image shows pneumonia.
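The pooling step can be sketched in plain Python on a tiny feature map. The function `max_pool_2x2` and the 4 * 4 input are made up for illustration; Keras does the same thing on every feature map at once.

```python
def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 7, 2],
        [3, 6, 4, 8]]
print(max_pool_2x2(fmap))  # [[4, 5], [6, 8]]
```

The 4 * 4 map becomes 2 * 2, just as each 256 * 256 map becomes 128 * 128.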
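A quick sketch of the numbers involved in this step: how many values Flatten produces, how many parameters the dense layers need, and what the sigmoid does. The helper names are my own.

```python
import math

def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_param_count(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

flat = 128 * 128 * 32                 # values per image after Flatten
print(flat)                           # 524288
print(dense_param_count(flat, 128))   # 67108992
print(dense_param_count(128, 1))      # 129
print(sigmoid(0.0))                   # 0.5
```

Almost all of the baseline model's parameters sit in the first dense layer, which is worth keeping in mind when the architecture is extended.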

Configuration

We have defined the architecture of the model. The next step is to decide what the model should optimize and how. Using model.compile, we tell the model to use gradient descent to minimize the binary cross-entropy loss (log loss, essentially the same loss that logistic regression minimizes). Here we use the RMSprop algorithm, which adapts the learning rate during optimization. In the later model, I use the AMSGrad variant of Adam, which performed better for our problem.
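For reference, binary cross-entropy can be written in a few lines of plain Python. This is a sketch of the standard formula, not Keras's implementation; the clipping constant `eps` and the sample predictions are my own choices.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean log loss over labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Same labels, two made-up sets of predictions
confident = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
hesitant = binary_cross_entropy([1, 0, 1], [0.6, 0.4, 0.5])
print(round(confident, 3), round(hesitant, 3))
```

Predictions that are both correct and confident give a lower loss, which is what the optimizer pushes the model toward.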

Fit data

Finally, we have finished building the model. Time to fit it to our training data! By default, the generator feeds the model batches of 32 images. We set up early stopping to prevent overfitting: if the validation loss does not improve for 5 consecutive epochs, training stops. I set restore_best_weights to True, so that the model is restored to its best-performing weights once training stops.

Validation and evaluation

Our first model predicts the class of the validation data with 94% accuracy and a loss of 0.11. The loss curves show that there is still room for improvement in the training loss, so we may increase the complexity of the model. In addition, the validation loss seems to hover around 0.1. We can try to improve generalization by adding more data through data augmentation.

Here is the complete code to draw the loss and accuracy graphs from the fitted model.
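The early stopping rule can be sketched as a small function over a list of validation losses. This is a simplified illustration of the idea, not the Keras callback itself, and the loss values are made up.

```python
def early_stopping(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once `patience` consecutive epochs pass without the
    validation loss dropping below the best value seen so far.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: the best value is reached at epoch 3
losses = [0.40, 0.25, 0.18, 0.11, 0.12, 0.13, 0.12, 0.14, 0.15, 0.16]
stop_epoch, best_epoch = early_stopping(losses)
print(stop_epoch, best_epoch)  # 8 3
```

Here training stops at epoch 8, and restoring the best weights would bring the model back to its state at epoch 3.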

import matplotlib.pyplot as plt
%matplotlib inline

def plot_performance(hist):
    ''' 
    takes the History object returned by model.fit as input
    plots accuracy and loss
    '''
    hist_ = hist.history
    epochs = hist.epoch
    
    plt.plot(epochs, hist_['accuracy'], label='Training Accuracy')
    plt.plot(epochs, hist_['val_accuracy'], label='Validation Accuracy')
    plt.title('Training and validation accuracy')
    plt.legend()
    
    plt.figure()
    plt.plot(epochs, hist_['loss'], label='Training loss')
    plt.plot(epochs, hist_['val_loss'], label='Validation loss')
    plt.title('Training and validation loss')
    plt.legend()
    
    plt.show()

Improve the model

Now, we will add data augmentation and make our model more complex.

# redefining training generator
data_aug_train = image.ImageDataGenerator(rescale = 1/255,
                                          # allow rotation within 15 degrees
                                          rotation_range = 15,
                                          # adjust range of brightness (1 = same)
                                          brightness_range = [0.9, 1.1],
                                          # allow shear by up to 5 degrees
                                          shear_range=5,
                                          # zoom range of [0.8, 1.2]
                                          zoom_range = 0.2)
# attach generator to the directory
train_g2 = data_aug_train.flow_from_directory(train_dir,
                                              target_size = (256,256), 
                                              color_mode='grayscale',
                                              class_mode='binary')

# define architecture
model = models.Sequential()
model.add(layers.Conv2D(32, 3, activation = 'relu', padding = 'same', input_shape=(256, 256, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, 3, activation = 'relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, 3, activation = 'relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(256, 3, activation = 'relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(512, 3, activation = 'relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(2048, activation = 'relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# configure
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.Adam(amsgrad = True),
              metrics=['accuracy'])

# train
history = model.fit(train_g2, 
                    epochs=100, # it won't run all 100
                    validation_data = val_g,
                    callbacks = [cp], 
                    verbose = 2
                   )
# evaluate
model.evaluate(val_g)

Data augmentation

This time, we added some parameters to the training image data generator. Our generator will now create new images for each batch by applying random rotation, brightness, shear, and zoom to the original images within the specified ranges.
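To get a feel for what "within the specified ranges" means, here is a rough sketch that draws one random set of transform parameters per image. It only illustrates the ranges configured above; the function `sample_augmentation` is hypothetical and is not how Keras implements the generator internally.

```python
import random

def sample_augmentation(rng, rotation_range=15, brightness_range=(0.9, 1.1),
                        shear_range=5, zoom_range=0.2):
    """Draw one random set of transform parameters within the given ranges."""
    return {
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        "brightness": rng.uniform(*brightness_range),
        "shear_deg": rng.uniform(-shear_range, shear_range),
        "zoom": rng.uniform(1 - zoom_range, 1 + zoom_range),
    }

rng = random.Random(0)
for _ in range(3):
    print(sample_augmentation(rng))
```

Because new values are drawn every time an image is loaded, the model rarely sees exactly the same picture twice.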

Model complexity

We also added four more sets of convolutional and pooling layers, for five in total, to increase the complexity of the model. It is recommended to increase the number of convolution filters as the network gets deeper. This is because as we move through these layers, we try to extract more information and therefore require a larger set of filters. This is similar to the way our brain processes visual information. As the signal moves from the retina to the optic chiasm, to the thalamus, to the primary visual cortex, and then through the inferior temporal cortex, the receptive field of the neurons becomes larger at each step and the neurons become more and more sensitive to complex information.
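The following sketch traces the feature-map shape through the five convolution and pooling blocks of the model above. The function name is my own; it relies only on the facts that padding = 'same' keeps the spatial size and 2 * 2 pooling halves it.

```python
def trace_shapes(input_size=256, filters=(32, 64, 128, 256, 512)):
    """Follow the shape through Conv2D('same') + MaxPooling2D(2) blocks."""
    size = input_size
    shapes = []
    for n_filters in filters:
        # the convolution keeps the spatial size; the pooling halves it
        size = size // 2
        shapes.append((size, size, n_filters))
    return shapes

shapes = trace_shapes()
for shape in shapes:
    print(shape)
# (128, 128, 32)
# (64, 64, 64)
# (32, 32, 128)
# (16, 16, 256)
# (8, 8, 512)

height, width, channels = shapes[-1]
print(height * width * channels)  # 32768 values enter the Dense(2048) layer
```

The spatial size shrinks while the number of feature maps grows, so the flattened input to the dense layer is much smaller than in the baseline model.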

Evaluation

Our second model reached 97.3% accuracy on the validation set with a loss of 0.075. It seems that our adjustments have indeed improved the model! Let's evaluate it on the test set to make sure it generalizes well to unseen data.

# create a test generator to apply rescaling
test_g = image.ImageDataGenerator(rescale = 1/255).flow_from_directory(test_dir,
                                                                target_size = (256,256), 
                                                                color_mode='grayscale',
                                                                class_mode='binary', 
                                                                shuffle=False)
# evaluate to get evaluation metrics
model.evaluate(test_g) 

# use predict to get the predicted probabilities
y_pred_prob = model.predict(test_g)
# threshold at 0.5 to get class labels (int() alone would truncate almost every probability to 0)
y_pred = (y_pred_prob.ravel() > 0.5).astype(int)

Our model predicts the class of X-ray images in the test set with an accuracy of 97.8%. 97.9% of pneumonia cases were successfully detected.
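Accuracy and the share of detected pneumonia cases (recall) can both be computed from the true and predicted labels. The sketch below uses made-up labels and hypothetical helper names; with the generator above, the true labels would presumably come from test_g.classes, where NORMAL is 0 and PNEUMONIA is 1.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 = pneumonia."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Share of all images that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    """Share of actual pneumonia cases that were detected."""
    return tp / (tp + fn)

# Made-up labels for eight images
y_true_demo = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred_demo = [1, 1, 0, 1, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true_demo, y_pred_demo)
print(accuracy(tp, tn, fp, fn))  # 0.75
print(recall(tp, fn))            # 0.8
```

For a screening task like this one, recall is the number to watch, since each false negative is a missed pneumonia case.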

Conclusion

Our model shows that, on our data set, a convolutional neural network can correctly detect nearly 98% of pneumonia cases. But for life-threatening medical problems in particular, even the 2% of cases that are missed should not simply be ignored.

Author: Eunjoo Byeon

deephub translation team

Origin blog.csdn.net/m0_46510245/article/details/108572379