Recognizing two-dimensional ECG signals with the ResNet50 model, using the MIT-BIH Arrhythmia Database as an example


With not much going on during the three-day Mid-Autumn Festival holiday, I decided to try out an idea I had earlier: converting one-dimensional signals into two-dimensional images. The dataset used in this post is the MIT-BIH Arrhythmia Database, which I used in my undergraduate graduation project; the data left over from that project has already been denoised and normalized, which saves me those preprocessing steps. However, because of the limits of my computer, I could not use all of the data, so the final results are fairly poor and not very convincing. This post is meant only as a learning exercise in converting one-dimensional signals to two dimensions and calling a classic model.

Converting one-dimensional signals to two dimensions

Although classification algorithms for one-dimensional signals are fairly mature, image recognition is one of the most popular areas of deep learning, and its algorithms are clearly more mature still. We can therefore convert one-dimensional signals into two-dimensional images and classify them with image recognition algorithms. The paper "Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks" proposes several ways to do this.

Gramian angular field

Implementation steps

1. Normalize: rescale the time series from the Cartesian coordinate system to [0, 1] or [-1, 1];
2. Convert the rescaled series to polar coordinates: treat each value as the cosine of an angle and its timestamp as the radius.
The Gramian angular field maps a time series from Cartesian to polar coordinates, which makes it relatively easy to understand. It is used in the same way as the Markov transition field described below, so I do not show it here.
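
The two steps above can be sketched in NumPy. This is only a minimal illustration, not the code used for this post: the function name is made up, the input is assumed to be non-constant, and the last line (taking the cosine of pairwise angle sums, the "summation" variant) comes from the cited paper rather than from the steps listed here.

```python
import numpy as np

def gramian_angular_field(x):
    """Hypothetical helper: summation variant of the Gramian angular field."""
    x = np.asarray(x, dtype=float)
    # Step 1: rescale the series to [-1, 1] (assumes x is not constant).
    scaled = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)  # guard against rounding error
    # Step 2: polar encoding, each value is the cosine of an angle.
    # The radius would be the timestamp; it is only needed for polar plots.
    phi = np.arccos(scaled)
    # Build the image: entry (i, j) is cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])

signal = np.sin(np.linspace(0, 2 * np.pi, 50))
gaf = gramian_angular_field(signal)
print(gaf.shape)
```

A series of length n becomes an n x n image, and the image is symmetric because the angle sum does not depend on the order of i and j.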

Markov transition field

Implementation steps

1. Given a time series X, divide its value range into Q quantile bins. Each value x_i is assigned to its corresponding bin q_j;
2. Construct the transition matrix W of size [Q, Q], where W[i,j] is the frequency with which a point in bin q_i is immediately followed by a point in bin q_j;
3. Construct the Markov transition field M, where entry M[i,j] is the transition probability in W from the bin containing x_i to the bin containing x_j.
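
The three steps can be sketched in plain NumPy as follows. This is a rough illustration under my own assumptions, not the pyts implementation: the function name and the n_bins parameter are made up, and normalizing each row of W so that it sums to 1 is one possible convention.

```python
import numpy as np

def markov_transition_field(x, n_bins=8):
    """Hypothetical helper: build an MTF image from a 1D series."""
    x = np.asarray(x, dtype=float)
    # Step 1: interior quantile edges, then a bin index in 0..n_bins-1 per sample.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)
    # Step 2: count transitions between consecutive samples.
    W = np.zeros((n_bins, n_bins))
    np.add.at(W, (bins[:-1], bins[1:]), 1)
    # Normalize each row to sum to 1 (assumed convention); empty rows stay 0.
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    # Step 3: M[i, j] is the transition probability between the bins of x_i and x_j.
    return W[bins[:, None], bins[None, :]]

signal = np.sin(np.linspace(0, 4 * np.pi, 300))
mtf_image = markov_transition_field(signal, n_bins=8)
print(mtf_image.shape)
```

A series of 300 samples yields a 300 x 300 image, which matches the input size the model expects later in this post.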

Code

The Markov transition field is hard to grasp conceptually, but it is simple to implement in code. Just call the API provided by pyts:

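import numpy as np
from pyts.image import MarkovTransitionField

# `data` is the one-dimensional sequence; I deliberately leave it undefined
# here to avoid misleading anyone.
# Note: pyts expects a 2D array of shape (n_samples, n_timestamps).
x = data
mtf = MarkovTransitionField()
image = mtf.fit_transform(x)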

That is all it takes. Below are the two-dimensional images I converted from the ECG signals.
Category N:
[Figure: MTF image of a category N heartbeat]
Category S:
[Figure: MTF image of a category S heartbeat]
Category V:
[Figure: MTF image of a category V heartbeat]
The paper also mentions another method, the recurrence plot. I do not understand it well enough yet, so I will leave it out for now; interested readers can consult the original paper.
Original link: https://arxiv.org/pdf/1506.00327v1.pdf

ResNet50 model introduction

For the basic concepts of residual networks, see my previous blog post; I will not go into detail here. The code below is only meant for learning how to use a ready-made model. ResNet50 is the smallest and most basic of the ResNet models that Keras provides, and calling it is very simple. Note that the code in this post passes weights=None, so only the architecture is reused and the network is trained from scratch. The Keras documentation gives the following signatures:

keras.applications.resnet.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet.ResNet101(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet.ResNet152(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet_v2.ResNet50V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet_v2.ResNet101V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
keras.applications.resnet_v2.ResNet152V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

The structure of the model is shown in the figure below:
[Figure: ResNet50 model structure]

Code

Let’s return to the topic and use this model to identify two-dimensional ECG signals:

import os

import numpy as np
import seaborn
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix
from tensorflow import keras

tf.compat.v1.disable_eager_execution()

# Path of the saved model. Placeholder: the original code never defines it.
model_path = 'resnet50_ecg.h5'

# Confusion matrix
def plotHeatMap(Y_test, Y_pred):
    con_mat = confusion_matrix(Y_test, Y_pred)
    # Normalization (disabled)
    # con_mat_norm = con_mat.astype('float') / con_mat.sum(axis=1)[:, np.newaxis]
    # con_mat_norm = np.around(con_mat_norm, decimals=2)
    ecgClassSet = ['N', 'S', 'V', 'F']
    # Plot
    plt.figure(figsize=(8, 8))
    seaborn.heatmap(con_mat, annot=True, fmt='.20g', cmap='Blues')
    plt.xlim(0,4)
    plt.ylim(0,4)
    plt.xticks([0.5,1.5,2.5,3.5],ecgClassSet)
    plt.yticks([0.5,1.5,2.5,3.5],ecgClassSet)
    plt.xlabel('Predicted labels')
    plt.ylabel('True labels')
    plt.show()

def builtModel():
    # weights=None: use the ResNet50 architecture only, trained from scratch
    Input = tf.keras.layers.Input(shape=(300,300,3))
    model = keras.applications.resnet.ResNet50(include_top=True, weights=None, input_tensor=Input,classes=4)

    return model

def main():

    # Load the single-channel images and copy them into three channels
    X_train = np.load('E:/1dto2d/1dto2d_data/X_train.npy')
    X_train = np.expand_dims(X_train, axis=-1)
    X_train = np.concatenate((X_train,X_train,X_train),axis=-1)
    Y_train = np.load('E:/1dto2d/1dto2d_data/train_datay.npy')
    # One-hot encode; num_classes=4 matches the four-class output layer
    Y_train = tf.keras.utils.to_categorical(Y_train, num_classes=4)

    if os.path.exists(model_path):
        # Load the trained model
        model = tf.keras.models.load_model(filepath=model_path)
        #model.summary()
    else:
        # Build the CNN model
        model = builtModel()
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.999, amsgrad=False),
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])
        #model.summary()
        # Training and validation
        model.fit(X_train, Y_train, epochs=10,
                  batch_size=10,
                  validation_split=0.1)
        # Save the model so that later runs can take the loading branch above
        model.save(model_path)

    # Prediction
    X_test = np.load('E:/1dto2d/1dto2d_data/X_test.npy')
    X_test = np.expand_dims(X_test, axis=-1)
    X_test = np.concatenate((X_test,X_test,X_test),axis=-1)
    Y_test = np.load('E:/1dto2d/1dto2d_data/Y_test.npy')
    Y_pred = np.argmax(model.predict(X_test), axis=1)
    # Plot the confusion matrix
    plotHeatMap(Y_test, Y_pred)

if __name__ == '__main__':
    main()

I have pasted all the code together here. If possible, it is better to run it in a Jupyter notebook.
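
The main() function copies each single-channel image into three channels because the model input is declared as (300, 300, 3). The small sketch below shows what that step does to the array shape and that np.repeat gives the same result. The random array is only a stand-in for X_train, and its shape (N, 300, 300) is my assumption about what the .npy file holds.

```python
import numpy as np

# Stand-in for X_train: 4 random single-channel images (assumed shape).
gray = np.random.rand(4, 300, 300)

# Same approach as in main(): add a channel axis, then concatenate three copies.
stacked = np.expand_dims(gray, axis=-1)
stacked = np.concatenate((stacked, stacked, stacked), axis=-1)

# Equivalent one-liner using np.repeat.
repeated = np.repeat(gray[..., np.newaxis], 3, axis=-1)

print(stacked.shape)
```

Either form works; the three channels carry identical data, so this only satisfies the input shape and adds no information.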

Summary

Problem 1: The amount of data is too large

My code uses a training set of 1,000 samples, far fewer than the more than 60,000 in the original data. The test set has 250 images, against more than 30,000 in the original data. Converting all of the data into two-dimensional images would require about 60 GB of memory, which an ordinary laptop is unlikely to have, especially since the data must be read in during training. This can of course be solved, but it would require major changes to the code and take a very long time.
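
One way to arrive at a figure of this order is a back-of-the-envelope estimate. The numbers below are assumptions: the sample counts are rounded down from "more than 60,000" and "more than 30,000", the image size is taken from the 300 x 300 model input, and each pixel is assumed to be a float64 value in a single channel.

```python
n_train = 60_000      # rounded down from "more than 60,000"
n_test = 30_000       # rounded down from "more than 30,000"
side = 300            # image side length, from the model input shape
bytes_per_value = 8   # assumption: float64, NumPy's default float type

total_bytes = (n_train + n_test) * side * side * bytes_per_value
total_gb = total_bytes / 1e9
print(f"{total_gb:.1f} GB")  # 64.8 GB
```

Under the same assumptions, storing the images as float32 would halve this figure, which is still far more than a typical laptop can hold in memory.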

Problem 2: The amount of training parameters is too large

Because my goal was to learn how to use pre-trained models and mature architectures, I did not use the model I built in my previous blog post. The downside is the very large number of trainable parameters, about 23 million. For ECG classification, and for the MIT-BIH dataset in particular, a parameter count this large is clearly excessive. From what I learned while working on my graduation thesis, a few hundred thousand parameters are enough to get very good results.

Training results

The following is the training process for ten epochs:

Epoch 1/10
900/900 [==============================] - 163s 181ms/sample - loss: 2.7384 - accuracy: 0.3611 - val_loss: 1933666.1375 - val_accuracy: 0.3100
Epoch 2/10
900/900 [==============================] - 149s 165ms/sample - loss: 1.2458 - accuracy: 0.4456 - val_loss: 911.5526 - val_accuracy: 0.3100
Epoch 3/10
900/900 [==============================] - 140s 155ms/sample - loss: 1.1677 - accuracy: 0.4789 - val_loss: 104.8433 - val_accuracy: 0.2700
Epoch 4/10
900/900 [==============================] - 139s 155ms/sample - loss: 1.1441 - accuracy: 0.5222 - val_loss: 14.3243 - val_accuracy: 0.2700
Epoch 5/10
900/900 [==============================] - 139s 154ms/sample - loss: 1.0364 - accuracy: 0.5711 - val_loss: 1.3777 - val_accuracy: 0.4300
Epoch 6/10
900/900 [==============================] - 141s 157ms/sample - loss: 0.9942 - accuracy: 0.5800 - val_loss: 2.5097 - val_accuracy: 0.2300
Epoch 7/10
900/900 [==============================] - 141s 157ms/sample - loss: 1.0099 - accuracy: 0.5833 - val_loss: 118.1422 - val_accuracy: 0.3100
Epoch 8/10
900/900 [==============================] - 141s 156ms/sample - loss: 0.9741 - accuracy: 0.6122 - val_loss: 1.0247 - val_accuracy: 0.6200
Epoch 9/10
900/900 [==============================] - 141s 156ms/sample - loss: 0.9349 - accuracy: 0.6233 - val_loss: 3.9687 - val_accuracy: 0.3500
Epoch 10/10
900/900 [==============================] - 141s 156ms/sample - loss: 0.8174 - accuracy: 0.6544 - val_loss: 1.6731 - val_accuracy: 0.5500

The model from the 8th epoch is the best of the ten, but results like these clearly cannot be used in practice. Still, reaching this level in ten epochs with fewer than one-sixtieth of the original training samples, and with plenty of room left to improve accuracy, is enough to suggest that the method is feasible.
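
The choice of the 8th epoch can be checked directly from the log. The short sketch below copies the val_accuracy and val_loss values from the training output and picks the best epoch by each metric; the variable names are my own.

```python
# Validation metrics copied from the ten-epoch training log.
val_accuracy = [0.3100, 0.3100, 0.2700, 0.2700, 0.4300,
                0.2300, 0.3100, 0.6200, 0.3500, 0.5500]
val_loss = [1933666.1375, 911.5526, 104.8433, 14.3243, 1.3777,
            2.5097, 118.1422, 1.0247, 3.9687, 1.6731]

# Epochs are numbered from 1, list indices from 0.
best_by_accuracy = max(range(len(val_accuracy)), key=lambda i: val_accuracy[i]) + 1
best_by_loss = min(range(len(val_loss)), key=lambda i: val_loss[i]) + 1

print(best_by_accuracy, best_by_loss)  # 8 8
```

Both metrics agree on epoch 8, although the large swings in val_loss from one epoch to the next show that the validation results are not stable.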

Outlook

In fact, I originally wanted to find a deep learning method for turning one-dimensional signals into two dimensions, trained in a way similar to an autoencoder. Images obtained that way would probably contain far less data (with the Markov transition field, the size of a single sample grows by a factor of 300). But I searched and found no similar papers or blog posts, so I had to settle for this compromise.

Origin blog.csdn.net/qq_44725872/article/details/126807700