Artificial intelligence (PyTorch) building models 16: an application of hypertension prediction based on an LSTM+CNN model

Hello everyone, I am Weixue AI. Today I introduce part 16 of building models with artificial intelligence (PyTorch): an application of hypertension prediction based on an LSTM+CNN model, covering model construction and training. This project uses PyTorch to build an LSTM+CNN model for hypertension prediction. Hypertension is a common disease, and early prediction and intervention are crucial to prevent it from developing into a serious condition.

Table of contents

  1. Project Background
  2. LSTM-CNN model principle
  3. Data sample
  4. Data loading
  5. Model building
  6. Model training
  7. Model Prediction
  8. Summary

1. Project Background

Hypertension is a pressing global public health challenge: it accounts for one of the highest preventable disease burdens worldwide and is a major risk factor for cardiovascular disease. Timely, regular monitoring of blood pressure is essential for the early diagnosis and prevention of cardiovascular disease. Blood pressure fluctuates over time and is affected by many factors, such as stress, emotions, diet, exercise, and medication. Continuous monitoring, rather than measurements at isolated time points, is therefore of great value for the early detection and treatment of hypertension. This project uses a deep-learning LSTM-CNN model to predict hypertension by learning from patients' historical health data.

2. LSTM-CNN model principle

The LSTM-CNN model is a hybrid model that combines the strengths of the long short-term memory network (LSTM) and the convolutional neural network (CNN). The LSTM can process time-series data and learn long-term dependencies, while the CNN can extract useful local features. In hypertension prediction, the LSTM learns the temporal dependencies in patients' historical health data, while the CNN extracts useful features from those data.

The model's mathematical principles can be expressed as follows:

First, we define the input sequence as $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_T\}$, where $T$ is the length of the sequence. In the LSTM, the hidden state at each time step consists of a memory cell $\mathbf{c}_t$ and an output state $\mathbf{h}_t$.

The calculation process of the LSTM layer is as follows:

  1. Input gate: the input gate vector $\mathbf{i}_t$ controls how strongly the input at the current time step is written into memory. It is computed as
    $\mathbf{i}_t = \sigma(\mathbf{W}_i \mathbf{x}_t + \mathbf{U}_i \mathbf{h}_{t-1} + \mathbf{b}_i)$
    where $\mathbf{W}_i$, $\mathbf{U}_i$, and $\mathbf{b}_i$ are learnable parameters and $\sigma$ is the sigmoid function.

  2. Forget gate: the forget gate vector $\mathbf{f}_t$ controls how much of the previous memory is retained. It is computed as
    $\mathbf{f}_t = \sigma(\mathbf{W}_f \mathbf{x}_t + \mathbf{U}_f \mathbf{h}_{t-1} + \mathbf{b}_f)$
    where $\mathbf{W}_f$, $\mathbf{U}_f$, and $\mathbf{b}_f$ are learnable parameters.

  3. Memory update: the new memory cell $\mathbf{c}_t$ combines the retained old memory with the candidate input at the current step:
    $\mathbf{c}_t = \mathbf{f}_t \odot \mathbf{c}_{t-1} + \mathbf{i}_t \odot \tanh(\mathbf{W}_c \mathbf{x}_t + \mathbf{U}_c \mathbf{h}_{t-1} + \mathbf{b}_c)$
    where $\mathbf{W}_c$, $\mathbf{U}_c$, and $\mathbf{b}_c$ are learnable parameters and $\odot$ denotes element-wise multiplication.

  4. Output gate: the output gate vector $\mathbf{o}_t$ controls how much of the memory is exposed in the output at the current time step. It is computed as
    $\mathbf{o}_t = \sigma(\mathbf{W}_o \mathbf{x}_t + \mathbf{U}_o \mathbf{h}_{t-1} + \mathbf{b}_o)$
    where $\mathbf{W}_o$, $\mathbf{U}_o$, and $\mathbf{b}_o$ are learnable parameters.

Finally, the output state $\mathbf{h}_t$ of the LSTM is obtained from the memory cell $\mathbf{c}_t$ as
$\mathbf{h}_t = \mathbf{o}_t \odot \tanh(\mathbf{c}_t)$
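
To make the gate equations concrete, below is a minimal sketch of a single LSTM time step written directly from the formulas above. The function lstm_step and the toy dimensions are illustrative assumptions; in practice nn.LSTM implements all of this internally and far more efficiently.

import torch

def lstm_step(x_t, h_prev, c_prev, params):
    # One LSTM time step, following the gate equations above
    (W_i, U_i, b_i), (W_f, U_f, b_f), (W_c, U_c, b_c), (W_o, U_o, b_o) = params
    i_t = torch.sigmoid(W_i @ x_t + U_i @ h_prev + b_i)                    # input gate
    f_t = torch.sigmoid(W_f @ x_t + U_f @ h_prev + b_f)                    # forget gate
    c_t = f_t * c_prev + i_t * torch.tanh(W_c @ x_t + U_c @ h_prev + b_c)  # memory update
    o_t = torch.sigmoid(W_o @ x_t + U_o @ h_prev + b_o)                    # output gate
    h_t = o_t * torch.tanh(c_t)                                            # output state
    return h_t, c_t

# Toy dimensions for a quick check: 10 input features, hidden size 64
d_in, d_h = 10, 64
params = [(torch.randn(d_h, d_in) * 0.1, torch.randn(d_h, d_h) * 0.1, torch.zeros(d_h))
          for _ in range(4)]
h, c = torch.zeros(d_h), torch.zeros(d_h)
h, c = lstm_step(torch.randn(d_in), h, c, params)
print(h.shape, c.shape)  # torch.Size([64]) torch.Size([64])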

Next, the output states of the LSTM layer are used as the input to the CNN, which applies convolution and pooling operations; a fully connected layer then produces the final prediction or classification.

Feeding the LSTM layer's output into the CNN works as follows:

When the LSTM layer's output is used as the CNN's input, it is usually reshaped to match the input format the CNN expects. The specific steps are:

  1. First, assume the output of the LSTM layer has shape $(B, T, H)$, where $B$ is the batch size, $T$ is the sequence length, and $H$ is the hidden state dimension.

  2. Next, rearrange the output so that the hidden dimension becomes the channel dimension, i.e. transpose $(B, T, H)$ to $(B, H, T)$. In this layout each of the $H$ hidden units is a channel whose values across the $T$ time steps form a one-dimensional signal, which is the input format nn.Conv1d expects.

  3. Then, pass the rearranged output as input to the CNN model.

  4. In the CNN model, convolutional layers are used for feature extraction. A convolutional layer extracts local features according to parameters such as the number of kernels, the kernel size, and the stride.

  5. Next, a pooling layer commonly downsamples the output of the convolutional layer to reduce the feature dimensionality. Pooling can take the maximum value within each window (max pooling) or the average value (average pooling).

  6. Finally, the feature vector obtained after pooling is fed into the fully connected layer for the final prediction or classification task, as sketched below.
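A minimal sketch of this hand-off in PyTorch is shown below. The layer sizes, the kernel_size=3 convolution, and the adaptive max pooling are illustrative assumptions; the model built in section 5 uses kernel_size=1 and average pooling over time instead.

import torch
import torch.nn as nn

B, T, H = 4, 8, 64                       # batch size, sequence length, hidden size
lstm_out = torch.randn(B, T, H)          # stand-in for the LSTM layer output

conv = nn.Conv1d(in_channels=H, out_channels=128, kernel_size=3, padding=1)
pool = nn.AdaptiveMaxPool1d(1)           # max-pool over the time axis
fc = nn.Linear(128, 2)

x = lstm_out.transpose(1, 2)             # (B, T, H) -> (B, H, T): channels first
x = conv(x)                              # -> (B, 128, T): local features per time step
x = pool(x).squeeze(-1)                  # -> (B, 128): one feature vector per sample
logits = fc(x)                           # -> (B, 2): final classification scores
print(logits.shape)                      # torch.Size([4, 2])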

3. Data sample

The following are some sample rows of the Chinese hypertension CSV data (the column names are kept in Chinese because the code below refers to them):

id,年龄,性别,体重,身高,收缩压,舒张压,心率,血糖,血脂,是否高血压
1,45,男,75,175,120,80,70,5.6,1.2,否
2,50,男,80,180,130,85,72,6.0,1.3,是
3,55,女,65,165,110,70,68,5.2,1.1,否
4,35,女,60,160,110,70,75,4.8,1.0,否
5,42,男,78,173,125,82,68,5.2,1.1,是
6,58,男,85,177,140,90,80,6.5,1.4,是
7,47,女,62,165,115,75,72,5.3,1.2,否
8,52,男,79,179,128,84,70,5.9,1.3,是
9,43,女,66,162,112,73,70,5.5,1.1,否
10,50,男,83,176,125,82,75,6.2,1.2,是
11,37,女,64,163,110,70,68,5.0,1.0,否
12,49,男,76,178,130,85,72,5.8,1.2,是
13,57,男,88,183,145,92,80,6.8,1.5,是
14,41,女,63,164,112,73,70,5.4,1.1,否
15,55,男,82,175,127,83,75,6.1,1.3,是
16,38,女,61,158,108,68,67,4.9,1.0,否
17,53,男,80,181,132,87,74,6.0,1.4,是
18,46,女,67,167,114,75,72,5.1,1.2,否
19,48,男,77,180,128,84,70,5.7,1.2,是
20,60,男,90,185,150,95,78,7.0,1.6,是
21,39,女,59,156,106,66,65,4.7,0.9,否
22,54,男,81,178,130,85,72,6.0,1.3,是
23,44,女,68,168,115,76,73,5.2,1.1,否
...

4. Data loading

We use the pandas library to load the CSV data:

import pandas as pd

# Load the data
data = pd.read_csv('hypertension.csv')

# Preprocess: encode the categorical columns as integers
data['性别'] = data['性别'].map({'男': 0, '女': 1})
data['是否高血压'] = data['是否高血压'].map({'否': 0, '是': 1})

# Split into training and test sets (80% / 20%)
train_data = data.sample(frac=0.8, random_state=0)
test_data = data.drop(train_data.index)
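
As a quick, optional sanity check (a minimal sketch; the column names follow the sample CSV above), you can confirm the categorical columns were encoded and inspect the split sizes:

# The two mapped columns should now contain 0/1 integers
print(data[['性别', '是否高血压']].head())
print('train:', len(train_data), 'test:', len(test_data))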

5. Model building

We use PyTorch to build the LSTM-CNN model:

import torch
import torch.nn as nn

class LSTM_CNN(nn.Module):
    def __init__(self):
        super(LSTM_CNN, self).__init__()
        # 10 input features per time step (the CSV columns except the label)
        self.lstm = nn.LSTM(input_size=10, hidden_size=64, num_layers=2, batch_first=True)
        self.conv1 = nn.Conv1d(in_channels=64, out_channels=128, kernel_size=1)
        self.fc = nn.Linear(128, 2)

    def forward(self, x):
        # x: (batch, seq_len, 10)
        x, _ = self.lstm(x)        # -> (batch, seq_len, 64)
        x = x.transpose(1, 2)      # -> (batch, 64, seq_len): hidden dim becomes Conv1d channels
        x = self.conv1(x)          # -> (batch, 128, seq_len)
        x = torch.mean(x, dim=2)   # average-pool over the time axis -> (batch, 128)
        x = self.fc(x)             # -> (batch, 2) class scores
        return x
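
Before training, it helps to sanity-check the forward pass with a dummy batch (a minimal sketch; the shape (batch, seq_len=1, 10 features) matches how the data is fed to the model in section 6):

# Dummy batch: 4 samples, sequence length 1, 10 features
model = LSTM_CNN()
dummy = torch.randn(4, 1, 10)
print(model(dummy).shape)  # expected: torch.Size([4, 2])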

6. Model training

We train the model with the Adam optimizer and the cross-entropy loss function:

# Model training
model = LSTM_CNN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

# Each row is treated as a sequence of length 1: (batch, seq_len=1, 10 features)
inputs = torch.tensor(train_data.drop('是否高血压', axis=1).values).float().unsqueeze(1)
labels = torch.tensor(train_data['是否高血压'].values).long()

for epoch in range(100):
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print('Epoch [%d/100], Loss: %.4f' % (epoch + 1, loss.item()))

Running result:

Epoch [72/100], Loss: 0.0001
Epoch [73/100], Loss: 0.0001
Epoch [74/100], Loss: 0.0001
Epoch [75/100], Loss: 0.0001
Epoch [76/100], Loss: 0.0001
Epoch [77/100], Loss: 0.0001
Epoch [78/100], Loss: 0.0001
Epoch [79/100], Loss: 0.0001
Epoch [80/100], Loss: 0.0001
Epoch [81/100], Loss: 0.0001
Epoch [82/100], Loss: 0.0001
Epoch [83/100], Loss: 0.0001
Epoch [84/100], Loss: 0.0001
...

7. Model Prediction

After training is complete, we can feed in data to make predictions:

# Model prediction
inputs = torch.tensor(test_data.drop('是否高血压', axis=1).values).float().unsqueeze(1)
with torch.no_grad():
    outputs = model(inputs)
_, predicted = torch.max(outputs, 1)
print('Predicted: ', predicted)

Running result:

[[  1.   45.    0.   75.  175.  120.   80.   70.    5.6   1.2]
 [  4.   35.    1.   60.  160.  110.   70.   75.    4.8   1. ]
 [ 13.   57.    0.   88.  183.  145.   92.   80.    6.8   1.5]
 [ 16.   38.    1.   61.  158.  108.   68.   67.    4.9   1. ]]
Predicted:  tensor([0, 0, 1, 0])
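
To quantify the result instead of reading off individual predictions, the predicted classes can be compared against the held-out labels (a minimal sketch using the test_data split from section 4):

# Accuracy on the held-out test set
labels = torch.tensor(test_data['是否高血压'].values).long()
accuracy = (predicted == labels).float().mean().item()
print('Test accuracy: %.2f%%' % (accuracy * 100))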

8. Summary

Based on the LSTM-CNN model, this project predicts hypertension by learning from patients' historical health data. The method achieves high prediction accuracy and is of great significance for the early prediction of and intervention in hypertension.

Several main application areas of the LSTM-CNN model:

Text classification: the LSTM-CNN model can be used for text classification tasks such as sentiment analysis, spam filtering, and news categorization. The LSTM layer captures long-range dependencies in the sequence, while the CNN layer extracts local features; combining the two yields a richer representation of the text.

Named entity recognition (NER): LSTM-CNN models perform well on NER tasks. The LSTM layer learns contextual information and dependencies between entities, while the CNN layer captures the local features of entities, improving recognition accuracy.

Machine translation: LSTM-CNN models can be applied to machine translation, converting text from one language to another. The LSTM layer handles long-range dependencies between the input and output sequences, while the CNN layer extracts local features that help improve translation quality.

Text generation: the LSTM-CNN model can be used to generate natural language text such as sentences, paragraphs, or dialogue. Combining the LSTM's sequence-generation ability with the CNN's feature extraction, a trained model can produce text that is contextually coherent and grammatically correct.
