Implementing an LSTM in PyTorch (with code)

Recently I have been working with the NASA PCoE IGBT accelerated aging data set. I wanted to write an LSTM model in PyTorch to predict the IGBT degradation state, hence this article.

Note: I won't cover the principles of LSTM here. There is plenty of material online; if you are not familiar with it, search for it (on Baidu, for example). This article focuses on the code implementation.

1. Introduction to the data set

This data set is the IGBT accelerated aging data set published by NASA PCoE Research Center. The data set contains IGBT accelerated aging data under four experimental conditions. The following are the experimental conditions:

(1) SMU data for new devices

This file contains a set of electrical characterization data under the original experimental conditions, covering 20 IGBTs of model IRG4BC30K and 20 MOSFETs of model IRG520Npbf. The parameters measured in the experiment are leakage current, breakdown voltage, and threshold voltage. The data set also contains turn-on and turn-off data for the IGBTs.

(2) Accelerated thermal aging experiment under DC gate voltage (Thermal Overstress Aging with DC at gate)

This file contains data for an IGBT of model IRF-G4BC30KD with a high voltage applied to the gate. An infrared sensor measures the temperature until the package temperature exceeds the limit, and then the gate supply is switched off. The experiments show that the device fails through latch-up (the parasitic PNPN structure turns on); the collector current is monitored during the failure process.

The results indicate that the change in collector current is the cause of the device failure. The measurements are taken from a single, standalone IGBT module. The parameters measured in the experiment are collector current, collector voltage, gate voltage, and package temperature. The experimental parameters are set as follows:

(3) Accelerated thermal aging experiment under square-wave gate voltage (Thermal Overstress Aging with Square Signal at gate)

This file contains the experimental data for accelerated thermal aging of an IGBT with a square-wave voltage applied to the gate. A square-wave signal with a frequency of 1 kHz, a duty cycle of 40%, and an amplitude of 0-8 V is applied to the gate of the power device IRG4BC30K, the package temperature is held at 260°C-270°C, and the device undergoes a sustained over-current, high-temperature aging experiment.

The data include the PWM temperature controller state, low-speed (steady-state) measurements, and high-resolution data capturing the transient behavior of the device during switching. A square-wave signal is applied to the gate to perform thermal cycling, and transient data are collected each time the device switches. The purpose of this experiment is to operate the device as a switch under conditions far outside its safe operating area.

The main parameters measured in the experiment are collector current, collector voltage, gate voltage, and package temperature. The data files record the collector current, collector voltage, gate voltage, gate current, package temperature, heat sink temperature, recording time, and other parameters while the device is operating.

(4) Accelerated thermal aging experiment under square-wave gate voltage, with source measurement unit (SMU) data (Thermal Overstress Aging with Square Signal at gate and SMU data)

In this experiment a square-wave signal is applied to the gate for thermal cycling, and transient data are collected when the device switches on and off.

The aging data contains three parameters: threshold voltage, breakdown voltage and leakage current.

The source measurement unit (SMU) characterization data cover four devices, with the package temperature controlled within a certain range. Because the switching speed of the device is difficult to control, some transient data (collector current, gate voltage, and collector-emitter voltage under high-speed measurement) were not captured at the rising and falling edges of the output waveform. The data acquisition system also lacks calibration points, so low-speed (filtered) data could not be measured.

The main parameters measured in the experiment are collector current, collector voltage, gate voltage, and package temperature. Because of several problems encountered while aging the devices, some transient data were lost, the collector current data show a drift of 600 mA, and the steady-state data are not very accurate.

I selected the peak collector-emitter voltage at turn-off (as shown in the figure below) as the feature parameter and used this data set to study prediction algorithms.

2. Data preprocessing

To improve the prediction accuracy, the data first go through outlier removal, smoothing, and standardization.
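The article does not say how outliers were detected. As a rough sketch only, the block below assumes a simple 3-sigma rule; the function name `remove_outliers_3sigma` and the threshold `k` are hypothetical and not taken from the original work.

```python
import numpy as np

def remove_outliers_3sigma(x, k=3.0):
    """Drop points farther than k standard deviations from the mean.

    ASSUMPTION: the 3-sigma rule is only one common choice; the original
    article does not state which outlier criterion was used.
    """
    x = np.asarray(x, dtype=np.float64)
    mean, std = x.mean(), x.std()
    keep = np.abs(x - mean) <= k * std  # boolean mask of points to keep
    return x[keep], keep

# Synthetic example: 20 ordinary readings and one obvious spike.
raw = np.array([1.0] * 20 + [100.0])
cleaned, mask = remove_outliers_3sigma(raw)
print(len(raw), len(cleaned))  # 21 20
```

Dropping points changes the spacing of the series, so for time-series work it may be preferable to replace flagged points by interpolation instead; that choice is left open here.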

The data are smoothed with the double (second-order) exponential smoothing algorithm:
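The smoothing formula itself is not reproduced in this text. The sketch below shows one standard form, Brown's double exponential smoothing, on the assumption that this is what is meant; the smoothing factor `alpha = 0.3` and the initialization with the first observation are illustrative choices, not values from the article.

```python
import numpy as np

def double_exponential_smoothing(x, alpha=0.3):
    """Brown's double (second-order) exponential smoothing of a 1-D series.

    ASSUMPTION: alpha and the initialisation are illustrative only.
    """
    x = np.asarray(x, dtype=np.float64)
    s1 = np.empty_like(x)  # first-order smoothed series
    s2 = np.empty_like(x)  # second-order smoothed series
    s1[0] = s2[0] = x[0]   # initialise both with the first observation
    for t in range(1, len(x)):
        s1[t] = alpha * x[t] + (1 - alpha) * s1[t - 1]
        s2[t] = alpha * s1[t] + (1 - alpha) * s2[t - 1]
    return 2 * s1 - s2     # level estimate a_t = 2*S1_t - S2_t

# Synthetic noisy ramp standing in for the measured peak voltages.
rng = np.random.default_rng(0)
raw = np.linspace(5.0, 8.0, 200) + rng.normal(0.0, 0.2, 200)
smooth = double_exponential_smoothing(raw, alpha=0.3)
print(raw.shape, smooth.shape)
```

A smaller `alpha` smooths more aggressively but reacts more slowly to real changes in the degradation trend.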

Use the formula shown in the figure below to standardize:
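The standardization formula is in a figure that is not reproduced here, so which variant the original used is not known. Purely as an illustration, the sketch below assumes z-score standardization (subtract the mean, divide by the standard deviation); the helper names are hypothetical.

```python
import numpy as np

def zscore_fit(x):
    """Return the mean and standard deviation of a series."""
    x = np.asarray(x, dtype=np.float64)
    return x.mean(), x.std()

def zscore_apply(x, mean, std):
    """Standardize x with previously computed statistics."""
    return (np.asarray(x, dtype=np.float64) - mean) / std

def zscore_invert(z, mean, std):
    """Map standardized values back to the original scale."""
    return np.asarray(z, dtype=np.float64) * std + mean

# ASSUMPTION: z-score is used here only because it is a common choice.
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
mean, std = zscore_fit(values)
z = zscore_apply(values, mean, std)
restored = zscore_invert(z, mean, std)
print(z.mean(), z.std())
```

Keeping `mean` and `std` around makes it possible to convert the network's predictions back to volts afterwards.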

Since the peak voltage values are discrete data points, they must be arranged into autocorrelated time-series samples before being fed to the LSTM network.

The sliding time window method is used to construct the data samples for network training, with the window size set to 5. For example:

Use [x_0, x_1, x_2, x_3, x_4] as a sample and x_5 as its label, then [x_1, x_2, x_3, x_4, x_5] as the next sample and x_6 as its label, and so on.
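The windowing described above can be sketched as follows. The function name `make_windows` is hypothetical; the article only shows the resulting data being read from `Data_x.csv` and `Data_y.csv`, not how those files were produced.

```python
import numpy as np

def make_windows(series, window=5):
    """Build (sample, label) pairs with a sliding window.

    Sample i is series[i : i + window]; its label is series[i + window].
    """
    series = np.asarray(series, dtype=np.float32)
    n = len(series) - window
    if n <= 0:
        raise ValueError("series must be longer than the window size")
    x = np.stack([series[i:i + window] for i in range(n)])  # shape (n, window)
    y = series[window:].reshape(-1, 1)                      # shape (n, 1)
    return x, y

series = np.arange(8, dtype=np.float32)  # stand-in for x_0 ... x_7
x, y = make_windows(series, window=5)
print(x.shape, y.shape)  # (3, 5) (3, 1)
print(x[0], y[0])        # [0. 1. 2. 3. 4.] [5.]
```

Each row of `x` then corresponds to one row of five values, and each row of `y` to the single value that follows it.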

3. Python code implementation

With the input sample size and learning rate held fixed (input_size = 5, output_size = 1, lr = 0.01), I tried many parameter settings and found that the best results came from an 80% training / 20% test split, hidden_size = 20, and one fully connected layer after the LSTM.
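As a quick sanity check of this configuration, the sketch below pushes a random tensor through `nn.LSTM` followed by `nn.Linear` and prints the shapes. The sequence length of 7 is arbitrary. With the default `batch_first=False`, `nn.LSTM` expects input of shape (seq_len, batch, input_size), so each 5-value window acts as one time step with 5 features.

```python
import torch
from torch import nn

torch.manual_seed(0)
seq_len, batch, input_size, hidden_size = 7, 1, 5, 20  # seq_len is arbitrary

lstm = nn.LSTM(input_size, hidden_size, num_layers=1)  # input: (seq_len, batch, input_size)
fc = nn.Linear(hidden_size, 1)                         # fully connected output layer

x = torch.randn(seq_len, batch, input_size)
out, (h_n, c_n) = lstm(x)  # out holds the hidden state at every time step
pred = fc(out)             # nn.Linear acts on the last dimension

print(out.shape)   # torch.Size([7, 1, 20])
print(h_n.shape)   # torch.Size([1, 1, 20])
print(pred.shape)  # torch.Size([7, 1, 1])
```

Because `nn.Linear` operates on the last dimension, flattening to (seq_len * batch, hidden_size) before the linear layer and reshaping afterwards gives the same result.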

import numpy as np
import torch
from torch import nn
import matplotlib.pyplot as plt
import pandas as pd
from torch.autograd import Variable

# Define LSTM Neural Networks
class LstmRNN(nn.Module):
    """
        Parameters:
        - input_size: feature size
        - hidden_size: number of hidden units
        - output_size: number of output
        - num_layers: layers of LSTM to stack
    """

    def __init__(self, input_size, hidden_size=1, output_size=1, num_layers=1):
        super().__init__()

        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)  # utilize the LSTM model in torch.nn
        self.linear1 = nn.Linear(hidden_size, output_size)  # fully connected layer

    def forward(self, _x):
        x, _ = self.lstm(_x)  # _x is input, size (seq_len, batch, input_size)
        s, b, h = x.shape  # x is output, size (seq_len, batch, hidden_size)
        x = x.view(s * b, h)
        x = self.linear1(x)
        x = x.view(s, b, -1)
        return x

if __name__ == '__main__':

    # check whether a GPU is available
    if torch.cuda.is_available():
        device = torch.device("cuda:0")
        print('Training on GPU.')
    else:
        device = torch.device("cpu")
        print('No GPU available, training on CPU.')

    # read the data and convert it to float32
    data_x = np.array(pd.read_csv('Data_x.csv', header=None)).astype('float32')
    data_y = np.array(pd.read_csv('Data_y.csv', header=None)).astype('float32')

    # split the data set
    data_len = len(data_x)
    t = np.linspace(0, data_len, data_len)

    train_data_ratio = 0.8  # Choose 80% of the data for training
    train_data_len = int(data_len * train_data_ratio)

    train_x = data_x[5:train_data_len]
    train_y = data_y[5:train_data_len]
    t_for_training = t[5:train_data_len]

    test_x = data_x[train_data_len:]
    test_y = data_y[train_data_len:]
    t_for_testing = t[train_data_len:]

    # ----------------- train -------------------
    INPUT_FEATURES_NUM = 5
    OUTPUT_FEATURES_NUM = 1
    train_x_tensor = train_x.reshape(-1, 1, INPUT_FEATURES_NUM)  # set batch size to 1
    train_y_tensor = train_y.reshape(-1, 1, OUTPUT_FEATURES_NUM)  # set batch size to 1

    # transfer data to pytorch tensor
    train_x_tensor = torch.from_numpy(train_x_tensor)
    train_y_tensor = torch.from_numpy(train_y_tensor)

    lstm_model = LstmRNN(INPUT_FEATURES_NUM, 20, output_size=OUTPUT_FEATURES_NUM, num_layers=1)  # 20 hidden units
    lstm_model = lstm_model.to(device)  # the model must be on the same device as the data
    print('LSTM model:', lstm_model)
    print('train x tensor dimension:', train_x_tensor.size())

    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(lstm_model.parameters(), lr=1e-2)

    prev_loss = 1000
    max_epochs = 2000

    # move inputs and labels to the same device as the model
    train_x_tensor = train_x_tensor.to(device)
    train_y_tensor = train_y_tensor.to(device)

    for epoch in range(max_epochs):
        output = lstm_model(train_x_tensor)
        loss = criterion(output, train_y_tensor)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if loss.item() < prev_loss:
            torch.save(lstm_model.state_dict(), 'lstm_model.pt')  # save model parameters to file
            prev_loss = loss.item()  # store a float, not a tensor that keeps the graph alive

        if loss.item() < 1e-4:
            print('Epoch [{}/{}], Loss: {:.5f}'.format(epoch + 1, max_epochs, loss.item()))
            print("The loss value is reached")
            break
        elif (epoch + 1) % 100 == 0:
            print('Epoch: [{}/{}], Loss:{:.5f}'.format(epoch + 1, max_epochs, loss.item()))

    # prediction on training dataset (no gradients needed)
    with torch.no_grad():
        pred_y_for_train = lstm_model(train_x_tensor)
    pred_y_for_train = pred_y_for_train.view(-1, OUTPUT_FEATURES_NUM).cpu().numpy()

    # ----------------- test -------------------
    lstm_model = lstm_model.eval()  # switch to evaluation mode

    # prediction on test dataset
    test_x_tensor = test_x.reshape(-1, 1, INPUT_FEATURES_NUM)
    test_x_tensor = torch.from_numpy(test_x_tensor)  # convert to tensor
    test_x_tensor = test_x_tensor.to(device)

    with torch.no_grad():
        pred_y_for_test = lstm_model(test_x_tensor)
    pred_y_for_test = pred_y_for_test.view(-1, OUTPUT_FEATURES_NUM).cpu().numpy()

    loss = criterion(torch.from_numpy(pred_y_for_test), torch.from_numpy(test_y))
    print("test loss:", loss.item())

    # ----------------- plot -------------------
    plt.figure()
    plt.plot(t_for_training, train_y, 'b', label='y_trn')
    plt.plot(t_for_training, pred_y_for_train, 'y--', label='pre_trn')

    plt.plot(t_for_testing, test_y, 'k', label='y_tst')
    plt.plot(t_for_testing, pred_y_for_test, 'm--', label='pre_tst')

    plt.xlabel('t')
    plt.ylabel('Vce')
    plt.legend()  # show the labels passed to plt.plot
    plt.show()

The result is shown in the figure below:

The blue line shows the true values of the training set, and the dashed yellow line shows the predicted values for the training set.

The black line shows the true values of the test set, and the dashed magenta line shows the predicted values for the test set.

test_loss=0.004276850726
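Note that the script saves the lowest-loss parameters to `lstm_model.pt` during training but reports the test loss with the weights from the final epoch. If the saved checkpoint is wanted for later inference, it could be reloaded roughly as follows. This is a minimal sketch: the model here is untrained and written to a temporary directory purely to demonstrate the save/load round trip, and the compact `LstmRNN` is a stand-in with the same layer names as the class above.

```python
import os
import tempfile

import torch
from torch import nn

class LstmRNN(nn.Module):
    """Compact stand-in with the same layer names as the article's model."""

    def __init__(self, input_size, hidden_size=1, output_size=1, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)
        self.linear1 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.lstm(x)     # (seq_len, batch, hidden_size)
        return self.linear1(out)  # (seq_len, batch, output_size)

torch.manual_seed(0)
trained = LstmRNN(5, 20)  # stands in for the model after training
path = os.path.join(tempfile.mkdtemp(), 'lstm_model.pt')
torch.save(trained.state_dict(), path)

# Rebuild the architecture with the same sizes, then load the parameters.
restored = LstmRNN(5, 20)
restored.load_state_dict(torch.load(path, map_location='cpu'))
restored.eval()

x = torch.randn(4, 1, 5)
with torch.no_grad():
    same = torch.allclose(trained(x), restored(x))
print(same)  # True
```

The constructor arguments must match those used when the checkpoint was saved, since `state_dict` stores only the parameter tensors and not the architecture.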

 

References:

https://zhuanlan.zhihu.com/p/104475016

Research on IGBT failure prediction based on deep learning-Han Henggui-Master's Thesis of Beijing Jiaotong University

 

Origin blog.csdn.net/ting_qifengl/article/details/113039454