Temporal Convolutional Network (TCN) --- From "Abaaba" to "Balabala"


  • The concept of TCN (why it exists and what problems it solves)
  • The origin of TCN
  • How TCN works (the principle)
  • On to the code!

1. What is a TCN (temporal convolutional network) and what can it do

  • Main application areas:

Time series forecasting, probabilistic forecasting, time prediction, and traffic forecasting.

2. The origin of TCN

ps: Before diving into TCN, you should already have a basic understanding of CNNs and RNNs.

  • Problems it solves:

TCN is a network architecture for processing time series data. Under certain conditions it performs better than traditional neural networks (RNN, CNN, etc.).

3. Introduction to the principle of TCN

[Figure: TCN network structure]

1. The TCN network structure is shown in the figure above. We'll split it into two parts, left and right, and start with the left side:

Dilated Causal Conv ---> WeightNorm ---> ReLU ---> Dropout ---> Dilated Causal Conv ---> WeightNorm ---> ReLU ---> Dropout

Clearly this can be written as

(Dilated Causal Conv ---> WeightNorm ---> ReLU ---> Dropout) * 2

OK, let's go through these four components one by one; if you already know some of them, feel free to skip ahead.

1、Dilated Causal Conv

Also known as dilated (expansion) causal convolution.

Dilated causal convolution can be broken down into three parts: dilation, causality, and convolution.

Convolution here is the same as in a CNN: a kernel sliding over the data.

Dilation means the convolution samples its input at intervals, i.e. the kernel's taps skip positions. This is similar in spirit to stride in a convolutional network, but clearly different: stride moves the whole kernel further per step, while dilation inserts gaps between the positions the kernel reads.

Illustration:

[Figure: illustration of dilated convolution]

Causality means that the value at time t in layer i depends only on the values at time t and earlier in layer i-1. A causal convolution never reads future data during training, making it a strictly time-constrained model.

Illustration:

[Figure: illustration of causal convolution]

(ps: the figure shows plain causal convolution, without dilation)

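To make this concrete, here is a minimal sketch (using PaddlePaddle, the same framework as the full code below; the tensor sizes are just illustrative) of how a dilated causal convolution can be built from an ordinary Conv1D: pad both ends by (kernel_size - 1) * dilation and then cut the extra outputs off the right, so position t never sees anything after t.

import paddle
import paddle.nn as nn

kernel_size, dilation = 3, 2
pad = (kernel_size - 1) * dilation              # 4 extra positions on each side

conv = nn.Conv1D(in_channels=1, out_channels=1,
                 kernel_size=kernel_size, padding=pad, dilation=dilation)

x = paddle.randn([1, 1, 10])                    # [batch, channels, time]
y = conv(x)[:, :, :-pad]                        # chop off the right side -> causal output
print(y.shape)                                  # [1, 1, 10], same length as the input
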
2、WeightNorm

weight normalization

Normalizes a layer's weight vectors by reparameterizing each weight as a magnitude multiplied by a unit direction. If you want to study the normalization process and formula in detail, the original Weight Normalization paper (Salimans & Kingma, 2016) is a good reference.

Advantages:

1. Small time overhead; fast to compute.

2. Introduces less noise.

3. WeightNorm accelerates training by rewriting (reparameterizing) the weights of the deep network, without introducing any dependence on the minibatch.
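In PaddlePaddle this is the one-line weight_norm wrapper used later in this post; a small sketch of how a layer gets wrapped:

import paddle.nn as nn
from paddle.nn.utils import weight_norm

# Reparameterize the layer's weight as a magnitude times a direction (w = g * v / ||v||),
# so optimization works on g and v instead of w directly.
conv = weight_norm(nn.Conv1D(16, 32, kernel_size=3))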

3、ReLU()

A commonly used activation function.

Advantages:

1. Speeds up network training.

2. Adds nonlinearity to the network and improves the model's expressive power.

3. Helps prevent vanishing gradients.

4. Makes the network's activations sparse, etc.

Formula:

ReLU(x) = max(0, x)

[Figure: plot of the ReLU function]

4、Dropout()

Dropout means that during training of a deep network, each neural network unit is temporarily dropped from the network with a certain probability.

Advantages: helps prevent overfitting and speeds up the model's computation.
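A tiny demo of both ReLU and Dropout in PaddlePaddle (a sketch with made-up numbers; note that Dropout only drops units in training mode and is an identity at inference):

import paddle
import paddle.nn as nn

x = paddle.to_tensor([[-2.0, -0.5, 0.0, 1.5, 3.0]])
print(nn.ReLU()(x))          # negatives are clipped to 0: [[0., 0., 0., 1.5, 3.0]]

drop = nn.Dropout(p=0.2)     # each unit is zeroed with probability 0.2 during training
drop.train()
print(drop(x))               # surviving values are rescaled by 1 / (1 - p)
drop.eval()
print(drop(x))               # identity at inference time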

Second, the right side: the residual connection.

On the right is a 1x1 convolution block. It gives the network a skip path for passing information across layers and, when the input and output channel counts differ, it makes their shapes consistent so the two can be added.
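A minimal sketch of that residual branch (PaddlePaddle, with made-up shapes; the 1x1 convolution is only needed when the input and output channel counts differ, which is exactly what downsample does in the TemporalBlock code below):

import paddle
import paddle.nn as nn

in_ch, out_ch = 16, 32
x = paddle.randn([8, in_ch, 100])               # [batch, channels, time]
branch_out = paddle.randn([8, out_ch, 100])     # stand-in for the convolutional branch output

downsample = nn.Conv1D(in_ch, out_ch, kernel_size=1)   # 1x1 conv matches the channel count
y = nn.ReLU()(branch_out + downsample(x))       # residual addition is now shape-consistent
print(y.shape)                                  # [8, 32, 100]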

3. Advantages of TCN:

1. Parallelism.

2. Largely avoids vanishing and exploding gradients.

3. A flexible receptive field: the deeper the network and the larger the dilations, the larger the receptive field and the more history the model can learn from (a quick calculation follows below).
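A quick calculation of that receptive field (under the usual TCN setup assumed by the code below: two dilated convolutions per residual block and dilation 2^i at level i, so the field grows exponentially with depth):

def tcn_receptive_field(kernel_size: int, num_levels: int) -> int:
    # two convs per block, dilation doubling each level: 1 + 2*(k-1)*(2^L - 1)
    return 1 + 2 * (kernel_size - 1) * (2 ** num_levels - 1)

print(tcn_receptive_field(kernel_size=3, num_levels=4))   # 61 time steps of history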

4. Coding from scratch

import os
import sys
import paddle
import paddle.nn as nn
import numpy as np
import pandas as pd
import seaborn as sns
from pylab import rcParams
import matplotlib.pyplot as plt
from matplotlib import rc
import paddle.nn.functional as F
from paddle.nn.utils import weight_norm
from sklearn.preprocessing import MinMaxScaler
from pandas.plotting import register_matplotlib_converters
# from sourceCode import TimeSeriesNetwork  # not needed: TimeSeriesNetwork is defined later in this file
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "../..")))

class Chomp1d(nn.Layer):
    def __init__(self, chomp_size):
        super(Chomp1d, self).__init__()
        self.chomp_size = chomp_size

    def forward(self, x):
        return x[:, :, :-self.chomp_size]


class TemporalBlock(nn.Layer):
    def __init__(self,
                 n_inputs,
                 n_outputs,
                 kernel_size,
                 stride,
                 dilation,
                 padding,
                 dropout=0.2):
        super(TemporalBlock, self).__init__()
        self.conv1 = weight_norm(
            nn.Conv1D(
                n_inputs,
                n_outputs,
                kernel_size,
                stride=stride,
                padding=padding,
                dilation=dilation))
        # Chomp1d is used to make sure the network is causal.
        # We pad by (k-1)*d on the two sides of the input for convolution,
        # and then use Chomp1d to remove the (k-1)*d output elements on the right.
        self.chomp1 = Chomp1d(padding)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout(dropout)

        self.conv2 = weight_norm(
            nn.Conv1D(
                n_outputs,
                n_outputs,
                kernel_size,
                stride=stride,
                padding=padding,
                dilation=dilation))
        self.chomp2 = Chomp1d(padding)
        self.relu2 = nn.ReLU()
        self.dropout2 = nn.Dropout(dropout)

        self.net = nn.Sequential(self.conv1, self.chomp1, self.relu1,
                                 self.dropout1, self.conv2, self.chomp2,
                                 self.relu2, self.dropout2)
        self.downsample = nn.Conv1D(n_inputs, n_outputs,
                                    1) if n_inputs != n_outputs else None
        self.relu = nn.ReLU()
        self.init_weights()

    def init_weights(self):
        self.conv1.weight.set_value(
            paddle.tensor.normal(0.0, 0.01, self.conv1.weight.shape))
        self.conv2.weight.set_value(
            paddle.tensor.normal(0.0, 0.01, self.conv2.weight.shape))
        if self.downsample is not None:
            self.downsample.weight.set_value(
                paddle.tensor.normal(0.0, 0.01, self.downsample.weight.shape))

    def forward(self, x):
        out = self.net(x)
        res = x if self.downsample is None else self.downsample(x)  # match the residual's channels to the block output
        return self.relu(out + res)
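
# (A quick shape sanity check for TemporalBlock -- hypothetical numbers, not from the original post.
#  With padding = (kernel_size - 1) * dilation, the output keeps the input's sequence length.)
# block = TemporalBlock(n_inputs=16, n_outputs=32, kernel_size=3, stride=1,
#                       dilation=2, padding=(3 - 1) * 2)
# y = block(paddle.randn([8, 16, 100]))   # -> shape [8, 32, 100]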


class TCNEncoder(nn.Layer):
    def __init__(self, input_size, num_channels, kernel_size=2, dropout=0.2):
        # input_size: expected number of input features
        # num_channels: number of channels at each level of the TCN
        # kernel_size: convolution kernel size
        super(TCNEncoder, self).__init__()
        self._input_size = input_size
        self._output_dim = num_channels[-1]

        layers = nn.LayerList()
        num_levels = len(num_channels)
        # print('print num_channels: ', num_channels)
        # print('print num_levels: ',num_levels)
        # exit(0)
        for i in range(num_levels):
            dilation_size = 2 ** i
            in_channels = input_size if i == 0 else num_channels[i - 1]
            out_channels = num_channels[i]
            layers.append(
                TemporalBlock(
                    in_channels,
                    out_channels,
                    kernel_size,
                    stride=1,
                    dilation=dilation_size,
                    padding=(kernel_size - 1) * dilation_size,
                    dropout=dropout))

        self.network = nn.Sequential(*layers)

    def get_input_dim(self):
        return self._input_size



    def get_output_dim(self):
        return self._output_dim

    def forward(self, inputs):
        # inputs: [batch, seq_len, input_size] -> transpose to [batch, input_size, seq_len] for Conv1D
        inputs_t = inputs.transpose([0, 2, 1])
        # keep only the last time step of the final layer: [batch, num_channels[-1]]
        output = self.network(inputs_t).transpose([2, 0, 1])[-1]
        return output


class TimeSeriesNetwork(nn.Layer):

    def __init__(self, input_size, next_k=1, num_channels=[256]):
        super(TimeSeriesNetwork, self).__init__()

        self.last_num_channel = num_channels[-1]

        self.tcn = TCNEncoder(
            input_size=input_size,
            num_channels=num_channels,
            kernel_size=3,
            dropout=0.2
        )

        self.linear = nn.Linear(in_features=self.last_num_channel, out_features=next_k)

    def forward(self, x):
        tcn_out = self.tcn(x)
        y_pred = self.linear(tcn_out)
        return y_pred
'''
I tried hard to cast myself as the male lead of a tragedy,
pushing every fault onto you,
making you the wicked witch,
utterly deranged.
But I am just an ordinary person,
with sorrows and joys,
with wrongs and rights.
We both share the blame for ending up where we are today,
and even now I don't feel that I have lost you.
Tell me, have I lost you?
'''
def config_mtp():
    sns.set(style='whitegrid', palette='muted', font_scale=1.2)
    HAPPY_COLORS_PALETTE = ["#01BEFE", "#FFDD00", "#FF7D00", "#FF006D", "#93D30C", "#8F00FF"]
    sns.set_palette(sns.color_palette(HAPPY_COLORS_PALETTE))
    rcParams['figure.figsize'] = 14, 10
    register_matplotlib_converters()

def read_data():
    df_all = pd.read_csv('./data/time_series_covid19_confirmed_global.csv')
    # print(df_all.head())

    # We will forecast worldwide case counts, so we don't care about each country's latitude/longitude etc.; we only need the global case count on each date.

    df = df_all.iloc[:, 4:]
    daily_cases = df.sum(axis=0)
    daily_cases.index = pd.to_datetime(daily_cases.index)
    # print(daily_cases.head())

    plt.figure(figsize=(5, 5))
    plt.plot(daily_cases)
    plt.title("Cumulative daily cases")
    # plt.show()

    # To make the sample time series more stationary, take the first-order difference.
    daily_cases = daily_cases.diff().fillna(daily_cases[0]).astype(np.int64)
    # print(daily_cases.head())

    plt.figure(figsize=(5, 5))
    plt.plot(daily_cases)
    plt.title("Daily cases")
    plt.xticks(rotation=60)
    plt.show()
    return daily_cases

def create_sequences(data, seq_length):
    xs = []
    ys = []
    for i in range(len(data) - seq_length + 1):
        x = data[i:i + seq_length - 1]
        y = data[i + seq_length - 1]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)
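
# (Toy example -- hypothetical numbers, just to show the windowing:
#  create_sequences([1, 2, 3, 4, 5], seq_length=3)
#  -> xs = [[1, 2], [2, 3], [3, 4]], ys = [3, 4, 5]
#  i.e. each window of seq_length - 1 past values is paired with the next value as the label.)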

def preprocess_data(daily_cases):
    TEST_DATA_SIZE,SEQ_LEN = 30,10
    TEST_DATA_SIZE = int(TEST_DATA_SIZE/100*len(daily_cases))
    # hold out the last TEST_DATA_SIZE points (30% of the series) as the test set for prediction
    train_data = daily_cases[:-TEST_DATA_SIZE]
    test_data = daily_cases[-TEST_DATA_SIZE:]
    print("The number of the samples in train set is : %i" % train_data.shape[0])
    print(train_data.shape, test_data.shape)

    # To speed up convergence and improve performance, normalize the data with scikit-learn's MinMaxScaler.
    scaler = MinMaxScaler()
    train_data = scaler.fit_transform(np.expand_dims(train_data, axis=1)).astype('float32')
    test_data = scaler.transform(np.expand_dims(test_data, axis=1)).astype('float32')

    # Build the sliding-window time series samples.
    # Use the previous days' case counts to predict the current day's count. So that every point in the test set can be predicted, we prepend a small amount of training data to the test set; these extra points are used only as model inputs.
    x_train, y_train = create_sequences(train_data, SEQ_LEN)
    test_data = np.concatenate((train_data[-SEQ_LEN + 1:], test_data), axis=0)
    x_test, y_test = create_sequences(test_data, SEQ_LEN)

    # Optionally print the shapes
    '''
    print("The shape of x_train is: %s"%str(x_train.shape))
    print("The shape of y_train is: %s"%str(y_train.shape))
    print("The shape of x_test is: %s"%str(x_test.shape))
    print("The shape of y_test is: %s"%str(y_test.shape))
    '''
    return x_train,y_train,x_test,y_test,scaler

# Data preprocessing is done; wrap the data in CovidDataset so the model can use it for training and prediction.
class CovidDataset(paddle.io.Dataset):
    def __init__(self, feature, label):
        self.feature = feature
        self.label = label
        super(CovidDataset, self).__init__()

    def __len__(self):
        return len(self.label)

    def __getitem__(self, index):
        return [self.feature[index], self.label[index]]




config_mtp()
data = read_data()
x_train,y_train,x_test,y_test,scaler = preprocess_data(data)
train_dataset = CovidDataset(x_train, y_train)
test_dataset = CovidDataset(x_test, y_test)
network = TimeSeriesNetwork(input_size=1)

# Parameter configuration
LR = 1e-2

model = paddle.Model(network)

optimizer = paddle.optimizer.Adam(learning_rate=LR, parameters=model.parameters()) # optimizer

loss = paddle.nn.MSELoss(reduction='sum')
model.prepare(optimizer, loss) # configure the model (optimizer + loss) before running

# Training
USE_GPU = False
TRAIN_EPOCH = 100
LOG_FREQ = 20
SAVE_DIR = os.path.join(os.getcwd(),"save_dir")
SAVE_FREQ = 20

if USE_GPU:
    paddle.set_device("gpu")
else:
    paddle.set_device("cpu")

model.fit(train_dataset,
    batch_size=32,
    drop_last=True,
    epochs=TRAIN_EPOCH,
    log_freq=LOG_FREQ,
    save_dir=SAVE_DIR,
    save_freq=SAVE_FREQ,
    verbose=1 # The verbosity mode, should be 0, 1, or 2.   0 = silent, 1 = progress bar, 2 = one line per epoch. Default: 2.
    )




# Prediction
preds = model.predict(
        test_data=test_dataset
        )

# Post-processing: invert the normalization back to the original scale, then plot the curves of the true values and the predicted values.
true_cases = scaler.inverse_transform(
    np.expand_dims(y_test.flatten(), axis=0)
).flatten()

predicted_cases = scaler.inverse_transform(
  np.expand_dims(np.array(preds).flatten(), axis=0)
).flatten()
print(true_cases.shape, predicted_cases.shape)
# print (type(data))
# print(data[1:3])
# print (len(data), len(data))
# print(data.index[:len(data)])
mse_loss = paddle.nn.MSELoss(reduction='mean')
print(paddle.sqrt(mse_loss(paddle.to_tensor(true_cases), paddle.to_tensor(predicted_cases))))

print(true_cases, predicted_cases)

If you need the data, leave a comment below, or send me a private message.

Don't forget to like, comment, and bookmark; it really means a lot to me~


Reprinted from: blog.csdn.net/un_lock/article/details/122344662