PaddlePaddle Tutorial Ⅸ: LSTM for Classification & Regression

LSTM for a Classification Task

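Getting the Dataset

This example uses the MNIST dataset. For convenience, the data is loaded with the MNIST class imported below.

Import the required packages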

import paddle
import paddle.nn as nn
import numpy as np
import os
from paddle.vision.datasets import MNIST
from paddle.io import Dataset
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

paddle.__version__
'2.0.0-rc1'

Data Processing

train_dataset = MNIST(mode='train')
val_dataset = MNIST(mode='test')
def plot_num_images(num):
    # Show `num` randomly chosen training images with their labels.
    # The 2 x 8 subplot grid below holds at most 16 images.
    if num < 1:
        print('INFO:The number of input pictures must be greater than zero!')
    else:
        # Pick random indices into the training set.
        choose_list = [np.random.randint(len(train_dataset)) for _ in range(num)]
        fig = plt.gcf()
        fig.set_size_inches(18, 5)
        for i in range(num):
            ax_img = plt.subplot(2, 8, i + 1)
            plt_img, label = train_dataset[choose_list[i]]
            ax_img.imshow(plt_img, cmap='binary')
            ax_img.set_title(str(label.item()), fontsize=10)
        plt.show()
plot_num_images(16)


Data Loading

class MnistDataset(Dataset):
    # Wraps the MNIST splits loaded above as (float32 image, int64 label) pairs.
    def __init__(self, mode='train'):
        super(MnistDataset, self).__init__()
        # Any mode other than 'train' selects the validation (test) split.
        source = train_dataset if mode == 'train' else val_dataset
        self.data = [[np.array(source[i][0]).astype('float32'), source[i][1].astype('int64')] for i in range(len(source))]

    def __getitem__(self, index):
        data = self.data[index][0]
        label = self.data[index][1]

        return data, label

    def __len__(self):
        return len(self.data)
train_loader = paddle.io.DataLoader(MnistDataset(mode='train'), batch_size=10000, shuffle=True)
val_loader = paddle.io.DataLoader(MnistDataset(mode='val'), batch_size=10000, shuffle=True)
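
Each sample produced by the dataset above is a 28×28 float array, and the LSTM defined in the next section is configured with input_size=28. The following is a minimal NumPy sketch of how such a batch can be read as a sequence, assuming the batch-first [batch, time_steps, input_size] layout suggested by the summary input (1, 28, 28). The arrays are zero-filled placeholders, not real MNIST data or real LSTM outputs, and the sizes are chosen only for illustration.

```python
import numpy as np

batch_size, time_steps, input_size, hidden_size = 4, 28, 28, 64

# Placeholder batch of images: each image row is one time step of 28 features.
images = np.zeros((batch_size, time_steps, input_size), dtype='float32')
first_step = images[:, 0, :]      # row 0 of every image, shape (4, 28)

# Placeholder for per-step LSTM outputs, shape (4, 28, 64).
r_out = np.zeros((batch_size, time_steps, hidden_size), dtype='float32')
last_step = r_out[:, -1, :]       # output at the final time step, shape (4, 64)

print(first_step.shape, last_step.shape)
```

The slice `r_out[:, -1, :]` is the same indexing the model's forward method applies before the linear layer.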

Model Building and Training

class RNN(nn.Layer):
    def __init__(self):
        super(RNN, self).__init__()

        self.rnn = nn.LSTM(
            input_size=28,
            hidden_size=64,
            num_layers=1,
        )
        self.out = nn.Linear(64, 10)

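    def forward(self, x):
        # x: [batch, 28, 28]; each image row is one time step of 28 features.
        r_out, (h_n, h_c) = self.rnn(x, None)
        # Classify from the LSTM output at the last time step.
        out = self.out(r_out[:, -1, :])

        return out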
epochs = 20
batch_size = 32
paddle.summary(RNN(), (1, 28, 28))
----------------------------------------------------------------------------------------------
 Layer (type)       Input Shape                   Output Shape                   Param #    
==============================================================================================
    LSTM-6      [[1, 28, 28], None]  [[1, 28, 64], [[1, 1, 64], [1, 1, 64]]]     24,064     
   Linear-6          [[1, 64]]                       [1, 10]                       650      
==============================================================================================
Total params: 24,714
Trainable params: 24,714
Non-trainable params: 0
----------------------------------------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.04
Params size (MB): 0.09
Estimated Total Size (MB): 0.14
----------------------------------------------------------------------------------------------

{'total_params': 24714, 'trainable_params': 24714}
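
The parameter counts in the summary above can be reproduced by hand. The sketch below assumes a single-layer, unidirectional LSTM with four gates, each holding an input weight matrix, a recurrent weight matrix, and two bias vectors; the helper names are made up for this illustration and are not part of Paddle.

```python
def lstm_param_count(input_size, hidden_size):
    # Four gates; each has input weights, recurrent weights, and two bias vectors.
    per_gate = input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size
    return 4 * per_gate


def linear_param_count(in_features, out_features):
    # Weight matrix plus one bias per output feature.
    return in_features * out_features + out_features


lstm = lstm_param_count(28, 64)
head = linear_param_count(64, 10)
print(lstm, head, lstm + head)  # 24064 650 24714
```

These values match the 24,064 and 650 reported for the LSTM and Linear layers, and their sum matches the total of 24,714.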
model = paddle.Model(RNN())

model.prepare(optimizer=paddle.optimizer.Adam(learning_rate=0.005, parameters=model.parameters()),
              loss=paddle.nn.CrossEntropyLoss(),
              metrics=paddle.metric.Accuracy())

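# Because train_loader and val_loader are DataLoader instances, fit() ignores
# batch_size here; the loaders' own batch_size=10000 applies (6 steps per epoch).
model.fit(train_loader,
          val_loader,
          epochs=epochs,
          batch_size=batch_size,
          verbose=1)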
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/20
step 6/6 [==============================] - loss: 1.9152 - acc: 0.3026 - 400ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 1.8194 - acc: 0.4652 - 381ms/step
Eval samples: 10000
Epoch 2/20
step 6/6 [==============================] - loss: 1.3814 - acc: 0.5162 - 260ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 1.2895 - acc: 0.6251 - 217ms/step
Eval samples: 10000
Epoch 3/20
step 6/6 [==============================] - loss: 0.9554 - acc: 0.6603 - 285ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.8986 - acc: 0.7120 - 223ms/step
Eval samples: 10000
Epoch 4/20
step 6/6 [==============================] - loss: 0.7356 - acc: 0.7341 - 255ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.6983 - acc: 0.7716 - 279ms/step
Eval samples: 10000
Epoch 5/20
step 6/6 [==============================] - loss: 0.5868 - acc: 0.7831 - 266ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.5834 - acc: 0.8003 - 224ms/step
Eval samples: 10000
Epoch 6/20
step 6/6 [==============================] - loss: 0.5230 - acc: 0.8190 - 258ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.5126 - acc: 0.8278 - 240ms/step
Eval samples: 10000
Epoch 7/20
step 6/6 [==============================] - loss: 0.4556 - acc: 0.8410 - 273ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.4599 - acc: 0.8451 - 244ms/step
Eval samples: 10000
Epoch 8/20
step 6/6 [==============================] - loss: 0.4119 - acc: 0.8561 - 258ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.4258 - acc: 0.8578 - 307ms/step
Eval samples: 10000
Epoch 9/20
step 6/6 [==============================] - loss: 0.3860 - acc: 0.8687 - 249ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.4050 - acc: 0.8619 - 228ms/step
Eval samples: 10000
Epoch 10/20
step 6/6 [==============================] - loss: 0.3763 - acc: 0.8751 - 257ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3850 - acc: 0.8689 - 243ms/step
Eval samples: 10000
Epoch 11/20
step 6/6 [==============================] - loss: 0.3569 - acc: 0.8821 - 268ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3643 - acc: 0.8756 - 240ms/step
Eval samples: 10000
Epoch 12/20
step 6/6 [==============================] - loss: 0.3407 - acc: 0.8884 - 258ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3515 - acc: 0.8801 - 247ms/step
Eval samples: 10000
Epoch 13/20
step 6/6 [==============================] - loss: 0.3212 - acc: 0.8932 - 266ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3413 - acc: 0.8839 - 224ms/step
Eval samples: 10000
Epoch 14/20
step 6/6 [==============================] - loss: 0.3172 - acc: 0.8977 - 253ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3309 - acc: 0.8860 - 228ms/step
Eval samples: 10000
Epoch 15/20
step 6/6 [==============================] - loss: 0.2890 - acc: 0.9016 - 254ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3250 - acc: 0.8892 - 303ms/step
Eval samples: 10000
Epoch 16/20
step 6/6 [==============================] - loss: 0.2911 - acc: 0.9046 - 242ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3183 - acc: 0.8921 - 225ms/step
Eval samples: 10000
Epoch 17/20
step 6/6 [==============================] - loss: 0.2639 - acc: 0.9083 - 279ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3128 - acc: 0.8922 - 255ms/step
Eval samples: 10000
Epoch 18/20
step 6/6 [==============================] - loss: 0.2732 - acc: 0.9118 - 277ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3074 - acc: 0.8954 - 567ms/step
Eval samples: 10000
Epoch 19/20
step 6/6 [==============================] - loss: 0.2579 - acc: 0.9142 - 362ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.3046 - acc: 0.8968 - 281ms/step
Eval samples: 10000
Epoch 20/20
step 6/6 [==============================] - loss: 0.2587 - acc: 0.9173 - 354ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 [==============================] - loss: 0.2984 - acc: 0.8977 - 350ms/step
Eval samples: 10000
model.evaluate(val_loader)
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 1/1 - loss: 0.2984 - acc: 0.8977 - 307ms/step
Eval samples: 10000

{'loss': [0.29836994], 'acc': 0.8977}
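
The acc value above is classification accuracy. As a rough illustration of what that number means, here is a minimal NumPy sketch of top-1 accuracy computed from raw model outputs; the logits and labels are invented for the example and use 3 classes instead of 10.

```python
import numpy as np

# Made-up logits for 4 samples and 3 classes (not real model output).
logits = np.array([[0.1, 2.0, -1.0],
                   [1.5, 0.2, 0.3],
                   [0.0, 0.1, 3.0],
                   [2.2, 0.4, 0.1]])
labels = np.array([1, 0, 2, 1])

predicted = logits.argmax(axis=1)               # class with the highest score
accuracy = float((predicted == labels).mean())  # fraction of correct predictions
print(predicted, accuracy)
```

Here three of the four predictions match their labels, so the accuracy is 0.75.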

LSTM for a Regression Task

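Getting the Dataset

The dataset can be downloaded from the AI Studio platform.
The Boston housing dataset contains 506 samples. Each sample has 13 feature variables plus the house price for that area (MEDV). The price clearly depends on several feature variables, so this is not a univariate (simple) linear regression problem; fitting a linear equation over several feature variables is a multivariate (multiple) linear regression problem.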

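Variable  Meaning
CRIM      Per-capita crime rate by town
ZN        Proportion of residential land zoned for lots over 25,000 sq. ft.
INDUS     Proportion of non-retail business land per town (i.e. land for industrial, agricultural, or similar use)
CHAS      Charles River dummy variable (1 if the tract bounds the river; 0 otherwise)
NOX       Nitric oxides concentration (parts per 10 million)
RM        Average number of rooms per dwelling
AGE       Proportion of owner-occupied units built before 1940
DIS       Weighted distance to five Boston employment centres (distance from busy downtown areas; distinguishes suburbs from the city)
RAD       Index of accessibility to radial highways
TAX       Full-value property-tax rate per $10,000
PTRATIO   Pupil-teacher ratio by town
B         1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town
LSTAT     Percentage of the population of lower status
MEDV      Median value of owner-occupied homes (in thousands of US dollars)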
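import pandas as pd
paddle.seed(7)

# With header=None and the default comma separator, each whitespace-separated
# line of housing.csv ends up as one string in column 0 (the parsing below relies on this).
data_frame = pd.read_csv('./housing.csv', header=None)
all_data = []

for row in range(len(data_frame)):
    # split() with no argument splits on runs of whitespace and drops empty strings.
    row_data = [float(value) for value in data_frame.iloc[row, 0].split()]
    all_data.append(row_data)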

all_data = np.array(all_data)
x_data = all_data[:, :13]
y_data = all_data[:, 13]
print('x_data:\n', x_data, '\n x_data shape:', x_data.shape,
      '\ny_data:\n', y_data, '\n y_data shape:', y_data.shape)
x_data:
 [[6.3200e-03 1.8000e+01 2.3100e+00 ... 1.5300e+01 3.9690e+02 4.9800e+00]
 [2.7310e-02 0.0000e+00 7.0700e+00 ... 1.7800e+01 3.9690e+02 9.1400e+00]
 [2.7290e-02 0.0000e+00 7.0700e+00 ... 1.7800e+01 3.9283e+02 4.0300e+00]
 ...
 [6.0760e-02 0.0000e+00 1.1930e+01 ... 2.1000e+01 3.9690e+02 5.6400e+00]
 [1.0959e-01 0.0000e+00 1.1930e+01 ... 2.1000e+01 3.9345e+02 6.4800e+00]
 [4.7410e-02 0.0000e+00 1.1930e+01 ... 2.1000e+01 3.9690e+02 7.8800e+00]] 
 x_data shape: (506, 13) 
y_data:
 [24.  21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 15.  18.9 21.7 20.4
 18.2 19.9 23.1 17.5 20.2 18.2 13.6 19.6 15.2 14.5 15.6 13.9 16.6 14.8
 18.4 21.  12.7 14.5 13.2 13.1 13.5 18.9 20.  21.  24.7 30.8 34.9 26.6
 25.3 24.7 21.2 19.3 20.  16.6 14.4 19.4 19.7 20.5 25.  23.4 18.9 35.4
 24.7 31.6 23.3 19.6 18.7 16.  22.2 25.  33.  23.5 19.4 22.  17.4 20.9
 24.2 21.7 22.8 23.4 24.1 21.4 20.  20.8 21.2 20.3 28.  23.9 24.8 22.9
 23.9 26.6 22.5 22.2 23.6 28.7 22.6 22.  22.9 25.  20.6 28.4 21.4 38.7
 43.8 33.2 27.5 26.5 18.6 19.3 20.1 19.5 19.5 20.4 19.8 19.4 21.7 22.8
 18.8 18.7 18.5 18.3 21.2 19.2 20.4 19.3 22.  20.3 20.5 17.3 18.8 21.4
 15.7 16.2 18.  14.3 19.2 19.6 23.  18.4 15.6 18.1 17.4 17.1 13.3 17.8
 14.  14.4 13.4 15.6 11.8 13.8 15.6 14.6 17.8 15.4 21.5 19.6 15.3 19.4
 17.  15.6 13.1 41.3 24.3 23.3 27.  50.  50.  50.  22.7 25.  50.  23.8
 23.8 22.3 17.4 19.1 23.1 23.6 22.6 29.4 23.2 24.6 29.9 37.2 39.8 36.2
 37.9 32.5 26.4 29.6 50.  32.  29.8 34.9 37.  30.5 36.4 31.1 29.1 50.
 33.3 30.3 34.6 34.9 32.9 24.1 42.3 48.5 50.  22.6 24.4 22.5 24.4 20.
 21.7 19.3 22.4 28.1 23.7 25.  23.3 28.7 21.5 23.  26.7 21.7 27.5 30.1
 44.8 50.  37.6 31.6 46.7 31.5 24.3 31.7 41.7 48.3 29.  24.  25.1 31.5
 23.7 23.3 22.  20.1 22.2 23.7 17.6 18.5 24.3 20.5 24.5 26.2 24.4 24.8
 29.6 42.8 21.9 20.9 44.  50.  36.  30.1 33.8 43.1 48.8 31.  36.5 22.8
 30.7 50.  43.5 20.7 21.1 25.2 24.4 35.2 32.4 32.  33.2 33.1 29.1 35.1
 45.4 35.4 46.  50.  32.2 22.  20.1 23.2 22.3 24.8 28.5 37.3 27.9 23.9
 21.7 28.6 27.1 20.3 22.5 29.  24.8 22.  26.4 33.1 36.1 28.4 33.4 28.2
 22.8 20.3 16.1 22.1 19.4 21.6 23.8 16.2 17.8 19.8 23.1 21.  23.8 23.1
 20.4 18.5 25.  24.6 23.  22.2 19.3 22.6 19.8 17.1 19.4 22.2 20.7 21.1
 19.5 18.5 20.6 19.  18.7 32.7 16.5 23.9 31.2 17.5 17.2 23.1 24.5 26.6
 22.9 24.1 18.6 30.1 18.2 20.6 17.8 21.7 22.7 22.6 25.  19.9 20.8 16.8
 21.9 27.5 21.9 23.1 50.  50.  50.  50.  50.  13.8 13.8 15.  13.9 13.3
 13.1 10.2 10.4 10.9 11.3 12.3  8.8  7.2 10.5  7.4 10.2 11.5 15.1 23.2
  9.7 13.8 12.7 13.1 12.5  8.5  5.   6.3  5.6  7.2 12.1  8.3  8.5  5.
 11.9 27.9 17.2 27.5 15.  17.2 17.9 16.3  7.   7.2  7.5 10.4  8.8  8.4
 16.7 14.2 20.8 13.4 11.7  8.3 10.2 10.9 11.   9.5 14.5 14.1 16.1 14.3
 11.7 13.4  9.6  8.7  8.4 12.8 10.5 17.1 18.4 15.4 10.8 11.8 14.9 12.6
 14.1 13.  13.4 15.2 16.1 17.8 14.9 14.1 12.7 13.5 14.9 20.  16.4 17.7
 19.5 20.2 21.4 19.9 19.  19.1 19.1 20.1 19.9 19.6 23.2 29.8 13.8 13.3
 16.7 12.  14.6 21.4 23.  23.7 25.  21.8 20.6 21.2 19.1 20.6 15.2  7.
  8.1 13.6 20.1 21.8 24.5 23.1 19.7 18.3 21.2 17.5 16.8 22.4 20.6 23.9
 22.  11.9] 
 y_data shape: (506,)
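
The parsing step above turns each whitespace-separated text line into a list of floats, with the last value used as the target. A minimal sketch on a single line follows; the line is hypothetical and shortened to five values, whereas the real file has 14 values per line.

```python
# Hypothetical, shortened line in the same whitespace-separated style.
line = " 0.00632  18.00   2.310  4.98  24.00"

# Splitting on a single space leaves empty strings wherever spaces repeat.
raw_tokens = line.split(' ')

# Splitting on any whitespace run avoids the empty strings altogether.
values = [float(token) for token in line.split()]
features, target = values[:-1], values[-1]
print(features, target)
```

This is why a parser that splits on a single space has to filter out empty strings, while `split()` with no argument does not.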

Dataset Loading

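# Despite the "Breast" name, this class wraps the Boston housing arrays built above.
class BreastDataset(Dataset):
    def __init__(self, mode='train'):
        super(BreastDataset, self).__init__()
        # `mode` is accepted but unused: all 506 samples are loaded.
        # Each sample's 13 features become a sequence of 13 steps with 1 feature per step.
        self.data = [[x_data[i].reshape(-1, 1).astype('float32'), y_data[i].astype('float32')] for i in range(x_data.shape[0])]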

    def __getitem__(self, index):
        data = self.data[index][0]
        label = self.data[index][1]

        return data, label

    def __len__(self):
        return len(self.data)
train_loader = paddle.io.DataLoader(BreastDataset(mode='train'), batch_size=100, shuffle=True)

Model Building and Training

class BreastRNN(nn.Layer):
    def __init__(self):
        super(BreastRNN, self).__init__()

        self.rnn = nn.LSTM(
            input_size=1,
            hidden_size=64,
            num_layers=1,
        )
        self.out = nn.Linear(64, 1)

    def forward(self, x):
        r_out, (h_n, h_c) = self.rnn(x, None)
        out = self.out(r_out[:, -1, :])
        
        return out
paddle.summary(BreastRNN(), (1, 13, 1))
----------------------------------------------------------------------------------------------
 Layer (type)       Input Shape                   Output Shape                   Param #    
==============================================================================================
    LSTM-1       [[1, 13, 1], None]  [[1, 13, 64], [[1, 1, 64], [1, 1, 64]]]     17,152     
   Linear-1          [[1, 64]]                       [1, 1]                        65       
==============================================================================================
Total params: 17,217
Trainable params: 17,217
Non-trainable params: 0
----------------------------------------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.04
Params size (MB): 0.07
Estimated Total Size (MB): 0.10
----------------------------------------------------------------------------------------------

{'total_params': 17217, 'trainable_params': 17217}
epochs = 20
batch_size = 32
model = paddle.Model(BreastRNN())

model.prepare(optimizer=paddle.optimizer.Adam(learning_rate=0.005, parameters=model.parameters()),
              loss=paddle.nn.MSELoss())

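# train_loader is a DataLoader, so its own batch_size=100 applies and the
# batch_size argument is ignored: 506 samples give 6 steps per epoch.
model.fit(train_loader,
          epochs=epochs,
          batch_size=batch_size,
          verbose=1)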
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/20
step 6/6 [==============================] - loss: 846.5749 - 12ms/step
Epoch 2/20
step 6/6 [==============================] - loss: 464.9476 - 8ms/step
Epoch 3/20
step 6/6 [==============================] - loss: 152.8017 - 9ms/step
Epoch 4/20
step 6/6 [==============================] - loss: 322.6063 - 8ms/step
Epoch 5/20
step 6/6 [==============================] - loss: 77.3828 - 8ms/step
Epoch 6/20
step 6/6 [==============================] - loss: 51.7485 - 9ms/step
Epoch 7/20
step 6/6 [==============================] - loss: 247.1716 - 9ms/step
Epoch 8/20
step 6/6 [==============================] - loss: 66.1677 - 12ms/step
Epoch 9/20
step 6/6 [==============================] - loss: 19.2254 - 8ms/step
Epoch 10/20
step 6/6 [==============================] - loss: 55.7542 - 10ms/step
Epoch 11/20
step 6/6 [==============================] - loss: 103.3308 - 10ms/step
Epoch 12/20
step 6/6 [==============================] - loss: 166.6900 - 17ms/step
Epoch 13/20
step 6/6 [==============================] - loss: 62.8693 - 13ms/step
Epoch 14/20
step 6/6 [==============================] - loss: 141.1141 - 12ms/step
Epoch 15/20
step 6/6 [==============================] - loss: 58.6999 - 8ms/step
Epoch 16/20
step 6/6 [==============================] - loss: 109.8835 - 7ms/step
Epoch 17/20
step 6/6 [==============================] - loss: 50.1934 - 8ms/step
Epoch 18/20
step 6/6 [==============================] - loss: 47.7691 - 10ms/step
Epoch 19/20
step 6/6 [==============================] - loss: 56.6143 - 8ms/step
Epoch 20/20
step 6/6 [==============================] - loss: 51.1868 - 9ms/step
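
One thing that may be worth checking in this setup (an observation, not something stated in the original post): the dataset returns each label as a scalar, so a batch of labels would have shape [batch], while the model output has shape [batch, 1]. Under NumPy-style broadcasting, subtracting such arrays yields a [batch, batch] matrix rather than element-wise differences. Whether Paddle's MSELoss behaves this way depends on the version, so the NumPy sketch below only illustrates the shape arithmetic, using made-up values.

```python
import numpy as np

pred = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1), like a [batch, 1] model output
label = np.array([1.0, 2.0, 3.0])        # shape (3,), like a batch of scalar labels

diff_broadcast = pred - label                  # broadcasts to shape (3, 3)
diff_matched = pred - label.reshape(-1, 1)     # shapes agree: (3, 1)

mse_broadcast = float(np.mean(diff_broadcast ** 2))
mse_matched = float(np.mean(diff_matched ** 2))
print(diff_broadcast.shape, diff_matched.shape)
print(mse_broadcast, mse_matched)
```

In this toy case the predictions are exactly right, yet the broadcast version reports a nonzero error. Returning each label with shape (1,) from the dataset would be one way to keep the shapes aligned, though the log above was produced with the code as shown.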

Reposted from blog.csdn.net/qq_39567427/article/details/112919737