RNN代码解释pytorch

简述

还是跟之前的CNN一样，都是学于莫烦Python的。

解释

关于数据导入部分的代码含义，其实跟之前的CNN几乎完全一致。
而且还需要部分的源代码–MNIST（在之前的地方有超链接）
这些都可以在下面的CNN的链接中看到
卷积神经网络CNN入门【pytorch学习】

模型含义

这里使用RNN，这是跟之前的CNN唯一的不同的地方，其他的都是完全一致的。

class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.LSTM(
            input_size=28,
            hidden_size=64,
            num_layers=1,
            batch_first=True
        )
        self.out = nn.Linear(64, 10)  # fully connected layer, output 10 classes

    def forward(self, x):
        r_out, (h_n, h_c) = self.rnn(x, None)  # None 表示 hidden state 会用全0的 state
        # r_out = [BATCH_SIZE, input_size, hidden_size]
        # r_out[:, -1, :] = [BATCH_SIZE, hidden_size]  '-1'，表示选取最后一个时间点的 r_out 输出
        out = self.out(r_out[:, -1, :])
        # out = [BATCH_SIZE, 10]
        return out


rnn = RNN()

LSTM参数解释

输入参数，其实是表示有多少序列。这里的最小单位，考虑的其实不是整个图片的完整全部序列。而是每一行为最小单位的。
所以说经过LSTM之后，输出的结果就是r_out = [BATCH_SIZE, input_size, hidden_size]。第一个input_size其实是恰好这个图片大小是（input_size, input_size）的

out中输入的有-1

会发现这里有一个数字-1，其实就是表示要选最后的一列作为最后的结果。其实就是说只看最后的一行。

完整代码

import os

import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision

# Hyper Parameters
EPOCH = 1  # train the training data n times, to save time, we just train 1 epoch
BATCH_SIZE = 50
LR = 0.001  # learning rate
DOWNLOAD_MNIST = False

# Mnist digits dataset
if not (os.path.exists('./mnist/')) or not os.listdir('./mnist/'):
    # not mnist dir or mnist is empyt dir
    DOWNLOAD_MNIST = True

train_data = torchvision.datasets.MNIST(
    root='./mnist/',
    train=True,  # this is training data
    transform=torchvision.transforms.ToTensor(),  # Converts a PIL.Image or numpy.ndarray to
    # torch.FloatTensor of shape (C x H x W) and normalize in the range [0.0, 1.0]
    download=DOWNLOAD_MNIST,
)

# Data Loader for easy mini-batch return in training, the image batch shape will be (50, 1, 28, 28)
train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)

# pick 2000 samples to speed up testing
test_data = torchvision.datasets.MNIST(root='./mnist/', train=False)
test_x = torch.unsqueeze(test_data.test_data, dim=1).type(torch.FloatTensor)[
         :2000] / 255.  # shape from (2000, 28, 28) to (2000, 1, 28, 28), value in range(0,1)
test_y = test_data.test_labels[:2000]


class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.LSTM(
            input_size=28,
            hidden_size=64,
            num_layers=1,
            batch_first=True
        )
        self.out = nn.Linear(64, 10)  # fully connected layer, output 10 classes

    def forward(self, x):
        r_out, (h_n, h_c) = self.rnn(x, None)  # None 表示 hidden state 会用全0的 state
        # r_out = [BATCH_SIZE, input_size, hidden_size]
        # r_out[:, -1, :] = [BATCH_SIZE, hidden_size]  '-1'，表示选取最后一个时间点的 r_out 输出
        out = self.out(r_out[:, -1, :])
        # out = [BATCH_SIZE, 10]
        return out


rnn = RNN()

optimizer = torch.optim.Adam(rnn.parameters(), lr=LR)  # optimize all cnn parameters
loss_func = nn.CrossEntropyLoss()  # the target label is not one-hotted

for epoch in range(EPOCH):
    for step, (x, b_y) in enumerate(train_loader):  # gives batch data
        b_x = x.view(-1, 28, 28)  # reshape x to (batch, time_step, input_size)
        output = rnn(b_x)  # rnn output
        loss = loss_func(output, b_y)  # cross entropy loss
        optimizer.zero_grad()  # clear gradients for this training step
        loss.backward()  # backpropagation, compute gradients
        optimizer.step()

test_output = rnn(test_x[:10].view(-1, 28, 28))
pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze()

print(pred_y, 'prediction number')
print(test_y[:10], 'real number')

结果：

[7 2 1 0 4 1 4 9 6 9] prediction number
tensor([7, 2, 1, 0, 4, 1, 4, 9, 5, 9]) real number

简述

解释

模型含义

LSTM参数解释

out中输入的有-1

完整代码

猜你喜欢