The input of an RNN/LSTM in PyTorch (focus on understanding seq_len/time_steps)

Original link: "How to understand the input of an RNN/LSTM in PyTorch (focus on understanding seq_len/time_steps)", an article on Zhihu.

When building a time-series model in Keras, we set sequence_length in the shape we pass to Input (I'll write it as seq_len throughout), and we can then generate the sequences however we like in a custom data_generator. This value is also called time_steps; it is the number of unrolled cells inside the RNN. If that sounds fuzzy, review the basics of RNNs first.


So setting this value correctly matters. Together with batch_size and the feature dimension (embedding_size in the case of word vectors), it makes up the three dimensions of our input, and that is true whether you use Keras/TensorFlow or PyTorch.
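To make these three dimensions concrete, here is a minimal sketch (not from the original article; the sizes 2, 3, 6 and the hidden size 8 are made up for illustration) of the tensor an nn.LSTM with batch_first=True expects:

import torch
import torch.nn as nn

batch_size, seq_len, feature_dims = 2, 3, 6          # illustrative sizes only
x = torch.randn(batch_size, seq_len, feature_dims)   # (batch_size, seq_len, feature_dims)

lstm = nn.LSTM(input_size=feature_dims, hidden_size=8, batch_first=True)
out, (h, c) = lstm(x)
print(out.shape)   # torch.Size([2, 3, 8]) -> (batch_size, seq_len, hidden_size)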

That is exactly where my problem came from. I had heard that PyTorch gives you more freedom, so I recently started trying it for my experiments. Only a while after the code was written and running did I notice that the seq_len parameter did not seem to be used anywhere; call it a side effect of using Keras too much (yes, the blogger was being careless). After digging in, I found that the DataLoader was by default producing data of shape (batch_size, 1, feature_dims). (I am ignoring the order of batch_size and seq_len here; when building the model, nn.LSTM for example has a batch_first parameter that decides which comes first, but that is not the focus of this discussion.)

In other words, our seq_len/time_steps had silently defaulted to 1. This is an easy trap when using PyTorch. Keras's Input interface makes us set seq_len without thinking, so it never becomes a problem there; PyTorch does not tell us where to set this parameter, so it is easy to overlook if we are not careful.
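To see the trap in isolation, here is a minimal sketch (my own illustration, not the author's code): a Dataset whose __getitem__ returns one sample at a time, so the DataLoader hands back batches with no real time dimension, and the best you can do afterwards is unsqueeze a seq_len of 1:

import torch
import torch.utils.data as Data

samples = torch.arange(1, 11, dtype=torch.float32).view(10, 1)   # 10 samples, feature_dims = 1

class NaiveDataset(Data.Dataset):
    def __init__(self, data):
        self.data = data
    def __len__(self):
        return len(self.data)
    def __getitem__(self, idx):
        return self.data[idx]          # one sample at a time, no window

loader = Data.DataLoader(NaiveDataset(samples), batch_size=2, shuffle=False)
x = next(iter(loader))
print(x.shape)                # torch.Size([2, 1])     -> (batch_size, feature_dims)
print(x.unsqueeze(1).shape)   # torch.Size([2, 1, 1])  -> seq_len is stuck at 1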

Okay, let's see where the problem arises and how to fix it. Sure enough, it is in the Dataset behind the DataLoader, where __getitem__(self, index) decides how data is fetched. Here I found I was still fetching samples one at a time:

    def __getitem__(self, idx):
        return self.input[idx], self.target[idx]

I hadn't realized that this is exactly where seq_len has to be handled in PyTorch. So what do we do? First, let's spell out the way we would like the data to be fetched.

If we have id = 1,2,3,4,5,6,7,8,9,10, a total of 10 samples.

Suppose we set seq_len to 3.

The data should then be arranged as 1-2-3, 2-3-4, 3-4-5, 4-5-6, 5-6-7, 6-7-8, 7-8-9, 8-9-10, 9-10-0, 10-0-0 (the last two windows are incomplete and are zero-padded). This is data that genuinely carries a seq_len dimension, data with the notion of a sequence to loop over, which is what sequence models such as RNNs expect. Recall that I said seq_len had defaulted to 1: that amounts to feeding the 10 samples into the model simply as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, so the data fetched by the DataLoader naturally has size (batch_size, 1, feature_dims); with the windows above, each fetch instead has size (batch_size, 3, feature_dims). A small sketch of this windowing follows.
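Here is a small plain-Python sketch (illustration only, not part of the original article) that reproduces the windowed list above:

ids = list(range(1, 11))   # 1 .. 10
seq_len = 3

windows = []
for i in range(len(ids)):
    w = ids[i:i + seq_len]
    w = w + [0] * (seq_len - len(w))   # zero-pad the incomplete windows at the end
    windows.append(w)

print(windows)
# [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7],
#  [6, 7, 8], [7, 8, 9], [8, 9, 10], [9, 10, 0], [10, 0, 0]]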

Suppose we set batch_size to 2.

Then the first batch we take out is 1-2-3, 2-3-4. The size of this batch is (2, 3, feature_dims), and this is what we feed into the model.

The next batch is 3-4-5, 4-5-6.

The third batch is 5-6-7 and 6-7-8.

The fourth batch is 7-8-9 and 8-9-10.

The fifth batch is 9-10-0, 10-0-0. So our data produces 5 batches in total.

As you can see, num_batch = num_samples / batch_size (I am not writing rounding up or down here because some settings let you choose whether to keep the last incomplete batch, e.g. drop_last in PyTorch's DataLoader). seq_len still does not affect the number of batches that get generated; only batch_size and num_samples do.
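As a quick arithmetic check (my own sketch, assuming the last incomplete batch, if any, is kept):

num_samples, batch_size = 10, 2
num_batches = num_samples // batch_size + (num_samples % batch_size != 0)
print(num_batches)   # 5, and changing seq_len would not alter this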

If it is hard to get a feel for feature_dims when each sample is just an id, look at it another way: suppose feature_dims is 6:

data_ = [[1, 10, 11, 15, 9, 100],
         [2, 11, 12, 16, 9, 100],
         [3, 12, 13, 17, 9, 100],
         [4, 13, 14, 18, 9, 100],
         [5, 14, 15, 19, 9, 100],
         [6, 15, 16, 10, 9, 100],
         [7, 15, 16, 10, 9, 100],
         [8, 15, 16, 10, 9, 100],
         [9, 15, 16, 10, 9, 100],
         [10, 15, 16, 10, 9, 100]]

Still set seq_len to 3 and batch_size to 2.

At this time, our first batch is

tensor([[[  1.,  10.,  11.,  15.,   9., 100.],
         [  2.,  11.,  12.,  16.,   9., 100.],
         [  3.,  12.,  13.,  17.,   9., 100.]],

        [[  2.,  11.,  12.,  16.,   9., 100.],
         [  3.,  12.,  13.,  17.,   9., 100.],
         [  4.,  13.,  14.,  18.,   9., 100.]]])

This is just 1-2-3, 2-3-4.

And the last batch is

tensor([[[  9.,  15.,  16.,  10.,   9., 100.],
         [ 10.,  15.,  16.,  10.,   9., 100.],
         [  0.,   0.,   0.,   0.,   0.,   0.]],

        [[ 10.,  15.,  16.,  10.,   9., 100.],
         [  0.,   0.,   0.,   0.,   0.,   0.],
         [  0.,   0.,   0.,   0.,   0.,   0.]]])

Finally, here is the demo. Everyone's data, loss function and so on will be different, but the demo should give you some ideas about how to adapt your own project.

# -*- coding: utf-8 -*-

import torch
import torch.utils.data as Data
import torch.nn as nn
import torchvision.transforms as transforms
###   Demo dataset

data_ = [[1, 10, 11, 15, 9, 100],
         [2, 11, 12, 16, 9, 100],
         [3, 12, 13, 17, 9, 100],
         [4, 13, 14, 18, 9, 100],
         [5, 14, 15, 19, 9, 100],
         [6, 15, 16, 10, 9, 100],
         [7, 15, 16, 10, 9, 100],
         [8, 15, 16, 10, 9, 100],
         [9, 15, 16, 10, 9, 100],
         [10, 15, 16, 10, 9, 100]]


###   Demo Dataset class

class DemoDatasetLSTM(Data.Dataset):

    """
        Support class for the loading and batching of sequences of samples

        Args:
            dataset (Tensor): Tensor containing all the samples
            sequence_length (int): length of the sequence analyzed by the LSTM
            transforms (object torchvision.transform): Pytorch's transforms used to process the data
    """

    ##  Constructor
    def __init__(self, dataset, sequence_length=1, transforms=None):
        self.dataset = dataset
        self.seq_len = sequence_length
        self.transforms = transforms

    ##  Override total dataset's length getter
    def __len__(self):
        # One window starts at every sample, so there are as many windows as samples
        return self.dataset.__len__()

    ##  Override single items' getter
    def __getitem__(self, idx):
        if idx + self.seq_len > self.__len__():
            # The window starting at idx runs past the end of the dataset: zero-pad it to seq_len
            if self.transforms is not None:
                item = torch.zeros(self.seq_len, self.dataset[0].__len__())
                item[:self.__len__()-idx] = self.transforms(self.dataset[idx:])
                return item, item
            else:
                # Pad with all-zero samples so every window still has length seq_len
                item = list(self.dataset[idx:])
                item += [[0] * self.dataset[0].__len__()] * (idx + self.seq_len - self.__len__())
                return item, item
        else:
            # Complete window: the seq_len consecutive samples starting at idx
            if self.transforms is not None:
                return self.transforms(self.dataset[idx:idx+self.seq_len]), self.transforms(self.dataset[idx:idx+self.seq_len])
            else:
                return self.dataset[idx:idx+self.seq_len], self.dataset[idx:idx+self.seq_len]


###   Helper for transforming the data from a list to Tensor

def listToTensor(data_list):
    # Stack a list of samples (each a list of feature values) into a (len, feature_dims) float tensor
    tensor = torch.empty(len(data_list), len(data_list[0]))
    for i in range(len(data_list)):
        tensor[i, :] = torch.FloatTensor(data_list[i])
    return tensor

###   Dataloader instantiation

# Parameters
seq_len = 3
batch_size = 2
data_transform = transforms.Lambda(lambda x: listToTensor(x))

dataset = DemoDatasetLSTM(data_, seq_len, transforms=data_transform)
data_loader = Data.DataLoader(dataset, batch_size, shuffle=False)

for data in data_loader:
    x, _ = data
    print(x)
    print('\n')
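To close the loop, a batch produced by this data_loader can be fed straight into an LSTM built with batch_first=True. This is an illustrative addition, not part of the original demo; the hidden size of 8 is chosen arbitrarily:

###   Feeding a batch into an LSTM (illustrative addition)

lstm = nn.LSTM(input_size=6, hidden_size=8, batch_first=True)
for x, _ in data_loader:
    out, (h, c) = lstm(x)
    print(x.shape, '->', out.shape)   # torch.Size([2, 3, 6]) -> torch.Size([2, 3, 8])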

Origin: blog.csdn.net/ch206265/article/details/106979744