A step-by-step, code-based guide to the principles of recurrent neural networks (RNNs) in deep learning

When it comes to machine-learning examples based on RNNs (Recurrent Neural Networks), a common task is text generation. An RNN is a neural network that can process sequential data and has a form of memory. Below is an RNN-based text-generation example, with comments explaining each step:

    import torch
    import torch.nn as nn
    import torch.optim as optim

    # Define the text dataset
    text = "Hello, how are you?"

    # Build the character-index mapping tables
    chars = list(set(text))
    char2idx = {c: i for i, c in enumerate(chars)}
    idx2char = {i: c for i, c in enumerate(chars)}

    # Convert the text into a sequence of integer indices
    data = [char2idx[c] for c in text]

In this example, we first define a text dataset, text, which contains the text to be generated.

Next, we create the character-index mapping tables. We obtain the unique characters in the text using set(text) and assign an index to each character using enumerate. char2idx is a mapping table from characters to indices, and idx2char is a mapping table from indices back to characters.

Then we convert the text into a sequence of numbers. By looping through each character in the text and mapping it to its corresponding index with char2idx, we obtain a sequence of integers that serves as the input to our model.
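
As a quick sanity check of this step (a minimal sketch; the concrete index values depend on the iteration order of set(text)), we can decode the integer sequence back into the original string:

    # Minimal sketch: encode the text and decode it back to verify the mapping
    encoded = [char2idx[c] for c in text]             # list of integer indices
    decoded = "".join(idx2char[i] for i in encoded)   # map each index back to its character
    assert decoded == text                            # the mapping is invertible
    print(encoded[:5])                                # e.g. [3, 7, 0, ...] (order depends on set())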

    # Define the RNN model
    class RNN(nn.Module):
        def __init__(self, input_size, hidden_size, output_size):
            super(RNN, self).__init__()
            self.hidden_size = hidden_size
            self.embedding = nn.Embedding(input_size, hidden_size)
            # batch_first=True so inputs/outputs have shape (batch, seq_len, features)
            self.rnn = nn.RNN(hidden_size, hidden_size, batch_first=True)
            self.fc = nn.Linear(hidden_size, output_size)

        def forward(self, x, hidden):
            x = self.embedding(x)              # (batch, seq_len) -> (batch, seq_len, hidden_size)
            x, hidden = self.rnn(x, hidden)    # x: (batch, seq_len, hidden_size), hidden: (1, batch, hidden_size)
            x = self.fc(x)                     # (batch, seq_len, output_size)
            return x, hidden

Next, we define the RNN model. The model inherits from nn.Module, and __init__ defines its layers and parameters: an embedding layer (embedding), an RNN layer (rnn, created with batch_first=True so inputs have shape (batch, seq_len)), and a linear layer (fc). During the forward pass, the embedding layer converts the input character indices into vector representations, the RNN layer processes the sequence and returns the updated hidden state, and the linear layer maps the hidden representations to the output space (one score per character).
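
To make the data flow concrete, here is a minimal sketch that traces tensor shapes through the forward pass; the sizes used (a vocabulary of 13 characters, hidden size 32, sequence length 5) are only assumed values for illustration:

    # Minimal sketch with assumed sizes, tracing shapes through the forward pass
    vocab_size, hid, seq_len = 13, 32, 5              # assumed values for illustration
    demo_model = RNN(vocab_size, hid, vocab_size)

    x = torch.randint(0, vocab_size, (1, seq_len))    # (batch=1, seq_len)
    emb = demo_model.embedding(x)                     # (1, seq_len, hid)
    out, h = demo_model.rnn(emb, None)                # out: (1, seq_len, hid), h: (1, 1, hid)
    logits = demo_model.fc(out)                       # (1, seq_len, vocab_size)
    print(emb.shape, out.shape, h.shape, logits.shape)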

    # Define model parameters
    input_size = len(chars)     # number of distinct characters
    hidden_size = 32
    output_size = len(chars)

    # Instantiate the model, loss function, and optimizer
    rnn = RNN(input_size, hidden_size, output_size)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(rnn.parameters(), lr=0.01)

Then, we define the model's parameters: the input size (the number of distinct characters), the hidden layer size, and the output size (also the number of distinct characters).

Next, we instantiate the RNN model and define the loss function and optimizer. In this example, we use the cross-entropy loss (nn.CrossEntropyLoss()) and the Adam optimizer (optim.Adam()).
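
As a quick aside (a minimal sketch with made-up numbers), nn.CrossEntropyLoss expects raw, unnormalized logits of shape (N, C) together with integer class targets of shape (N); it applies the softmax internally:

    # Minimal sketch: how CrossEntropyLoss consumes logits and integer targets
    example_logits = torch.randn(4, len(chars))             # 4 positions, one score per character
    example_targets = torch.randint(0, len(chars), (4,))    # 4 ground-truth character indices
    print(criterion(example_logits, example_targets).item())  # a single scalar loss value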

    # Train the model
    num_epochs = 100
    for epoch in range(num_epochs):
        # Reset the hidden state each epoch so backpropagation stays within this epoch's graph
        hidden = None
        inputs = torch.tensor(data[:-1]).unsqueeze(0)    # (1, seq_len): all characters except the last
        targets = torch.tensor(data[1:]).unsqueeze(0)    # (1, seq_len): all characters except the first
        optimizer.zero_grad()
        outputs, hidden = rnn(inputs, hidden)            # outputs: (1, seq_len, output_size)
        loss = criterion(outputs.squeeze(0), targets.squeeze(0))
        loss.backward()
        optimizer.step()

        if (epoch + 1) % 10 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

In the training phase, we train on the data for multiple epochs. In each epoch, we reset the hidden state and build the input sequence (all characters except the last) and the target sequence (all characters except the first), so the model learns to predict the next character at every position. We then zero the gradient buffers with optimizer.zero_grad(), run the forward pass, compute the loss, backpropagate, and update the model's parameters through the optimizer. The loss is printed every 10 epochs.
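
If we instead wanted to carry the hidden state across iterations (for example, across chunks of a much longer text), the usual pattern is to detach it from the computation graph before the next step, so gradients stop at the chunk boundary (truncated backpropagation through time). A minimal sketch of that variant, not used in the code above:

    # Minimal sketch (alternative, not used above): carry the hidden state forward
    # between iterations, but cut the gradient history with detach()
    hidden_state = None
    for step in range(3):
        step_inputs = torch.tensor(data[:-1]).unsqueeze(0)
        step_targets = torch.tensor(data[1:]).unsqueeze(0)
        optimizer.zero_grad()
        step_outputs, hidden_state = rnn(step_inputs, hidden_state)
        step_loss = criterion(step_outputs.squeeze(0), step_targets.squeeze(0))
        step_loss.backward()
        optimizer.step()
        hidden_state = hidden_state.detach()   # stop gradients from flowing into earlier steps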

    # Generate text
    with torch.no_grad():
        input_char = text[0]
        result = input_char
        hidden = None
        for _ in range(len(text) - 1):
            # Shape (1, 1): a batch of one sequence containing one character
            input_idx = torch.tensor([[char2idx[input_char]]])
            output, hidden = rnn(input_idx, hidden)      # output: (1, 1, output_size)
            top_idx = torch.argmax(output.squeeze())     # index of the highest-scoring character
            predicted_char = idx2char[top_idx.item()]
            result += predicted_char
            input_char = predicted_char

    print("Generated Text:", result)

In the text-generation stage, we use the trained model to generate text. Starting from the initial character, we iteratively feed the current character's index into the model, take the model's output, and select the character with the highest score as the prediction. We append the predicted character to the result, use it as the input for the next time step, and keep the hidden state from step to step, so the RNN retains memory of what has been generated so far. This continues until we have produced a sequence the same length as the original text.
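
Greedy argmax decoding always picks the single most likely character, which can make the output repetitive. A common alternative is to sample from the softmax distribution, optionally scaled by a temperature; here is a minimal sketch of that variant (the temperature value is just an assumption):

    # Minimal sketch (alternative to argmax): temperature-scaled sampling of one character
    temperature = 0.8                                    # assumed value; lower = more conservative
    with torch.no_grad():
        input_idx = torch.tensor([[char2idx[text[0]]]])
        output, _ = rnn(input_idx, None)                 # output: (1, 1, output_size)
        probs = torch.softmax(output.squeeze() / temperature, dim=0)
        sampled_idx = torch.multinomial(probs, num_samples=1).item()
        print("Sampled character:", idx2char[sampled_idx])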

Finally, we print the generated text.

This RNN-based text-generation example shows the basic workflow for generating text with deep learning. By building a simple RNN model and training it, we can generate new character sequences that resemble the original text.

Origin blog.csdn.net/qq_15719613/article/details/135068999