1. Experiment introduction
- Implement a linear model (the Linear class)
- Implement forward propagation (forward)
- Implement backpropagation (backward)
2. Experimental environment
This series of experiments uses the PyTorch deep learning framework. The setup steps are as follows:
1. Configure the virtual environment
conda create -n DL python=3.7
conda activate DL
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
conda install matplotlib
conda install scikit-learn
2. Library version introduction
Package | Version used in this experiment | Latest version at time of writing |
matplotlib | 3.5.3 | 3.8.0 |
numpy | 1.21.6 | 1.26.0 |
python | 3.7.16 | |
scikit-learn | 0.22.1 | 1.3.0 |
torch | 1.8.1+cu102 | 2.0.1 |
torchaudio | 0.8.1 | 2.0.2 |
torchvision | 0.9.1+cu102 | 0.15.2 |
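To compare a local environment against the table above, the installed versions can be printed from Python. This is a minimal sketch: the helper name installed_version is hypothetical, and it assumes each package exposes a __version__ attribute, which is the case for the packages listed here.

```python
import importlib


def installed_version(name):
    """Return the package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


# scikit-learn is imported under the module name "sklearn"
for pkg in ["torch", "torchvision", "torchaudio", "matplotlib", "numpy", "sklearn"]:
    print(pkg, installed_version(pkg) or "not installed")
```

Versions will differ from the table if the environment was created with different pins than the ones in the commands above.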
3. Experimental content
ChatGPT:
A feedforward neural network (FNN) is a common artificial neural network model; its best-known form is the multilayer perceptron (MLP). It computes its output by forward propagation and is mainly used for classification and regression problems.
A feedforward neural network consists of multiple layers: an input layer, one or more hidden layers, and an output layer. The name "feedforward" comes from the fact that signals flow only forward through the network, from the input layer through the hidden layers to the output layer, with no feedback connections.
Here's how feedforward neural networks generally work:
Input layer: receives raw data or feature vectors as the input to the network, with each input feature represented by one neuron. The input layer passes the values on unchanged; the weighting and activation happen in the layers that follow.
Hidden layer: A feedforward neural network can contain one or more hidden layers, each consisting of multiple neurons. Each hidden neuron computes a weighted sum of the previous layer's outputs, adds a bias, applies an activation function, and passes the result to the next layer.
Output layer: The output of the last hidden layer is passed to the output layer, which usually consists of one or more neurons. The output neurons use an activation function suited to the problem: for example, Sigmoid or Softmax for classification, and typically no activation (a linear output) for regression.
Forward propagation: The process of transmitting signals from the input layer through the hidden layers to the output layer is called forward propagation. During forward propagation, each neuron multiplies the outputs of the previous layer by the corresponding weights, adds a bias, applies its activation function, and passes the result to the next layer. These calculations are performed layer by layer until the final output is produced.
Loss function and training: Training a feedforward neural network involves defining a loss function that measures the difference between the model's predicted output and the true label. Common loss functions include Mean Squared Error and Cross-Entropy. Backpropagation computes the gradient of the loss with respect to each parameter, and an optimization algorithm (such as gradient descent) adjusts the parameters along the negative gradient to reduce the loss.
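The forward pass, loss, and parameter update described above can be sketched in a few lines of PyTorch. This is only an illustration under assumed values: the layer sizes, the toy data, the ReLU activation, and the learning rate are all made up for the example, and autograd is used for backpropagation.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical sizes: 4 input features, 8 hidden units, 1 output
W1 = torch.randn(4, 8, requires_grad=True)
b1 = torch.zeros(8, requires_grad=True)
W2 = torch.randn(8, 1, requires_grad=True)
b2 = torch.zeros(1, requires_grad=True)


def mlp_forward(x):
    hidden = torch.relu(x @ W1 + b1)  # weighted sum plus bias, then activation
    return hidden @ W2 + b2           # linear output layer (regression)


x = torch.randn(16, 4)       # toy batch of 16 samples
target = torch.randn(16, 1)  # toy regression targets

loss_before = torch.mean((mlp_forward(x) - target) ** 2)  # mean squared error
loss_before.backward()  # backpropagation fills .grad for every parameter

lr = 1e-3  # assumed learning rate
with torch.no_grad():
    for p in (W1, b1, W2, b2):
        p -= lr * p.grad  # one gradient descent step
        p.grad.zero_()

loss_after = torch.mean((mlp_forward(x) - target) ** 2)
print(loss_before.item(), loss_after.item())
```

With a small enough learning rate, the loss after the step should be slightly lower than before it.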
The advantages of feedforward neural networks include the ability to model complex nonlinear relationships, their suitability for a variety of problem types, and their ability to learn feature representations automatically through training. However, they also face challenges: they overfit easily, and fully connected layers scale poorly to large, high-dimensional inputs such as images or long sequences. To address these challenges, other network structures and training techniques have been proposed, such as Convolutional Neural Networks and Recurrent Neural Networks.
This series focuses on the experiments and does not cover the theory in detail.
(Ahem, I actually haven't had time to organize it. I'll come back and fill in the gaps when I get the chance.)
0. Import the necessary packages
import torch
from torch import nn
1. Linear model: the Linear class
a. Constructor __init__
def __init__(self, input_size, output_size):
    self.params = {}
    self.params['W'] = nn.Parameter(torch.randn(input_size, output_size, requires_grad=True))
    self.params['b'] = nn.Parameter(torch.randn(1, output_size, requires_grad=True))
    self.inputs = None
    self.grads = {}
Member variables:
- params: stores the model parameters, namely the weight matrix W and the bias vector b.
- inputs: holds the input data from the most recent forward pass.
- grads: holds the gradients of the parameters.
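As a quick check on what nn.Parameter provides, the sketch below wraps a random tensor the same way the constructor does. The sizes 4 and 2 are only example values. Since nn.Parameter sets requires_grad to True by default, passing requires_grad=True to torch.randn as well is harmless but not required.

```python
import torch
from torch import nn

W = nn.Parameter(torch.randn(4, 2))  # example sizes: input_size=4, output_size=2

print(isinstance(W, torch.Tensor))  # a Parameter is a Tensor subclass
print(W.requires_grad)              # gradients are tracked by default
print(W.shape)
```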
b. The __call__(self, x) method
The __call__(self, x) method lets instances of the class be called like functions. It calls the forward(x) method, passing the input x to forward propagation.
def __call__(self, x):
    return self.forward(x)
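The effect of __call__ is easiest to see on a tiny class unrelated to neural networks. The Doubler class below is a hypothetical example, not part of the experiment: calling the instance and calling forward directly give the same result.

```python
class Doubler:
    def forward(self, x):
        return 2 * x

    def __call__(self, x):
        # Calling the instance delegates to forward, as in the Linear class
        return self.forward(x)


d = Doubler()
print(d(3), d.forward(3))
```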
c. Forward propagation forward
def forward(self, inputs):
    self.inputs = inputs
    outputs = torch.matmul(self.inputs, self.params['W']) + self.params['b']
    return outputs
In forward propagation, the input data undergoes a linear transformation to produce the output:
- In the constructor, nn.Parameter wraps the randomly initialized weight matrix W and bias vector b as trainable parameters.
- In the forward method, the input data inputs is multiplied by the weight matrix W, and then the bias vector b is added to obtain the output value outputs.
- The forward method returns the computed output value.
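The shapes involved in the linear transformation can be checked directly. The sketch below uses example sizes (4 inputs, 2 outputs, a batch of 3) that are assumptions for illustration. It also shows that a 1-D input of length 4 produces an output of shape (1, 2), because the bias of shape (1, 2) is broadcast against the product.

```python
import torch

W = torch.randn(4, 2)  # (input_size, output_size)
b = torch.randn(1, 2)  # (1, output_size)

batch = torch.ones(3, 4)  # batch of 3 samples
single = torch.ones(4)    # one sample as a 1-D tensor

batch_out = torch.matmul(batch, W) + b    # b is broadcast over the batch
single_out = torch.matmul(single, W) + b  # the (2,) product is broadcast to (1, 2)

print(batch_out.shape, single_out.shape)
```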
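d. Backpropagation backward
def backward(self, grads=None):
    # With no upstream gradient, default to ones shaped like W
    if grads is None:
        grads = torch.ones(self.params['W'].shape)
    self.grads['W'] = torch.matmul(self.inputs.T, grads)  # gradient with respect to W
    self.grads['b'] = torch.sum(grads, dim=0)  # gradient with respect to b
    return torch.matmul(grads, self.params['W'].T)  # gradient with respect to the input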
The backward(self, grads=None) method performs backpropagation for the linear transformation:
- It accepts an optional parameter grads, the gradient of the loss with respect to the output.
- If grads is not provided, it defaults to a tensor of ones. Note that this default takes the shape of W, (input_size, output_size), rather than the shape of the output.
- The gradients of the parameters are computed from the output gradient and the stored input: the weight gradient is the matrix product of the transposed input and grads, and the bias gradient is the sum of grads over dimension 0. These formulas assume inputs has shape (N, input_size) and grads has shape (N, output_size).
- It returns the gradient of the input, which is the matrix product of grads and the transpose of W.
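The three gradient formulas can be checked against PyTorch's autograd. The sketch below is a variant for illustration, not the Linear class's own code: it assumes a 2-D batch input and an upstream gradient shaped like the output (rather than like W), and the sizes are example values. Under those assumptions the manual results should match torch.autograd.grad.

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 2, requires_grad=True)
b = torch.randn(1, 2, requires_grad=True)
x = torch.randn(3, 4, requires_grad=True)  # batch of 3 samples

y = torch.matmul(x, W) + b
upstream = torch.ones_like(y)  # gradient of the loss w.r.t. y (assumed all ones)

# Manual gradients of the linear transformation
grad_W = torch.matmul(x.T, upstream)             # shape (4, 2), same as W
grad_b = torch.sum(upstream, dim=0, keepdim=True)  # shape (1, 2), same as b
grad_x = torch.matmul(upstream, W.T)             # shape (3, 4), same as x

# Reference gradients computed by autograd
ref_x, ref_W, ref_b = torch.autograd.grad(y, (x, W, b), grad_outputs=upstream)

print(torch.allclose(grad_x, ref_x),
      torch.allclose(grad_W, ref_W),
      torch.allclose(grad_b, ref_b))
```

Each manual gradient has the same shape as the tensor it belongs to, which is a useful sanity check when writing backward by hand.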
2. Model training
net = Linear(4, 2)
x = torch.tensor([1,1,1,1], dtype=torch.float32)
y = net(x)
z = net.backward()
print(z)
- Create a Linear instance, net, with 4 inputs and 2 outputs;
- Pass the input tensor x to net for forward propagation;
- Call net.backward() to run backpropagation and get the gradient of the input x;
- Print the result.
tensor([[-0.8962, -0.9053, -1.5650, -0.3181],
[-0.8962, -0.9053, -1.5650, -0.3181],
[-0.8962, -0.9053, -1.5650, -0.3181],
[-0.8962, -0.9053, -1.5650, -0.3181]], grad_fn=<MmBackward>)
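The 4×4 shape and the identical rows follow from the default gradient in backward: with no argument, grads is a (4, 2) tensor of ones, so the returned product of grads and the transpose of W has shape (4, 4), and every row holds the row sums of W. The sketch below reproduces that structure with a freshly sampled W, so the numbers will differ from the output above.

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 2)     # stands in for net.params['W']
grads = torch.ones(4, 2)  # the default used by backward: ones shaped like W

z = torch.matmul(grads, W.T)  # shape (4, 4)
row_sums = W.sum(dim=1)       # one value per input feature

print(z.shape)
print(torch.allclose(z, row_sums.expand(4, 4)))  # every row equals the row sums of W
```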
3. Code integration
# Import the necessary packages
import torch
from torch import nn


class Linear:
    def __init__(self, input_size, output_size):
        self.params = {}
        self.params['W'] = nn.Parameter(torch.randn(input_size, output_size, requires_grad=True))
        self.params['b'] = nn.Parameter(torch.randn(1, output_size, requires_grad=True))
        self.inputs = None
        self.grads = {}

    def __call__(self, x):
        return self.forward(x)

    def forward(self, inputs):
        self.inputs = inputs
        outputs = torch.matmul(self.inputs, self.params['W']) + self.params['b']
        return outputs

    def backward(self, grads=None):
        # With no upstream gradient, default to ones shaped like W
        if grads is None:
            grads = torch.ones(self.params['W'].shape)
        self.grads['W'] = torch.matmul(self.inputs.T, grads)  # gradient with respect to W
        self.grads['b'] = torch.sum(grads, dim=0)  # gradient with respect to b
        return torch.matmul(grads, self.params['W'].T)  # gradient with respect to the input


net = Linear(4, 2)
x = torch.tensor([1,1,1,1], dtype=torch.float32)
y = net(x)
z = net.backward()
print(z)
Note:
This experiment implements only the forward propagation and backpropagation parts of the linear model; the training part is still missing. To find out what happens next, stay tuned for the next installment.