[Deep Learning Experiment] Feedforward Neural Network (1): Basic steps to build a neural network using PyTorch

Table of contents

1. Experiment introduction

2. Experimental environment

1. Configure the virtual environment

2. Library version introduction

3. Experimental content

0. Import library

1. Define x, w, b

2. Calculate the net activation value z

3. Instantiate the linear layer and perform forward propagation

4. Print the results

5. Code integration


1. Experiment introduction

        This experiment uses the PyTorch library to build and run a simple neural network model, focusing on the use of the linear layer.

2. Experimental environment

        This series of experiments uses the PyTorch deep learning framework; the relevant setup steps are as follows:

1. Configure the virtual environment

conda create -n DL python=3.7 
conda activate DL
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
conda install matplotlib
conda install scikit-learn
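
After installation, you can quickly confirm that the versions match. This is a minimal check, to be run inside the activated DL environment; the expected values assume the install commands above succeeded:

import torch
import torchvision
print(torch.__version__)          # expected: 1.8.1+cu102
print(torchvision.__version__)    # expected: 0.9.1+cu102
print(torch.cuda.is_available())  # True only if a CUDA 10.2 capable GPU is visible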

2. Library version introduction

Package        Version used in this experiment    Latest version at the time of writing
matplotlib     3.5.3                              3.8.0
numpy          1.21.6                             1.26.0
python         3.7.16
scikit-learn   0.22.1                             1.3.0
torch          1.8.1+cu102                        2.0.1
torchaudio     0.8.1                              2.0.2
torchvision    0.9.1+cu102                        0.15.2

3. Experimental content

(The following overview of feedforward neural networks is from ChatGPT.)

        A feedforward neural network (FNN) is a common artificial neural network model, also known as a multilayer perceptron (MLP). It is based on forward propagation and is mainly used for classification and regression problems.

        A feedforward neural network consists of multiple layers: an input layer, one or more hidden layers, and an output layer. The name "feedforward" reflects the fact that signals flow only forward through the network, from the input layer through the hidden layers to the output layer, with no feedback connections.

Here's how feedforward neural networks generally work:

  1. Input layer: receives raw data or feature vectors as the network's input, with each input represented by a neuron. Each neuron weights its inputs and transforms the result through an activation function to produce an output signal.

  2. Hidden layers: a feedforward neural network can contain one or more hidden layers, each consisting of multiple neurons. Each hidden neuron receives the outputs of the previous layer, computes a weighted sum, applies an activation function, and passes the result to the next layer.

  3. Output layer: the output of the last hidden layer is passed to the output layer, which usually consists of one or more neurons. Depending on the problem to be solved (classification or regression), the output neurons use an appropriate activation function (such as Sigmoid or Softmax) to produce the final result.

  4. Forward propagation: the process of transmitting signals from the input layer through the hidden layers to the output layer is called forward propagation. During forward propagation, each neuron multiplies the outputs of the previous layer by the corresponding weights and passes the result on. These computations proceed layer by layer until the final output is produced.

  5. Loss function and training: training a feedforward neural network usually involves defining a loss function that measures the difference between the model's predicted output and the true label. Common loss functions include mean squared error and cross-entropy. Using backpropagation and an optimization algorithm (such as gradient descent), the network adjusts its parameters along the gradient of the loss function to minimize its value. A minimal sketch of this whole loop is shown right after this list.
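
The following is a minimal sketch of the loop described above, not part of this experiment's code: a one-hidden-layer MLP, a forward pass, a loss, and a single gradient-descent step. All sizes (4 input features, 8 hidden units, 3 classes, a batch of 16) and the random labels are arbitrary choices for illustration.

import torch
from torch import nn

# A one-hidden-layer MLP: input -> hidden (with ReLU) -> output
model = nn.Sequential(
    nn.Linear(4, 8),   # input layer to hidden layer
    nn.ReLU(),         # activation function in the hidden layer
    nn.Linear(8, 3),   # hidden layer to output layer
)
criterion = nn.CrossEntropyLoss()                        # loss for classification
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # gradient descent

x = torch.randn(16, 4)          # a batch of 16 samples with 4 features
y = torch.randint(0, 3, (16,))  # random class labels, for illustration only

logits = model(x)            # forward propagation
loss = criterion(logits, y)  # measure the prediction error
optimizer.zero_grad()
loss.backward()              # backpropagation computes the gradients
optimizer.step()             # adjust the parameters to reduce the loss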

        The advantages of feedforward neural networks include the ability to model complex nonlinear relationships, suitability for many problem types, and the ability to learn feature representations automatically through training. They also have drawbacks, such as a tendency to overfit and difficulty handling large-scale, high-dimensional data. To address these challenges, improved architectures and training techniques have been proposed, such as convolutional neural networks and recurrent neural networks.

This series covers the experiments and does not explain the theory in detail.

(Ahem, I don't actually have time to sort the theory out; I'll come back and fill in the gaps when I get the chance.)

0. Import library

Import the relevant modules from the PyTorch library, along with external libraries for plotting and for loading datasets.

import torch
from torch import nn
import torch.nn.functional as F
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

The iris dataset loader is imported above, but this experiment's code never actually loads the data. A minimal sketch of how it could be loaded is shown below.
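
The following sketch (not part of the original experiment) shows one way the imported load_iris could be used to obtain tensors:

import torch
from sklearn.datasets import load_iris

iris = load_iris()
X = torch.tensor(iris.data, dtype=torch.float32)  # shape (150, 4): 150 samples, 4 features
y = torch.tensor(iris.target, dtype=torch.long)   # shape (150,): class labels 0, 1, 2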

1. Define x, w, b

        Define the input tensor x, the weight tensor w, and the bias tensor b of the neural network model:

x = torch.randn((2, 5))  # two samples, each with 5 features
w = torch.randn((5, 1))  # weight vector with 5 parameters
b = torch.randn((1, 1))  # bias term: a 2-D tensor holding a single value

2. Calculate the net activation value z

z = torch.matmul(x, w) + b
z_2 = x @ w + b

        The net activation value z is computed via matrix multiplication, where x is the input features, w the weights, and b the bias term; adding b relies on broadcasting. The two lines are equivalent: matrix multiplication can be written either with the `torch.matmul()` function or with the `@` operator.
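
A quick sanity check (a sketch, reusing the tensors defined above) confirms the two results are identical and shows the expected shape:

print(torch.allclose(z, z_2))  # True: matmul and @ compute the same thing
print(z.shape)                 # torch.Size([2, 1]): one net activation per sample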

3. Instantiate the linear layer and perform forward propagation

net = nn.Linear(5, 1)
z_3 = net(x)

        `nn.Linear()` instantiates a linear layer with an input dimension of 5 and an output dimension of 1. The input tensor x is then passed through the layer, which performs the forward computation and returns the output tensor z_3. Note that the layer has its own randomly initialized weight and bias, so z_3 will generally differ from z.
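
As a sketch of what the layer does internally: nn.Linear(5, 1) stores a weight of shape (1, 5) and a bias of shape (1,), and computes x @ weight.T + bias, which can be verified against z_3:

z_manual = x @ net.weight.T + net.bias
print(torch.allclose(z_manual, z_3))  # True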

4. Print the results

print('output z:', z)
print('shape of z:', z.shape)
print('output z_2:', z_2)
print('shape of z_2:', z_2.shape)
print('output z_3:', z_3)
print('shape of z_3:', z_3.shape)

        Print the computed results and the shape information of each tensor (for easy viewing and debugging).

5. Code integration

# Import the necessary packages
import torch
from torch import nn

# x represents two samples with 5 features each; x is a 2-D tensor
x = torch.randn((2, 5))
# w is a weight vector with 5 parameters; w is a 2-D tensor
w = torch.randn((5, 1))
# The bias term; b is a 2-D tensor but holds only a single value
b = torch.randn((1, 1))
# Matrix multiplication; note the order of x and w. Adding b uses broadcasting.
z = torch.matmul(x, w) + b
# An equivalent way to write it
z_2 = x @ w + b
# Print the results; z is a 2-D tensor holding the net activation value of each
# of the two samples after passing through the neuron
print('output z:', z)
print('shape of z:', z.shape)
print('output z_2:', z_2)
print('shape of z_2:', z_2.shape)

# Instantiate a linear layer with input dimension 5 and output dimension 1
net = nn.Linear(5, 1)
z_3 = net(x)
# Print the results; z_3 has the same shape and meaning as z
print('output z_3:', z_3)
print('shape of z_3:', z_3.shape)
