1. Non-linear Activations
Official documentation for non-linear activation functions: Non-linear Activations.
Anyone with a foundation in deep learning knows that the most commonly used non-linear activation functions are ReLU and Sigmoid, and that multi-class classification problems use the Softmax function in the output layer. In PyTorch these three functions are nn.ReLU, nn.Sigmoid, and nn.Softmax.
These functions used to require only that the batch_size be specified in the input; since PyTorch 1.0, tensors of any shape can be processed, and the output keeps the same shape as the input.
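A quick sanity check of that behavior (a minimal sketch, assuming PyTorch >= 1.0):
import torch
import torch.nn as nn

relu = nn.ReLU()
x = torch.randn(2, 3, 4)  # arbitrary shape, no batch_size declared anywhere
print(relu(x).shape)      # torch.Size([2, 3, 4]), same shape as the input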
nn.ReLU
It has only one parameter, inplace. If it is True, the result overwrites the input tensor directly instead of allocating a new output, for example:
input = torch.tensor([-1.0])
relu = nn.ReLU(inplace=True)
relu(input)
print(input)  # tensor([0.]), the input itself now holds the result
The code to build the ReLU layer is as follows:
import torch
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.relu1 = nn.ReLU()

    def forward(self, input):
        output = self.relu1(input)
        return output

network = Network()
input = torch.tensor([
    [1, -0.5],
    [-1, 3]
])
output = network(input)
print(output)
# tensor([[1., 0.],
#         [0., 3.]])
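nn.Softmax was mentioned above but not demonstrated, so here is a minimal sketch. Note the dim argument, which selects the axis over which the probabilities are computed:
import torch
import torch.nn as nn

softmax = nn.Softmax(dim=1)  # normalize over the class dimension
logits = torch.tensor([[1.0, 2.0, 3.0]])
probs = softmax(logits)
print(probs)        # tensor([[0.0900, 0.2447, 0.6652]])
print(probs.sum())  # tensor(1.), the probabilities sum to 1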
Then we use Sigmoid to process images from the CIFAR10 test set:
from torchvision import transforms, datasets
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.sigmoid1 = nn.Sigmoid()

    def forward(self, input):
        output = self.sigmoid1(input)
        return output

test_set = datasets.CIFAR10('dataset/CIFAR10', train=False, download=True,
                            transform=transforms.ToTensor())
data_loader = DataLoader(test_set, batch_size=64)
network = Network()
writer = SummaryWriter('logs')
for step, data in enumerate(data_loader):
    imgs, targets = data
    output = network(imgs)
    writer.add_images('input', imgs, step)
    writer.add_images('output', output, step)
writer.close()
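To view the images, launch TensorBoard with tensorboard --logdir=logs and open the Images dashboard.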
The processed images look noticeably washed out: ToTensor scales pixel values into [0, 1], and Sigmoid maps that range into roughly (0.5, 0.73), compressing the contrast.
The purpose of non-linear activations is to introduce non-linearity into the network: without them, any stack of linear layers collapses into a single linear map, so the model could only fit straight lines. With non-linear activations between the layers, the network can fit all kinds of curves (features).
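As a minimal sketch of why this matters (variable names here are illustrative): two Linear layers with no activation in between compute exactly one combined linear map.
import torch
import torch.nn as nn

x = torch.randn(4, 8)
l1 = nn.Linear(8, 16)
l2 = nn.Linear(16, 2)

# Stacking without an activation collapses to y = x @ (W2 @ W1).T + (W2 @ b1 + b2)
W = l2.weight @ l1.weight
b = l2.weight @ l1.bias + l2.bias
print(torch.allclose(l2(l1(x)), x @ W.T + b, atol=1e-6))  # True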
2. Linear Layers
Official documentation for linear layers: Linear Layers.
In PyTorch, nn.Linear is used to define a fully connected layer. Note that the input and output of a fully connected layer are typically two-dimensional tensors of shape [batch_size, size], so an image is generally flattened before being passed in. (Strictly speaking, nn.Linear accepts any number of leading dimensions; only the last dimension must equal in_features.)
nn.Linear has three parameters:
in_features: the size of each input sample, i.e. the size in [batch_size, size].
out_features: the size of each output sample, so the output has shape [batch_size, out_features]; it also equals the number of neurons in the fully connected layer. In terms of tensor shapes, the layer transforms a [batch_size, in_features] input tensor into a [batch_size, out_features] output tensor.
bias: if True (the default), a learnable bias is added, equivalent to b in y = ax + b.
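To make the shapes and the bias concrete, here is a minimal sketch that checks nn.Linear against the underlying y = x @ W.T + b computation (the layer sizes are arbitrary):
import torch
import torch.nn as nn

linear = nn.Linear(in_features=8, out_features=4)
x = torch.randn(3, 8)  # [batch_size, in_features]
y = linear(x)
print(y.shape)  # torch.Size([3, 4])

# The same computation done by hand with the layer's parameters.
manual = x @ linear.weight.T + linear.bias
print(torch.allclose(y, manual, atol=1e-6))  # True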
The code example is as follows:
import torch
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.linear1 = nn.Linear(24, 30)

    def forward(self, input):
        output = self.linear1(input)
        return output

input = torch.tensor([
    [1, 2, 3, 0, 1, 2, 3, 0],
    [0, 1, 2, 3, 0, 1, 2, 3],
    [3, 0, 1, 2, 3, 0, 1, 2],
], dtype=torch.float32)
print(input.shape)  # torch.Size([3, 8])

input = torch.flatten(input)  # flatten input into one dimension
print(input.shape)  # torch.Size([24])

network = Network()
output = network(input)
print(output.shape)  # torch.Size([30])
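One caveat about the example above: torch.flatten(input) also collapses the batch dimension. In a real network you usually keep dimension 0 and flatten only the rest, either with torch.flatten(input, start_dim=1) or with an nn.Flatten module (a minimal sketch with made-up shapes):
import torch
import torch.nn as nn

batch = torch.randn(64, 3, 32, 32)        # e.g. a batch of CIFAR10 images
flat = torch.flatten(batch, start_dim=1)  # keep dim 0, flatten the rest
print(flat.shape)  # torch.Size([64, 3072])

linear = nn.Linear(3 * 32 * 32, 10)  # 10 output classes
print(linear(flat).shape)  # torch.Size([64, 10])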