Introduction to Common PyTorch Neural Network Functions (continuously updated...)

1. nn.Linear()

In deep learning, nn.Linear() is the PyTorch class used to define a linear (fully connected) layer. It builds the linear layers of a neural network model and applies a linear transformation y = xW^T + b to the input data. The usage of nn.Linear() is as follows:

nn.Linear(in_features, out_features, bias=True)

The parameters are explained as follows:

  • in_features: The size of the input feature, that is, the size of the last dimension of the input tensor. For example, if the shape of the input tensor is (batch_size, in_features), then in_features is the size of the input feature.
  • out_features: The size of the output feature, that is, the size of the last dimension of the linear layer output tensor. If the shape of the input tensor is (batch_size, in_features), then the shape of the output tensor will be (batch_size, out_features).
  • bias: Whether to use a bias term. Defaults to True, meaning the linear layer includes a bias term. If set to False, the linear layer has no bias term.

For example, suppose we have an input tensor x with shape (batch_size, input_size) and we want to pass it through a linear layer so that the output tensor has shape (batch_size, output_size). We can use the following code:

import torch
import torch.nn as nn

input_size = 10
output_size = 5
batch_size = 32

x = torch.randn(batch_size, input_size)

linear_layer = nn.Linear(input_size, output_size)
output = linear_layer(x)

print(output.shape)  # Output: torch.Size([32, 5])

In the above code, we create an nn.Linear object linear_layer, setting the input feature size to input_size and the output feature size to output_size. We then pass the input tensor x to linear_layer and obtain the output tensor output. Its shape is (batch_size, output_size), which matches the definition of a linear layer.

Note: When defining a linear layer using nn.Linear(), PyTorch automatically initializes the layer's weights and bias. These initializations can be inspected or adjusted through linear_layer.weight and linear_layer.bias.
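
For instance, here is a minimal sketch that inspects these parameters and re-initializes them with the standard torch.nn.init utilities (Xavier initialization is chosen purely as an illustration):

import torch.nn as nn

linear_layer = nn.Linear(10, 5)

# The weight matrix has shape (out_features, in_features) and the bias
# vector has shape (out_features,).
print(linear_layer.weight.shape)  # torch.Size([5, 10])
print(linear_layer.bias.shape)    # torch.Size([5])

# Re-initialize the parameters in place, e.g. Xavier-uniform weights and a
# zero bias (one common choice among many).
nn.init.xavier_uniform_(linear_layer.weight)
nn.init.zeros_(linear_layer.bias)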

2. nn.Sequential()

In PyTorch, nn.Sequential() is a container class used to organize and stack multiple neural network layers in sequence. The usage of nn.Sequential() is as follows:

nn.Sequential(*args)

The parameters are explained as follows:

  • *args: A variable number of arguments, each of which is a layer module (such as nn.Linear, nn.Conv2d, etc.). These modules form the network structure of nn.Sequential in the given order.

For example, suppose we want to build a simple neural network, including a linear layer, an activation function layer and an output layer. We can use nn.Sequential() to define this network structure:

import torch
import torch.nn as nn

input_size = 10
hidden_size = 20
output_size = 5
batch_size = 32

x = torch.randn(batch_size, input_size)

model = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, output_size)
)

output = model(x)

print(output.shape)  # Output: torch.Size([32, 5])

In the above code, we use nn.Sequential() to define a neural network model named model. The model consists of a linear layer (input feature size input_size, output feature size hidden_size), a ReLU activation layer, and a second linear layer (input feature size hidden_size, output feature size output_size). We pass the input tensor x to model and obtain the output tensor output.

With nn.Sequential(), we can easily define a neural network model containing multiple layers without writing an explicit forward() method: the output of each layer is passed directly as the input of the next. Note that nn.Sequential() does not infer dimensions for you, so the output size of each layer must match the input size of the layer that follows (as with the hidden_size shared by the two linear layers above).
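
nn.Sequential() also accepts an OrderedDict, which lets you give each layer a name. Here is a small sketch of the same structure as above with named submodules:

from collections import OrderedDict

import torch.nn as nn

model = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(10, 20)),
    ('relu', nn.ReLU()),
    ('fc2', nn.Linear(20, 5))
]))

# Named submodules can be accessed as attributes.
print(model.fc1)  # Linear(in_features=10, out_features=20, bias=True)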

3. nn.BatchNorm2d()

In deep learning, nn.BatchNorm2d() is the PyTorch class used to define a 2D batch normalization layer. It performs batch normalization in neural networks to speed up model training and improve the model's generalization ability. The usage of nn.BatchNorm2d() is as follows:

nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

The parameters are explained as follows:

  • num_features: The number of input features, i.e. the number of channels. For 2D feature maps, num_features is usually equal to the number of output channels of the preceding convolutional layer.
  • eps (optional): Small value for numerical stability. Default is 1e-05.
  • momentum (optional): Momentum used to calculate mean and variance. Default is 0.1.
  • affine (optional): Boolean value, controls whether to add learnable affine transformation parameters. If True, the scaling factor and bias term are learned and applied. Default is True.
  • track_running_stats (optional): Boolean value that controls whether running estimates of the mean and variance are tracked during training. Default is True. If True, these running statistics are used for normalization in evaluation mode (model.eval()); if False, the layer always normalizes with the statistics of the current mini-batch.

For example, suppose we have an input tensor x with shape (batch_size, num_channels, height, width) and we want to apply batch normalization after a convolutional layer. We can use the following code:

import torch
import torch.nn as nn

batch_size = 32
num_channels = 64
height = 28
width = 28

x = torch.randn(batch_size, num_channels, height, width)

bn_layer = nn.BatchNorm2d(num_channels)
output = bn_layer(x)

print(output.shape)  # Output: torch.Size([32, 64, 28, 28])

In the above code, we create an nn.BatchNorm2d object bn_layer with the number of input features set to num_channels. We then pass the input tensor x to bn_layer and obtain the output tensor output. The output tensor has the same shape as the input, (batch_size, num_channels, height, width): batch normalization does not change the shape of its input.

During training, the batch normalization layer automatically computes the mean and variance over each channel of the current mini-batch and uses these statistics for normalization. With track_running_stats=True, it also maintains running estimates of these statistics, which are used for normalization when the model is in evaluation mode.
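
The following minimal sketch illustrates this train/eval behavior; running_mean and running_var are standard attributes of nn.BatchNorm2d:

import torch
import torch.nn as nn

bn_layer = nn.BatchNorm2d(64)
x = torch.randn(32, 64, 28, 28)

# Training mode: normalize with the current batch statistics and update
# the running estimates.
bn_layer.train()
_ = bn_layer(x)
print(bn_layer.running_mean.shape)  # torch.Size([64])

# Evaluation mode: normalize with the running estimates instead of the
# statistics of the current batch.
bn_layer.eval()
_ = bn_layer(x)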

4. labml

LabML (Lab Meta Learning) is an open source experiment tracking and management tool designed to help researchers better organize and record deep learning experiments. LabML provides a simple yet powerful set of features for tracking all aspects of an experiment, including hyperparameter configuration, model structure, training process, and outcome metrics.

LabML provides Python libraries and related tools that can be used in conjunction with common deep learning frameworks (such as PyTorch, TensorFlow). By using LabML, you can easily record the configuration and parameters of the experiment, track the metrics and logs during the training process, and visualize the experimental results. LabML also supports features such as distributed training, experiment comparison, and automatic checkpointing.

LabML is designed to simplify the process of experimental management and reproduction, provide a consistent experimental recording and tracking mechanism, and help researchers better organize and share experiments. Its goal is to improve the efficiency and reproducibility of deep learning research.

You can find more details, examples, and documentation about LabML on its GitHub page (https://github.com/lab-ml/labml).

4.1 labml.experiment()

labml.experiment() is a function in the LabML library that creates an experiment object for recording and managing the configuration, parameters, metrics, and results of a deep learning experiment.

The labml.experiment() function is the entry point for creating a LabML experiment object. By calling it, you create an experiment object whose methods and properties you can use for experiment tracking and recording.

Here are some common usage examples:

import labml

# Create the experiment object
experiment = labml.experiment()

# Record hyperparameters
experiment.configs({
    'learning_rate': 0.001,
    'batch_size': 32,
    'num_epochs': 10
})

# Record the model structure
experiment.add_pytorch_graph(model)

# Training loop
with experiment.train():
    for epoch in range(num_epochs):
        for batch in data_loader:
            # Run the training step

            # Record training metrics
            experiment.log_metrics({'loss': loss.item()})

# Validation loop
with experiment.validation():
    for batch in validation_loader:
        # Run the validation step

        # Record validation metrics
        experiment.log_metrics({'accuracy': accuracy.item()})

# Test loop
with experiment.test():
    for batch in test_loader:
        # Run the test step

        # Record test metrics
        experiment.log_metrics({'accuracy': accuracy.item()})

# End the experiment
experiment.close()

By using labml.experiment() to create an experiment object, you can take advantage of the functions LabML provides to record and track every aspect of an experiment and obtain detailed logs and result information while it runs. You can customize the structure and recorded content of the experiment according to your actual needs.

Please note that the above sample code is for demonstration purposes only, and actual usage may vary depending on your specific needs and experimental environment. You can refer to LabML's official documentation and sample code for more detailed usage instructions and examples.

4.1.1 labml.experiment.create()

In LabML, labml.experiment.create() is a function used to create experiments. It allows you to define various parameters and configurations of your experiments. The parameter usage of labml.experiment.create() is as follows:

labml.experiment.create(name=None, comment=None, writers=None, check_git_status=True, include_modules=None)

The parameters are explained as follows:

  • name (optional): The name of the experiment, used to identify the experiment. If not specified, defaults to None.
  • comment (optional): A brief note or description of the experiment. If not specified, defaults to None.
  • writers (optional): Writers for recording experimental results. Can be a single writer object or a list of writer objects. If not specified, the default is None, which means the experimental results will not be recorded.
  • check_git_status (optional): Boolean value indicating whether to check Git status. If True, checks the status of the Git repository, including uncommitted changes and unpushed commits, when creating the experiment. If the Git repository is in a dirty state, an exception is thrown. If False, Git status is not checked. Default is True.
  • include_modules (optional): List of modules to include in the experiment. Can be a single module or a list of modules. The default is None, which means all modules are used.

As an example, here is how to use labml.experiment.create() to create an experiment:

import labml

# Create the experiment
labml.experiment.create(
    name='my_experiment',
    comment='This is my experiment',
    writers='tensorboard'
)

# Run the experiment
for i in range(10):
    # Run the experiment steps
    # ...

    # Record the experimental results
    labml.logger.log('loss', i)

In the code above, we created an experiment named my_experiment using labml.experiment.create(), and provided a brief comment describing the experiment. The writers='tensorboard' argument indicates that a TensorBoard writer is used to record the experimental results.

In the experiment's run loop, we can use the labml.logger.log() function to record results. In this example, we record i as the value of loss on each pass through the loop.

LabML provides various writers, such as TensorBoard, CSV, and MongoDB. You can choose the appropriate writer according to your needs and pass it to labml.experiment.create() via the writers parameter.
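
As a sketch only, assuming the writers parameter also accepts a list of writer names (the example above passes a single string; check the LabML documentation for the exact accepted values), recording to both TensorBoard and CSV might look like this:

import labml

# Assumption: a list of writer names is accepted here, extrapolating from
# the single-string usage shown above.
labml.experiment.create(
    name='my_experiment',
    writers=['tensorboard', 'csv']
)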

5. labml_nn

labml_nn is a Python library. It is part of the LabML project, a library of tools for building neural networks and conducting deep learning experiments. LabML is an open source project that aims to provide tools and frameworks for machine learning experimentation and development.

labml_nn provides a series of advanced neural network models and training tools designed to simplify the process of deep learning experiments. It is built on the PyTorch deep learning framework and provides some additional features and tools to make model definition, training, and evaluation more convenient and scalable.

The functions of labml_nn include but are not limited to:

  1. Model building: labml_nn provides an advanced model building module, allowing users to define complex neural network models using concise code.

  2. Training loop: labml_nn provides a flexible training loop tool that can easily manage the training process of the model, including batch data loading, forward propagation, loss calculation, back propagation and parameter update, etc.

  3. Training monitoring: labml_nn provides tools for real-time monitoring of the training process, including loss curves, metric tracking, and real-time visualization.

  4. Experiment records: labml_nn helps users record and manage the configuration, parameters, metrics, and results of deep learning experiments, making it easier to reproduce experiments and analyze results.

labml_nn is a lightweight and easy-to-use library suitable for a variety of deep learning tasks and projects. It has good scalability and flexibility and can be customized and expanded according to user needs.

You can get more details and usage examples about labml_nn and other related modules by visiting the LabML project's GitHub page (https://github.com/lab-ml/labml).

Source: https://blog.csdn.net/weixin_45488428/article/details/130954996