Learning process notes (2): PyTorch official documentation study notes


1. Tensors

- Tensors are data structures (similar to arrays and matrices) used to encode a model's inputs, outputs, and parameters

- Indexing: -1 refers to the last column (e.g. tensor[:, -1])

- Concatenate tensors with torch.cat; its dim argument selects the dimension to join along

 dim refers to the dimension, counted by bracket nesting from the outside: the outermost level (no square brackets in) is dim=0, one square bracket deeper is dim=1 (see the sketch after this list)

- matplotlib is a 2D plotting library for Python
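A small sketch of the indexing and concatenation points above (the shapes are chosen only for illustration):

import torch

t = torch.ones(4, 4)
t[:, -1] = 0                             # -1 indexes the last column
print(t[:, -1])                          # tensor([0., 0., 0., 0.])

a = torch.ones(2, 3)
b = torch.zeros(2, 3)
print(torch.cat([a, b], dim=0).shape)    # outermost dimension: torch.Size([4, 3])
print(torch.cat([a, b], dim=1).shape)    # one bracket level in: torch.Size([2, 6])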

2. Datasets and DataLoaders

A Dataset stores the samples (features) and their corresponding labels.

- Custom dataset: three functions must be implemented (see the sketch after this list)

__init__: runs once when the Dataset object is instantiated; it initializes the image directory, the annotations file, and the two transforms (transform, target_transform)

__len__: returns the number of samples in the dataset

__getitem__: given an index, loads the image, converts the image and its label to tensors, and returns the image tensor with its corresponding label
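A minimal sketch of such a dataset, assuming an annotations CSV of (file name, label) rows and an image folder; the file and directory names here are illustrative:

import os
import pandas as pd
from torch.utils.data import Dataset
from torchvision.io import read_image

class CustomImageDataset(Dataset):
    def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
        # Runs once: load the annotation table and remember the transforms
        self.img_labels = pd.read_csv(annotations_file)
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        # Number of samples in the dataset
        return len(self.img_labels)

    def __getitem__(self, idx):
        # Load one image and its label by index, applying the transforms
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        image = read_image(img_path)
        label = self.img_labels.iloc[idx, 1]
        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        return image, label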

3. Transforms

transform and target_transform

Features are converted to normalized tensors and labels to one-hot encoded tensors, using ToTensor and Lambda:

transform=ToTensor(),

target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
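For context, a sketch of how these two arguments might be passed to a torchvision dataset; the choice of FashionMNIST follows the Quickstart tutorial and is an assumption here:

import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

ds = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),          # PIL image -> FloatTensor scaled to [0., 1.]
    target_transform=Lambda(       # integer label -> one-hot tensor of length 10
        lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1)
    ),
)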

- ToTensor converts a PIL image or NumPy ndarray into a FloatTensor and scales the image's pixel intensity values into the range [0., 1.]

- Lambda transform: wraps a user-defined lambda function; here it converts the integer label into a one-hot encoded tensor.

The natural binary codes are: 000, 001, 010, 011, 100, 101.
The corresponding one-hot encodings are: 000001, 000010, 000100, 001000, 010000, 100000.

target_transform = Lambda(lambda y: torch.zeros(
    10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1))
This first creates a zero tensor of size 10 (the number of labels in our dataset) and then calls scatter_ to assign a value of 1 at the index given by the label y.

input.scatter_(dim, index, src): fills the data from src into input along dimension dim at the positions given by index; it can be understood as placing or modifying elements in place (a small example follows the list below)

  • dim: along which dimension to index
  • index: element index used for scatter
  • src: the source element used for scatter, which can be a scalar or a tensor
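A small sketch of scatter_ used exactly this way, turning an integer label into a one-hot vector (the label value 3 is just an example):

import torch

y = 3                                        # an integer class label
one_hot = torch.zeros(10, dtype=torch.float)
# Along dim=0, write the scalar value 1 at the index given by y
one_hot.scatter_(0, torch.tensor(y), value=1)
print(one_hot)                               # tensor([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])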

4. Build the model

- Define the neural network by subclassing nn.Module: initialize the network's layers in __init__, and implement the operations on the input data in the forward method of each nn.Module subclass (a sketch combining these pieces appears at the end of this section)

- The Flatten layer "flattens" the input, i.e. converts a multi-dimensional input into one dimension; it is often used in the transition from convolutional layers to fully connected layers. For example, (3, 32, 64) is three-dimensional data with 3*32*64 = 6144 elements in total; flattening pulls it out into a single line of length 6144.

- nn.Sequential is an ordered container in torch.nn: the classes implementing the network's specific functions are nested inside it, which completes the construction of the model; whatever is listed inside its parentheses is the concrete structure of the network we built

- nn.Linear defines a linear (fully connected) layer that performs the linear transformation mentioned above. Its parameters are (number of input features, number of output features, whether to use a bias, default True); weight and bias parameters of the corresponding shapes are generated automatically

- nn.ReLU belongs to the non-linear activation classes; by default it needs no arguments when constructed

- logits is the raw output vector, usually handed to softmax in the next step; softmax is the normalized exponential function

- The function of torch.rand, in layman's terms, is to generate uniformly distributed random data; the number of values passed inside torch.rand()'s parentheses determines how many dimensions the resulting tensor has

x = torch.rand(3,4): 2-dimensional tensor, three rows and four columns

It is relatively easy to understand three-dimensional tensors. A two-dimensional tensor can be regarded as a plane, while a three-dimensional tensor can be regarded as many two-dimensional tensor planes placed in parallel.

For example, our common RGB image can be understood as three two-dimensional grayscale images placed side by side.
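Putting the pieces of this section together, a sketch of the kind of network described above plus a few torch.rand shape checks; the 28×28 input size and the 10 output classes are assumptions carried over from the FashionMNIST Quickstart example:

import torch
from torch import nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()              # (N, 28, 28) -> (N, 784)
        self.linear_relu_stack = nn.Sequential(  # ordered container holding the concrete structure
            nn.Linear(28 * 28, 512),             # (in_features, out_features); weights/bias auto-created
            nn.ReLU(),                           # non-linear activation, no arguments needed
            nn.Linear(512, 10),                  # 10 raw scores (logits), one per class
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)       # logits, normally passed to softmax next
        return logits

model = NeuralNetwork()

x = torch.rand(3, 4)               # 2-D tensor: 3 rows, 4 columns
print(x.shape)                     # torch.Size([3, 4])

y = torch.rand(3, 32, 64)          # 3-D tensor: three 32x64 planes placed in parallel
print(y.numel())                   # 6144 elements, the flattened length

X = torch.rand(1, 28, 28)          # a random "image" batch of size 1
probs = nn.Softmax(dim=1)(model(X))
print(probs.shape)                 # torch.Size([1, 10])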

5. Autograd (automatic differentiation)

Gradient: the derivative of the loss function with respect to the parameters

Backpropagation algorithm: The parameters (model weights) are adjusted according to the gradient of the loss function with respect to the given parameters.

torch.autograd supports automatic computation of gradients for any computational graph.
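A minimal autograd sketch for a single linear layer; the tensor sizes and the binary-cross-entropy loss are illustrative:

import torch

x = torch.ones(5)                                  # input
y = torch.zeros(3)                                 # target
w = torch.randn(5, 3, requires_grad=True)          # parameters: gradients will be tracked
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b                         # forward pass builds the computational graph
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

loss.backward()                                    # autograd computes d(loss)/d(parameter)
print(w.grad.shape, b.grad.shape)                  # torch.Size([5, 3]) torch.Size([3])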

6. Optimizing model parameters

- During each iteration of training, the model makes a guess at the output, calculates the error (loss) between the guess and the actual label, collects the derivatives of the error with respect to its parameters, and optimizes these parameters using gradient descent

- The optimization process is controlled by adjusting hyperparameters; different hyperparameter values affect model training and convergence speed

- Loss: make a prediction from the input of a given data sample and compare it with the true label value

- SGD optimizer (stochastic gradient descent; see the construction sketch after this list)

- Training loop: three steps for optimization

·Call optimizer.zero_grad() to reset the gradient of the model parameters. The gradients are summed by default; to prevent double counting, we explicitly zero them out on each iteration.
· Backpropagate the prediction loss by calling loss.backward(). PyTorch stores the gradient of each parameter associated with the loss.
·Once we have the gradients, we call optimizer.step() to adjust the parameters through the gradients collected during backpropagation.
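The loss function and optimizer that the loop below expects might be constructed like this; CrossEntropyLoss, the placeholder model, and the learning rate of 1e-3 are assumptions in the spirit of the Quickstart tutorial:

import torch
from torch import nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # placeholder standing in for the network from section 4

learning_rate = 1e-3                                          # hyperparameter controlling the step size
loss_fn = nn.CrossEntropyLoss()                               # compares predictions with the true labels
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)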

Training loop optimization code:

def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        # Compute prediction and loss
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()  # clear the gradients stored in the optimizer
        loss.backward()        # compute gradients of the loss w.r.t. the parameters (autograd)
        optimizer.step()       # update the network parameters from the gradients

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

The gradient descent algorithm updates the model parameters. It requires the gradient of the loss function with respect to the model parameters, and this calculation is the backpropagation algorithm.

loss.backward() differentiates the loss function, producing the gradient of the loss with respect to each model parameter. This gradient represents each parameter's contribution (in size and direction) to the loss in the current state, i.e. the direction and magnitude of the parameter update.

The updated parameters will be used for the next forward pass calculation and backpropagation calculation.

7. Save & Load the model

PyTorch models store the learned parameters in an internal state dictionary called state_dict. These parameters can be saved with the torch.save method.
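A sketch of saving and restoring a state_dict; the file name and the small placeholder model are illustrative, and any nn.Module works the same way:

import torch
from torch import nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# Save only the learned parameters (the state_dict), not the whole model object
torch.save(model.state_dict(), "model_weights.pth")

# To load: recreate a model with the same structure, then load the saved parameters into it
restored = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
restored.load_state_dict(torch.load("model_weights.pth"))
restored.eval()  # put layers such as dropout/batch norm into evaluation mode before inference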


 
