Detailed explanation of torch.optim.SGD parameters

torch.optim.SGD is PyTorch's implementation of the Stochastic Gradient Descent (SGD) optimizer. It updates a neural network's parameters to minimize the loss function during training, thereby improving the model's fit to the data. Its most important parameters are as follows (a sketch of the resulting update rule appears after the list):

- lr: the learning rate, which controls the step size of each parameter update. The default value is 0.001.
- momentum: the momentum factor, which accelerates SGD in the relevant direction and suppresses oscillations. A common value is 0.9. The default is 0, which gives the classic SGD algorithm.
- dampening: dampening for momentum; it scales down the contribution of the current gradient to the momentum buffer. The default value is 0.
- weight_decay: weight decay (L2 regularization), which penalizes large weights and helps prevent overfitting. The default value is 0.
- nesterov: whether to use Nesterov accelerated gradient (NAG) momentum. The default value is False.
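
To make these parameters concrete, below is a minimal sketch of the per-parameter update rule that torch.optim.SGD applies, following the description in the PyTorch documentation. It is a simplified illustration, not the library's actual implementation: the helper name sgd_update is made up for this example, and it assumes the momentum buffer buf has already been initialized.

import torch

def sgd_update(param, grad, buf, lr=0.01, momentum=0.9,
               dampening=0.0, weight_decay=0.0, nesterov=False):
    # Simplified single-parameter SGD update (illustrative sketch only).
    if weight_decay != 0:
        grad = grad + weight_decay * param             # add the L2 penalty to the gradient
    if momentum != 0:
        buf = momentum * buf + (1 - dampening) * grad  # update the momentum buffer
        if nesterov:
            grad = grad + momentum * buf               # look-ahead (Nesterov) step
        else:
            grad = buf
    param = param - lr * grad                          # take the descent step
    return param, buf

# Example: one update for a single tensor parameter
p, g = torch.randn(3), torch.randn(3)
p, buf = sgd_update(p, g, buf=torch.zeros(3))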
Here is an example showing how to use torch.optim.SGD:

import torch
import torch.nn as nn
import torch.optim as optim

# Define the model and the loss function
model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()

# Define the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Define some training data (100 random samples with 10 features each,
# and random placeholder targets)
inputs = torch.randn(100, 10)
targets = torch.randn(100, 1)

# One training step: forward pass, loss, backward pass, parameter update
optimizer.zero_grad()
outputs = model(inputs)
loss = loss_fn(outputs, targets)
loss.backward()
optimizer.step()
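
The parameters described earlier can also be combined in a single constructor call. The snippet below is a usage sketch that reuses the model defined above; the weight_decay value is chosen purely for illustration. Note that PyTorch only accepts nesterov=True when momentum is non-zero and dampening is zero.

# Typical configuration that uses all of the parameters discussed above.
# nesterov=True is only valid when momentum > 0 and dampening == 0.
optimizer = optim.SGD(
    model.parameters(),
    lr=0.01,            # step size of each update
    momentum=0.9,       # accelerate descent and damp oscillation
    dampening=0.0,      # must stay 0 when nesterov=True
    weight_decay=1e-4,  # L2 penalty; value chosen for illustration
    nesterov=True,      # use Nesterov accelerated gradient
)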
