For your own reference only
- Variables in PyTorch: Tensor
A Tensor holds two key fields:
data: the weight (parameter) values
grad: the gradient of the loss with respect to the weight
A Tensor can be understood as a data type that records the operations performed on it in a computation graph, which is what makes backpropagation through that graph possible.
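A minimal sketch of these two fields (the weight values and the toy loss here are made up for illustration):

```python
import torch

# a weight that tracks gradients, and some fixed data
w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0)
y_true = torch.tensor(7.0)

# the forward pass builds a computation graph rooted at `loss`
loss = (w * x - y_true) ** 2

# backward() walks the graph and fills w.grad
# d(loss)/dw = 2 * (w*x - y_true) * x = 2 * (-1) * 3 = -6
loss.backward()

print(w.data)   # the weight value itself: tensor(2.)
print(w.grad)   # gradient of the loss w.r.t. w: tensor(-6.)
```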
One error-prone point: if you accumulate the loss with sum += l where l is a Tensor, then sum also becomes a Tensor, and every addition extends the computation graph. This is unnecessary, because backward() is never called on sum; the graphs are kept alive for nothing and lead to growing, needless memory consumption.
The correct way is to convert the Tensor l to a plain Python scalar first: sum += l.item()
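The accumulation pattern above can be sketched like this (the loop and the toy per-step loss are invented for illustration):

```python
import torch

total = 0.0
for step in range(3):
    w = torch.tensor(1.0, requires_grad=True)
    l = (w * 2 - 1) ** 2  # a toy per-step loss Tensor

    # l.item() extracts a plain Python float, so `total` stays a
    # scalar and no computation graph is kept alive across steps
    total += l.item()

print(type(total))  # <class 'float'>, not a Tensor
print(total)        # 3.0
```

Writing `total += l` instead would make `total` a Tensor whose graph grows by one node per iteration, which is exactly the memory leak the note warns about.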