In PyTorch, autograd is at the core of all neural networks: it provides automatic differentiation for all operations on Tensors.
It is a define-by-run framework, which means that backpropagation is defined by how your code runs.
1. Variable
autograd.Variable is the central class of autograd. It wraps a Tensor and supports almost all operations defined on Tensors. Once your computation is finished, you can call .backward() to compute all the gradients automatically.
Variable has three important attributes: data, grad, and grad_fn.
Access the wrapped tensor through .data; the gradient accumulated for this Variable is stored in .grad; .grad_fn references the Function that created the Variable, and is None for Variables created directly by the user.
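A minimal sketch of inspecting these three attributes (using the same Variable API as the rest of this article; v is just an illustrative name):
import torch
from torch.autograd import Variable
v = Variable(torch.ones(2), requires_grad=True)
print(v.data)     # the wrapped tensor
print(v.grad)     # None until backward() has been called
print(v.grad_fn)  # None, because v was created directly by the user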
There is another class that is essential to autograd's implementation: Function. Variable and Function are interconnected and together build an acyclic graph that encodes the complete history of the computation. Each Variable has a .grad_fn attribute that references the Function that created it (except for Variables created by the user, whose grad_fn is None).
import torch
from torch.autograd import Variable
Create the variable x:
x = Variable(torch.ones(2, 2), requires_grad=True)
print(x)
Output result:
Variable containing:
1 1
1 1
[torch.FloatTensor of size 2x2]
Do an operation based on x:
y = x + 2
print(y)
Output result:
Variable containing:
3 3
3 3
[torch.FloatTensor of size 2x2]
Check the grad_fn of x:
print(x.grad_fn)
Output result:
None
Check the grad_fn of y:
print(y.grad_fn)
Output result:
<torch.autograd.function.AddConstantBackward object at 0x7f603f6ab318>
As you can see, y was produced as the result of an operation, so it has a grad_fn, whereas x was created directly by the user, so its grad_fn is None.
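These grad_fn objects chain together to form the acyclic graph described above. A minimal sketch of walking one step of that chain (w is just an illustrative name; next_functions is an internal attribute of autograd's backward nodes, and the exact class names printed vary across PyTorch versions):
w = y * y
print(w.grad_fn)                 # the backward node created by the multiplication
print(w.grad_fn.next_functions)  # references to the nodes for its inputs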
Do more operations on y:
z = y * y * 3
out = z.mean()
print(z, out)
Output result:
Variable containing:
27 27
27 27
[torch.FloatTensor of size 2x2]
Variable containing:
27
[torch.FloatTensor of size 1]
2. Gradients
If the Variable is a scalar (i.e., it holds a single-element value), you do not need to pass any arguments to backward().
out.backward()
This is equivalent to out.backward(torch.Tensor([1.0])).
out.backward()
print(x.grad)
Output result:
Variable containing:
4.5000 4.5000
4.5000 4.5000
[torch.FloatTensor of size 2x2]
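To verify this by hand: out = (1/4) Σᵢ zᵢ with zᵢ = 3(xᵢ + 2)², so ∂out/∂xᵢ = (6/4)(xᵢ + 2) = 1.5 · (xᵢ + 2), which equals 4.5 at xᵢ = 1, matching the output above.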
If y has more than one element (a vector), you need to pass a gradient argument to backward() with the same shape as y. Conceptually, y.backward(v) computes the vector-Jacobian product vᵀ·J, so x.grad holds Σⱼ vⱼ · ∂yⱼ/∂xᵢ for each xᵢ: the derivative of y projected onto the direction v.
x = torch.randn(3)
x = Variable(x, requires_grad=True)
y = x * 2
while y.data.norm() < 1000:
y = y * 2
print(y)
Output result:
Variable containing:
-1296.5227
499.0783
778.8971
[torch.FloatTensor of size 3]
Without an argument:
y.backward()
print(x.grad)
Output result:
RuntimeError: grad can be implicitly created only for scalar outputs
The call fails because y is not a scalar, so print(x.grad) is never reached.
Passing a gradient argument:
gradients = torch.FloatTensor([0.1, 1.0, 0.0001])
y.backward(gradients)
print(x.grad)
Output result:
Variable containing:
102.4000
1024.0000
0.1024
[torch.FloatTensor of size 3]
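The scaling comes from the doubling loop: y = 2ⁿ · x for some n, so ∂yᵢ/∂xᵢ = 2ⁿ and x.grad = 2ⁿ · gradients. Here the output shows 2ⁿ = 1024.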
A quick test of the effect of different gradient arguments:
Parameter 1: [1,1,1]
x=torch.FloatTensor([1,2,3])
x = Variable(x, requires_grad=True)
y = x * x
print(y)
gradients = torch.FloatTensor([1,1,1])
y.backward(gradients)
print(x.grad)
Output result:
Variable containing:
1
4
9
[torch.FloatTensor of size 3]
Variable containing:
2
4
6
[torch.FloatTensor of size 3]
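Since y = x², the Jacobian is diagonal with entries 2xᵢ = [2, 4, 6]; weighting them by the gradient argument [1, 1, 1] leaves them unchanged, so x.grad = [2, 4, 6].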
Parameter 2: [3,2,1]
x=torch.FloatTensor([1,2,3])
x = Variable(x, requires_grad=True)
y = x * x
print(y)
gradients = torch.FloatTensor([3,2,1])
y.backward(gradients)
print(x.grad)
Output result:
Variable containing:
1
4
9
[torch.FloatTensor of size 3]
Variable containing:
6
8
6
[torch.FloatTensor of size 3]
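With the same diagonal Jacobian entries [2, 4, 6], the gradient argument [3, 2, 1] scales each component: x.grad = [3·2, 2·4, 1·6] = [6, 8, 6].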