PyTorch Framework Learning Path (Part 4: Computational Graphs and the Dynamic Graph Mechanism)

Computational Graph and Dynamic Graph Mechanism

Computational graph

1. What is a computational graph?

A computational graph is a directed acyclic graph used to describe computation. It has two kinds of elements: nodes, which represent data (vectors, matrices, tensors), and edges, which represent operations (addition, multiplication, convolution, and so on).

2. Computational graph and gradient derivation

Take y = (x + w) * (w + 1) as the example shown in the figure, with intermediate nodes a = x + w and b = w + 1. The following code verifies the gradient for this graph:

import torch

w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)

a = torch.add(w, x)     # retain_grad()
b = torch.add(w, 1)
y = torch.mul(a, b)

y.backward()
print(w.grad)

OUT:

tensor([5.])
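
Why 5? y depends on w along two paths, through a and through b, so by the chain rule dy/dw = dy/da * da/dw + dy/db * db/dw = b + a = (w + 1) + (w + x) = 2 + 3 = 5, and likewise dy/dx = b = 2. As a quick cross-check (a minimal sketch, not part of the original example), torch.autograd.grad returns the same values without touching .grad:

import torch

w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)

a = torch.add(w, x)    # a = w + x = 3
b = torch.add(w, 1)    # b = w + 1 = 2
y = torch.mul(a, b)    # y = a * b = 6

# dy/dw = dy/da * da/dw + dy/db * db/dw = b + a = 2 + 3 = 5
# dy/dx = dy/da * da/dx                 = b     = 2
grad_w, grad_x = torch.autograd.grad(y, (w, x))
print(grad_w)   # tensor([5.])
print(grad_x)   # tensor([2.])
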
3. Leaf nodes

What is a leaf node?

  • Answer: tensors created directly by the user are leaf nodes, such as x and w in the graph above.
  • is_leaf: indicates whether a tensor is a leaf node.

The following code checks which nodes of the computational graph above are leaf nodes:

import torch

w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)

a = torch.add(w, x)     # retain_grad()
# a.retain_grad()        # calling retain_grad() keeps the gradient of a non-leaf node
b = torch.add(w, 1)
y = torch.mul(a, b)

y.backward()
# print(w.grad)

# check which tensors are leaf nodes
print("w is_leaf? : {}\nx is_leaf? : {}\na is_leaf? : {}\nb is_leaf? : {}\ny is_leaf? : {}\n".format(w.is_leaf, x.is_leaf, a.is_leaf, b.is_leaf, y.is_leaf))

# check the gradients
print("\nw gradient = {}\nx gradient = {}\na gradient = {}\nb gradient = {}\ny gradient = {}\n".format(w.grad, x.grad, a.grad, b.grad, y.grad))

OUT:

w is_leaf? : True
x is_leaf? : True
a is_leaf? : False
b is_leaf? : False
y is_leaf? : False

w gradient = tensor([5.])
x gradient = tensor([2.])
a gradient = None
b gradient = None
y gradient = None

From the gradients printed above we can see what distinguishing leaf from non-leaf nodes buys us:

  • It saves memory. Why?
  • Only the gradients of the leaf nodes (w and x) are kept after backward(), which is why they can be printed; the gradients of the non-leaf nodes (a, b and y) are freed during the backward pass, so they print as None.

If you want to keep the gradient of a non-leaf node, call retain_grad() on it before backward():

a.retain_grad()        # calling retain_grad() keeps the gradient of this non-leaf node
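
For instance (a minimal sketch reusing the same graph as above, not part of the original code), with retain_grad() called on a its gradient is no longer cleared:

import torch

w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)

a = torch.add(w, x)
a.retain_grad()        # keep the gradient of this non-leaf node
b = torch.add(w, 1)
y = torch.mul(a, b)

y.backward()
print(a.grad)          # tensor([2.]) instead of None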

grad_fn: records the operation (function) that was used to create the tensor.


import torch

w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)

a = torch.add(w, x)     # retain_grad()
a.retain_grad()        # calling retain_grad() keeps the gradient of a non-leaf node
b = torch.add(w, 1)
y = torch.mul(a, b)

y.backward()
# print(w.grad)

# check which tensors are leaf nodes
# print("w is_leaf? : {}\nx is_leaf? : {}\na is_leaf? : {}\nb is_leaf? : {}\ny is_leaf? : {}\n".format(w.is_leaf, x.is_leaf, a.is_leaf, b.is_leaf, y.is_leaf))

# check the gradients
# print("\nw gradient = {}\nx gradient = {}\na gradient = {}\nb gradient = {}\ny gradient = {}\n".format(w.grad, x.grad, a.grad, b.grad, y.grad))

# check grad_fn
print("w grad_fn: {}\nx grad_fn: {}\na grad_fn: {}\nb grad_fn: {}\ny grad_fn: {}\n".format(w.grad_fn, x.grad_fn, a.grad_fn, b.grad_fn, y.grad_fn))

OUT:

w grad_fn: None
x grad_fn: None
a grad_fn: <AddBackward0 object at 0x00000188C0B7BC50>
b grad_fn: <AddBackward0 object at 0x00000188C0B7BC88>
y grad_fn: <MulBackward0 object at 0x00000188C0D215C0>

From the output above we can see that grad_fn stores the operation that produced each node. For example:

a grad_fn: <AddBackward0 object at 0x00000188C0B7BC50>

a was produced by an addition, and its backward function AddBackward0 lives at address 0x00000188C0B7BC50. The leaf nodes w and x were created by the user rather than by an operation, so their grad_fn is None.
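
These grad_fn objects are chained together through their next_functions attribute, which is how autograd walks the graph backwards during backward(). Continuing from the example above, a small sketch (the walk helper is just for illustration, not part of the original post):

# Walk the autograd graph backwards from y, printing each backward function.
def walk(fn, depth=0):
    if fn is None:
        return
    print("  " * depth + type(fn).__name__)
    for next_fn, _ in fn.next_functions:
        walk(next_fn, depth + 1)

walk(y.grad_fn)
# Prints MulBackward0 at the root, the two AddBackward0 nodes for a and b,
# and AccumulateGrad nodes for the leaf tensors w and x.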

Dynamic graphs vs static graphs


Static graph: it is like signing up for a guided tour. The whole itinerary is fixed before the trip starts, and only then do we travel. That is what "static" means here.

Dynamic graph: it is like booking the flights yourself. The plan is to visit Singapore first, then Taiwan, then Japan, but on arriving in Singapore someone recommends going to Japan first, so the route is changed and the next stop becomes Tokyo. The route can be adjusted at any point along the way; in the same way, a dynamic graph lets us adjust the model at any time.
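
PyTorch builds the graph on the fly as the forward pass runs, so ordinary Python control flow can change the graph's shape from one run to the next. A minimal sketch of this (the doubling loop is just an illustrative example, not from the original post):

import torch

x = torch.tensor([2.], requires_grad=True)

# The number of multiplications -- and therefore the shape of the graph --
# depends on a value computed at run time.
y = x
while y.norm() < 100:
    y = y * 2

y.backward()
print(y)        # tensor([128.], grad_fn=<MulBackward0>)
print(x.grad)   # tensor([64.])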


Source: blog.csdn.net/weixin_54546190/article/details/125075211