Automatic differentiation with tf.GradientTape in TensorFlow 2.1

1. Introduction

        TensorFlow 2.1 runs in eager mode by default: each line of code executes immediately and sequentially, without first building a computation graph (and without control_dependencies). Since no graph tells the framework in advance which operations need gradients, recording everything would be expensive, so a context manager (tf.GradientTape) is used to record only the functions and variables that need gradient computation, which keeps the bookkeeping small.
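As a minimal illustration (using nothing beyond the standard API), operations run immediately in eager mode, and only the computation recorded inside a GradientTape context is available for differentiation:

import tensorflow as tf

# Eager mode: operations execute immediately and return concrete values.
a = tf.constant(2.0)
b = tf.constant(3.0)
print(a * b)  # tf.Tensor(6.0, shape=(), dtype=float32)

# Only work done inside a GradientTape context is recorded for gradients.
x = tf.Variable(1.0)
with tf.GradientTape() as tape:
  y = x * x + 2.0 * x
print(tape.gradient(y, x))  # dy/dx = 2*x + 2 = 4 -> tf.Tensor(4.0, ...)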

Function: GradientTape(persistent=False, watch_accessed_variables=True)

persistent: Boolean; specifies whether the newly created gradient tape is persistent. Defaults to False, which means the gradient() function can only be called once.
watch_accessed_variables: Boolean; indicates whether this gradient tape automatically tracks any trainable variables it accesses. Defaults to True. If False, you must manually specify which variables to track (see the sketch below).
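A small sketch of watch_accessed_variables=False, where only explicitly watched variables are tracked (tape.watch() is covered in the next section; nothing here goes beyond the standard API):

import tensorflow as tf

x0 = tf.Variable(2.0)
x1 = tf.Variable(3.0)
with tf.GradientTape(watch_accessed_variables=False) as tape:
  tape.watch(x0)          # only x0 is tracked
  y = x0 * x0 + x1 * x1
grads = tape.gradient(y, [x0, x1])
print(grads[0])  # tf.Tensor(4.0, ...) since dy/dx0 = 2*x0
print(grads[1])  # None: x1 was accessed but never watched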

2. Related functions

        The watch function: for tensors that are not tracked by default (for example those created by tf.constant), tape.watch() can be used to "watch" them.

        By default, GradientTape only tracks variables created by tf.Variable with trainable=True (the default), so watch is needed to track tensors created by tf.constant. You can also turn off automatic tracking of trainable variables and specify everything yourself by setting watch_accessed_variables=False.

        If a tensor is not watched, the gradient result for it will be None.
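To illustrate, a minimal sketch of both cases with a constant:

import tensorflow as tf

x = tf.constant(3.0)

# Without watch: the constant is not tracked, so the gradient is None.
with tf.GradientTape() as g:
  y = x * x
print(g.gradient(y, x))  # None

# With watch: the constant is tracked and the gradient is computed.
with tf.GradientTape() as g:
  g.watch(x)
  y = x * x
print(g.gradient(y, x))  # tf.Tensor(6.0, shape=(), dtype=float32)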

watch(tensor)
Purpose: ensures that the given tensor is being traced by this tape.

Parameters: tensor: a Tensor or a list of Tensors.

gradient(target, sources, output_gradients=None, unconnected_gradients=tf.UnconnectedGradients.NONE)
Purpose: computes the gradient of one or more tensors using the operations recorded in this tape's context.
Parameters:
        target: the Tensor or list of Tensors to be differentiated; think of it as the output value y of some function.
        sources: a Tensor/Variable or a list of them (a single value is also allowed); think of these as the input variables of the function.
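A short sketch of gradient() with a list of sources and the unconnected_gradients parameter (standard API, no extra assumptions):

import tensorflow as tf

x = tf.constant(2.0)
w = tf.constant(5.0)   # watched, but unused in the computation of y
with tf.GradientTape() as g:
  g.watch([x, w])
  y = x * x * x
dy_dx, dy_dw = g.gradient(y, [x, w],
                          unconnected_gradients=tf.UnconnectedGradients.ZERO)
print(dy_dx)  # tf.Tensor(12.0, ...) since dy/dx = 3*x^2
print(dy_dw)  # tf.Tensor(0.0, ...): w is unconnected, so ZERO yields 0 instead of None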

3. Differentiation steps

       1. First derivative

        code:

import tensorflow as tf
x = tf.constant(3.0)

with tf.GradientTape() as g:  # the tape that records operations for differentiation
  g.watch(x)
  y = x * x
dy_dx = g.gradient(y, x)  # compute the derivative
print(dy_dx)

        result:

tf.Tensor(6.0, shape=(), dtype=float32)
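Since trainable tf.Variables are watched automatically, the same computation works without watch when x is a Variable (a small variant sketch):

import tensorflow as tf

x = tf.Variable(3.0)     # trainable=True by default, so it is watched automatically
with tf.GradientTape() as g:
  y = x * x
print(g.gradient(y, x))  # tf.Tensor(6.0, shape=(), dtype=float32)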

        2. Second derivative

         code:

import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as g:
  g.watch(x)
  with tf.GradientTape() as gg:
    gg.watch(x)
    y = x * x
  dy_dx = gg.gradient(y, x)      # y' = 2*x = 2*3 = 6; computed inside g's context so g records it
d2y_dx2 = g.gradient(dy_dx, x)   # y'' = 2

print(dy_dx)
print(d2y_dx2)

        result:

tf.Tensor(6.0, shape=(), dtype=float32)
tf.Tensor(2.0, shape=(), dtype=float32)
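The same nested pattern works for any differentiable function; for example, a sketch with y = x^3:

import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as g:
  g.watch(x)
  with tf.GradientTape() as gg:
    gg.watch(x)
    y = x * x * x
  dy_dx = gg.gradient(y, x)     # y' = 3*x^2 = 27
d2y_dx2 = g.gradient(dy_dx, x)  # y'' = 6*x = 18

print(dy_dx)     # tf.Tensor(27.0, shape=(), dtype=float32)
print(d2y_dx2)   # tf.Tensor(18.0, shape=(), dtype=float32)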

        3. Composite differentiation (multiple gradient calls)

        code:

import tensorflow as tf

x = tf.constant(3.0)
# To call gradient() more than once, the tape must be created with persistent=True
with tf.GradientTape(persistent=True) as g:
  g.watch(x)
  y = x * x
  z = y * y
dz_dx = g.gradient(z, x)  # z = y^2 = x^4, z' = 4*x^3 = 4*3^3 = 108
dy_dx = g.gradient(y, x)  # y' = 2*x = 2*3 = 6
del g  # release the persistent tape

print(dz_dx)
print(dy_dx)

         result:

tf.Tensor(108.0, shape=(), dtype=float32)
tf.Tensor(6.0, shape=(), dtype=float32)
 Note: by default, a GradientTape's resources are released as soon as the gradient() function is called, so calling it again fails. To compute gradients multiple times from the same tape, create it with persistent=True.
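A quick sketch of what happens without persistent=True (the second call raises a RuntimeError):

import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as g:   # persistent=False by default
  g.watch(x)
  y = x * x
  z = y * y

dz_dx = g.gradient(z, x)       # first call succeeds; resources are then released
try:
  dy_dx = g.gradient(y, x)     # second call on a non-persistent tape
except RuntimeError as err:
  print("RuntimeError:", err)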

reference:

https://www.cnblogs.com/SupremeBoy/p/12246528.html
