TensorFlow automatic gradient computation

Automatic gradient

In deep learning we often need to compute the gradient of a function. In TensorFlow 2.0, tf.GradientTape can be used to compute gradients automatically.

1. Simple example

A simple example: for the function y = 2xᵀx, find the gradient with respect to the column vector x.

import tensorflow as tf

x = tf.reshape(tf.Variable(range(4), dtype=tf.float32), (4, 1))
x

Output result:

<tf.Tensor: id=10, shape=(4, 1), dtype=float32, numpy=
array([[0.],
       [1.],
       [2.],
       [3.]], dtype=float32)>

The gradient of y = 2xᵀx with respect to x is 4x. To compute this gradient in TensorFlow:

with tf.GradientTape() as t:
	t.watch(x)
	y = 2 * tf.matmul(tf.transpose(x), x)
dy_dx = t.gradient(y, x)
dy_dx

Output result:

<tf.Tensor: id=30, shape=(4, 1), dtype=float32, numpy=
array([[ 0.],
       [ 4.],
       [ 8.],
       [12.]], dtype=float32)>
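As a quick check, the result can be compared against the analytic gradient 4x (a small sketch, reusing the same x as above):

```python
import tensorflow as tf

# Same column vector as above: [0, 1, 2, 3] as a (4, 1) tensor
x = tf.reshape(tf.Variable(range(4), dtype=tf.float32), (4, 1))

with tf.GradientTape() as t:
    t.watch(x)  # track the tensor x on the tape
    y = 2 * tf.matmul(tf.transpose(x), x)

dy_dx = t.gradient(y, x)
# Analytically, d(2 xᵀx)/dx = 4x
print(bool(tf.reduce_all(dy_dx == 4 * x)))  # True
```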

2. Training mode and prediction mode

By default a tape is released after a single gradient call; passing persistent=True allows it to be queried more than once:

with tf.GradientTape(persistent=True) as g:
	g.watch(x)
	y = x * x
	z = y * y
	dz_dx = g.gradient(z, x)
	dy_dx = g.gradient(y, x)
dz_dx, dy_dx

Output result:

WARNING:tensorflow:Calling GradientTape.gradient on a persistent tape inside its context is significantly less efficient than calling it outside the context (it causes the gradient ops to be recorded on the tape, leading to increased CPU and memory usage). Only call GradientTape.gradient inside the context if you actually want to trace the gradient in order to compute higher order derivatives.
WARNING:tensorflow:Calling GradientTape.gradient on a persistent tape inside its context is significantly less efficient than calling it outside the context (it causes the gradient ops to be recorded on the tape, leading to increased CPU and memory usage). Only call GradientTape.gradient inside the context if you actually want to trace the gradient in order to compute higher order derivatives.

(<tf.Tensor: id=41, shape=(4, 1), dtype=float32, numpy=
 array([[  0.],
        [  4.],
        [ 32.],
        [108.]], dtype=float32)>,
 <tf.Tensor: id=47, shape=(4, 1), dtype=float32, numpy=
 array([[0.],
        [2.],
        [4.],
        [6.]], dtype=float32)>)
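As the warnings note, calling gradient inside the with block on a persistent tape records the gradient ops themselves, which wastes CPU and memory. Moving the gradient calls outside the context avoids this; a sketch:

```python
import tensorflow as tf

x = tf.reshape(tf.Variable(range(4), dtype=tf.float32), (4, 1))

with tf.GradientTape(persistent=True) as g:
    g.watch(x)
    y = x * x
    z = y * y

# Calling gradient outside the `with` block avoids the warnings;
# persistent=True still allows multiple calls.
dz_dx = g.gradient(z, x)  # d(x^4)/dx = 4x^3
dy_dx = g.gradient(y, x)  # d(x^2)/dx = 2x
del g  # release the tape's resources once done
```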

3. Gradients of Python control flow

Even if the function's computation graph contains Python control flow (such as conditionals and loops), the gradient with respect to a variable can still be computed.
I did not fully understand this part, so the code is omitted here; interested readers can view it via the original link below.
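To illustrate the idea (this is not the original post's code, just a minimal sketch): in the function below, both the number of loop iterations and the branch taken depend on the input, yet GradientTape can still differentiate through it, because the tape records the ops that actually ran.

```python
import tensorflow as tf

def f(a):
    # The loop count and the branch taken both depend on the value of a
    b = a * 2
    while tf.norm(b) < 1000:
        b = b * 2
    if tf.reduce_sum(b) > 0:
        c = b
    else:
        c = 100 * b
    return c

a = tf.constant(2.0)  # a fixed scalar input, for reproducibility
with tf.GradientTape() as t:
    t.watch(a)
    c = f(a)
d = t.gradient(c, a)

# For any fixed input, f(a) is linear in a (c = k * a for some constant k),
# so the gradient equals c / a
print(float(d) == float(c / a))  # True
```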

Origin blog.csdn.net/qq_45465526/article/details/108478557