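In the TensorFlow 1.x era, the framework used static computation graphs: you first built a graph out of TensorFlow operators, then opened a Session and explicitly executed the graph.
TensorFlow 2.x uses dynamic computation graphs instead: each operator, as soon as it is used, is dynamically added to an implicit default graph and executed immediately to produce its result, with no Session required.
The benefit of dynamic graphs, i.e. Eager Execution, is that programs are easy to debug. TensorFlow code behaves like native Python code and reads much like NumPy; logging, printing, and ordinary control flow all work.
The drawback of dynamic graphs is somewhat lower runtime efficiency, because eager mode involves many round trips between the Python process and TensorFlow's C++ kernels. A static graph, once built, runs almost entirely as C++ code inside the TensorFlow runtime, which is faster. Static graphs also allow some optimization of the computation, such as pruning steps that do not contribute to the result.
To use a static graph in TensorFlow 2.x, decorate an ordinary Python function with @tf.function, which converts it into the corresponding TensorFlow graph-building code. Calling the decorated function is then equivalent to executing the code with a Session in TensorFlow 1.x. Building static graphs through tf.function in this way is called AutoGraph.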
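
To make the eager versus graph distinction concrete, here is a minimal sketch. The function poly and the list trace_log are hypothetical names introduced only for this illustration. The idea is that a Python side effect inside the function body runs on every eager call, but only during tracing once the function is wrapped in tf.function.

```python
import tensorflow as tf

trace_log = []  # hypothetical helper: records each time the Python body runs


def poly(x):
    # This Python side effect runs on every eager call, but only while
    # tracing when the function is wrapped in tf.function.
    trace_log.append("python body executed")
    return x * x - 2.0 * x + 1.0


poly_graph = tf.function(poly)  # same effect as decorating with @tf.function

x = tf.constant(3.0)
eager_result = poly(x)        # eager: the Python body runs right now
graph_first = poly_graph(x)   # first call: traced into a graph, then executed
graph_second = poly_graph(x)  # same input signature: the traced graph is reused

print(float(eager_result), float(graph_first), float(graph_second))
print(len(trace_log))
```

All three calls return the same value, but the Python body runs only twice: once for the eager call and once while tracing. This is also why ordinary print statements inside a tf.function seem to "disappear" after the first call, and why tf.print is used in the examples in this post.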

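The rest of this post shows how TensorFlow's automatic differentiation works with dynamic graphs and with AutoGraph.
Task: find the derivative of f(x) = a*x**2 + b*x + c.
First define the tensors: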

import tensorflow as tf
import numpy as np


x = tf.Variable(0.0, name='x', dtype=tf.float32)
a = tf.constant(1.0)
b = tf.constant(-2.0)
c = tf.constant(1.0)

Differentiate with tf.GradientTape():

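# Create a GradientTape that records the operations executed inside the
# with block, so the first derivative can be computed afterwards.
with tf.GradientTape() as tape:
    # Express the function:
    y = a*x**2 + b*x + c
# tape.gradient() returns the numeric value of dy/dx at the current
# value of x (0.0 here), not a symbolic expression.
dy_dx = tape.gradient(y, x)
print(dy_dx)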

tf.Tensor(-2.0, shape=(), dtype=float32)
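
As a sanity check on the value above, the same derivative can be computed without TensorFlow. This is only a sketch: the helper names and the step size h are my own choices, not part of any library. It compares the hand-derived formula f'(x) = 2*a*x + b with a central finite difference.

```python
def f(x, a=1.0, b=-2.0, c=1.0):
    """The quadratic from the text: f(x) = a*x**2 + b*x + c."""
    return a * x**2 + b * x + c


def analytic_derivative(x, a=1.0, b=-2.0):
    """Hand-derived f'(x) = 2*a*x + b."""
    return 2.0 * a * x + b


def central_difference(func, x, h=1e-5):
    """Numerical estimate of func'(x); h is an arbitrary small step."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


x0 = 0.0
exact = analytic_derivative(x0)
approx = central_difference(f, x0)
print(exact, approx)
```

Both numbers agree with the -2.0 reported by the tape, up to floating-point error in the finite difference.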

Differentiating with respect to constant tensors requires tape.watch:

with tf.GradientTape() as tape:
    tape.watch([a, b, c])
    y = a*x**2 + b*x + c

# The gradients come back in the same order as the list of sources:
dy_dx, dy_da, dy_db, dy_dc = tape.gradient(y, [x, a, b, c])
print(dy_da)
print(dy_dc)

tf.Tensor(0.0, shape=(), dtype=float32)
tf.Tensor(1.0, shape=(), dtype=float32)

Because dy/da = x**2 and x = 0, dy_da = 0. Likewise dy/dc = 1 regardless of x.
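
To see why watch is needed, the following sketch (variable names are illustrative) asks for the gradient with respect to a constant twice, once without and once with tape.watch. A tape tracks trainable tf.Variable objects automatically, but not tf.constant tensors.

```python
import tensorflow as tf

x = tf.Variable(0.0)
a = tf.constant(1.0)

# Without watch: the constant a is not tracked by the tape.
with tf.GradientTape() as tape:
    y = a * x**2
unwatched_grad = tape.gradient(y, a)

# With watch: the tape records the operations involving a.
with tf.GradientTape() as tape:
    tape.watch(a)
    y = a * x**2
watched_grad = tape.gradient(y, a)

print(unwatched_grad)
print(watched_grad)
```

The first gradient is None because nothing connects y to a on the tape; the second is a tensor holding x**2 evaluated at x = 0.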

Computing the second derivative:

with tf.GradientTape() as tape1:
	with tf.GradientTape() as tape2:
		y = a * tf.pow(x, 2) + b * x + c
	dy_dx = tape2.gradient(y, x)
dy2_dx2 = tape1.gradient(dy_dx, x)
print(dy2_dx2)

tf.Tensor(2.0, shape=(), dtype=float32)
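
The nested tapes differentiate the result of the inner differentiation. For a polynomial the same two steps can be done by hand on the coefficient list, which makes a handy cross-check. The helpers poly_derivative and poly_eval below are hypothetical names written for this sketch.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as ascending coefficients.

    [c, b, a] represents c + b*x + a*x**2.
    """
    return [k * coef for k, coef in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(coef * x**k for k, coef in enumerate(coeffs))


f_coeffs = [1.0, -2.0, 1.0]           # c = 1, b = -2, a = 1
first = poly_derivative(f_coeffs)     # coefficients of b + 2*a*x
second = poly_derivative(first)       # coefficients of 2*a

print(poly_eval(first, 0.0), poly_eval(second, 0.0))
```

At x = 0 this gives -2.0 for the first derivative and 2.0 for the second, matching the two tape results above.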

The same computation with AutoGraph:

@tf.function
def fun(x):
    a = tf.constant(1.0)
    b = tf.constant(-2.0)
    c = tf.constant(1.0)
    # Cast the input to tf.float32
    x = tf.cast(x, tf.float32)
    with tf.GradientTape() as tape:
        # Only variables are watched automatically; the input passed in
        # here is a constant tensor, so it must be watched explicitly.
        tape.watch(x)
        y = a * tf.pow(x, 2) + b * x + c
    dy_dx = tape.gradient(y, x)
    return (dy_dx, y)

tf.print(fun(tf.constant(0.0)))

(-2, 1)

Finding the minimum of the function:

Let's try the same approach as before:

Define the variables:

import tensorflow as tf
import numpy as np


x = tf.Variable(0.0, name="x", dtype=tf.float32)
a = tf.constant(1.0)
b = tf.constant(-2.0)
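c = tf.constant(1.0)

"""I tried to keep going from here but got stuck: this approach only computes
the derivative at a given x for given a, b and c; it cannot find the minimum."""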
# @tf.function
# def min_value(a, b, c):
#
#     with tf.GradientTape() as tape:
#         y = a*x**2 + b*x + c
#     dy_dx = tape.gradient(y, x)
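
For this particular function the minimum is also available in closed form, which gives a target to compare the iterative methods against. The sketch below (quadratic_minimum is a hypothetical helper, and it assumes a > 0) sets f'(x) = 2*a*x + b to zero and solves for x.

```python
def quadratic_minimum(a, b, c):
    """Return (x_min, y_min) for f(x) = a*x**2 + b*x + c, assuming a > 0."""
    if a <= 0:
        raise ValueError("a must be positive for a minimum to exist")
    x_min = -b / (2.0 * a)                 # root of f'(x) = 2*a*x + b
    y_min = a * x_min**2 + b * x_min + c
    return x_min, y_min


x_min, y_min = quadratic_minimum(1.0, -2.0, 1.0)
print(x_min, y_min)
```

With a = 1, b = -2, c = 1 the minimum sits at x = 1 with y = 0. A closed form like this rarely exists for real models, which is why the rest of the post uses gradient descent.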

Implementation with stochastic gradient descent:

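This section uses the stochastic gradient descent (SGD) optimizer. The objective here is a single deterministic function, so in practice it performs plain gradient descent. TensorFlow's API wraps the algorithm so thoroughly that readers unfamiliar with gradient descent may not find the code easy to follow: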

# First build an optimizer; SGD stands for stochastic gradient descent
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
for _ in range(1000):
    with tf.GradientTape() as tape:
        y = a * tf.pow(x, 2) + b * x + c
    dy_dx = tape.gradient(y, x)
    # As in the earlier examples, dy_dx is the first derivative at the current x.
    # The next call uses it to update x by one gradient descent step:
    optimizer.apply_gradients(grads_and_vars=[(dy_dx, x)])

tf.print('y = ', y, '; x = ', x)

y =  0 ; x =  0.999998569

The final value of x is very close to 1, which again shows that gradient descent finds a local minimum by successive approximation.
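
What the optimizer does on each iteration can be written out by hand. The sketch below is a plain-Python version of the same loop, assuming the default SGD settings (no momentum), in which case each step is x = x - learning_rate * gradient. The function name gradient_descent is my own.

```python
def gradient_descent(a, b, x0=0.0, learning_rate=0.01, steps=1000):
    """Minimize a*x**2 + b*x + c by repeatedly stepping against the gradient."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * a * x + b          # f'(x), derived by hand
        x = x - learning_rate * grad    # plain gradient descent update
    return x


x_final = gradient_descent(a=1.0, b=-2.0)
print(x_final)
```

For a = 1, b = -2 each step multiplies the distance to the minimum by 0.98, so after n steps the error is 0.98**n. This version runs in Python's double precision, so its last digits differ from the float32 value printed by TensorFlow above.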

Using optimizer.minimize()

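optimizer.minimize() is equivalent to computing the gradient with a tape and then calling apply_gradients.
The running time is about the same as with the previous method; it is simply more convenient to use.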

x = tf.Variable(0.0, name="x", dtype=tf.float32)

def f():
    a = tf.constant(1.0)
    b = tf.constant(-2.0)
    c = tf.constant(1.0)
    y = a * tf.pow(x, 2) + b * x + c
    return (y)


optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
for _ in range(1000):
    # This call replaces the with block plus tape.gradient() used earlier.
    # Pass the function f itself; a computed value y, as above, will not work.
    optimizer.minimize(f, [x])
tf.print('y = ', f(), '; x = ', x)

y =  0 ; x =  0.999998569

Finding the minimum inside AutoGraph

Using optimizer.apply_gradients

x = tf.Variable(0.0, name="x", dtype=tf.float32)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)


@tf.function
def minimizef():
    a = tf.constant(1.0)
    b = tf.constant(-2.0)
    c = tf.constant(1.0)
    # Note: under AutoGraph use tf.range(1000), not range(1000)
    for _ in tf.range(1000):
        with tf.GradientTape() as tape:
            y = a * tf.pow(x, 2) + b * x + c
        dy_dx = tape.gradient(y, x)
        optimizer.apply_gradients(grads_and_vars=[(dy_dx, x)])

    y = a * tf.pow(x, 2) + b * x + c
    return y


tf.print(minimizef())
tf.print(x)

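Why insist on tf.range() rather than range() in the loop above? I timed it: with tf.range() the run took 1.9 seconds, while with range() it took 37 seconds, so there was no speedup at all, and speed is the whole point of using a static graph. The reason is that a Python range() loop is unrolled during tracing into 1000 copies of the loop body, whereas tf.range() is converted into a single loop inside the graph.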
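
One way to see the difference is to trace both variants and count the operations in the resulting graphs. This is only a rough sketch: counter, N_STEPS, unrolled and looped are names made up for the illustration, and the exact operation counts depend on the TensorFlow version, so only the comparison between the two matters.

```python
import tensorflow as tf

N_STEPS = 200  # arbitrary loop length for the comparison
counter = tf.Variable(0.0)


@tf.function
def unrolled():
    for _ in range(N_STEPS):       # Python loop: unrolled while tracing
        counter.assign_add(1.0)
    return counter.read_value()


@tf.function
def looped():
    for _ in tf.range(N_STEPS):    # AutoGraph turns this into a graph loop
        counter.assign_add(1.0)
    return counter.read_value()


# Tracing builds the graphs without running them.
unrolled_ops = len(unrolled.get_concrete_function().graph.get_operations())
looped_ops = len(looped.get_concrete_function().graph.get_operations())
print(unrolled_ops, looped_ops)
```

The unrolled graph grows with the number of iterations, while the tf.range() version stays small no matter how long the loop is. Building and processing the large unrolled graph is a likely source of the slowdown measured above.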

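Seen this way, using AutoGraph seems to amount to putting the desired computation into a function and running that. This is in fact unavoidable: as noted above, a static graph in TensorFlow 2.x is built by applying the decorator, so a function has to exist.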

Using optimizer.minimize

x = tf.Variable(0.0, name="x", dtype=tf.float32)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)


@tf.function
def f():
    a = tf.constant(1.0)
    b = tf.constant(-2.0)
    c = tf.constant(1.0)
    y = a * tf.pow(x, 2) + b * x + c
    return (y)


@tf.function
def train(epoch):
    for _ in tf.range(epoch):
        optimizer.minimize(f, [x])
    return (f())


tf.print(train(1000))
tf.print(x)

