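TensorFlow 2.0: switching between dynamic and static graphs, part 2

In part 1 we learned how to use tf.function to convert a Python function into a TensorFlow static graph. We also looked at the problems that can arise when the function creates state (a tf.Variable) during conversion, and at how to fix them.

In this second part we pass a tf.Tensor, rather than a tf.Variable, as input and check whether the conversion behaves the way we expect.

tf.function uses AutoGraph

For clarity, here is the full signature of tf.function: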

def function(func=None,
             input_signature=None,
             autograph=True,
             experimental_autograph_options=None)

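The autograph parameter defaults to True, so tf.function does use AutoGraph by default. The documentation describes the difference between True and False as follows:

When autograph is True, all Python code that depends on Tensor values is staged into a TensorFlow graph.
When it is False, the function is only traced, and control flow cannot depend on data.

So tf.function uses AutoGraph by default. Next we pass in different kinds of arguments to analyze what AutoGraph does.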
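
To make the difference between the two settings concrete, the sketch below wraps the same function twice, once with autograph=True and once with autograph=False. The function clip_negative and the test value are made up for illustration; the only point is that a Python if whose condition is a Tensor needs AutoGraph.

```python
import tensorflow as tf


def clip_negative(x):
    # Data-dependent control flow: the branch depends on the value of a Tensor.
    if x < 0:
        y = tf.zeros_like(x)
    else:
        y = x
    return y


# Default behavior: AutoGraph rewrites the Python `if` into a graph conditional.
with_autograph = tf.function(clip_negative, autograph=True)
# Tracing only: the Python `if` is evaluated during tracing, on a symbolic Tensor.
without_autograph = tf.function(clip_negative, autograph=False)

result = with_autograph(tf.constant(-3.0))

try:
    without_autograph(tf.constant(-3.0))
    tracing_failed = False
except TypeError:
    # TensorFlow refuses to use a symbolic Tensor as a Python bool.
    tracing_failed = True

print(result.numpy(), tracing_failed)
```

With autograph=False the function would still work if the condition were a plain Python value, because then the branch is decided once, while tracing.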

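Changing the type of the tf.Tensor input

We start by defining a simple test function. The type of its argument matters a great deal: building a graph requires statically typed objects, and the argument type together with the function name produces the ID that uniquely identifies the graph. The function is defined as follows: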

@tf.function
def f(x):
    print("Python execution: ", x)
    tf.print("Graph execution: ", x)
    return x

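A brief walk-through of the function:

Line 1 (def f(x)): defines a function that takes one input x; as written, x can be a value of any type
Line 2 (print): Python's print runs only once, while the graph is being created
Line 3 (tf.print): TensorFlow's print runs every time the graph is called
Line 4 (return x): returns x

Let's run a few test cases and see whether this holds: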

print("##### float32 test #####")
a = tf.constant(1, dtype=tf.float32)
print("first call")
f(a)
a = tf.constant(1.1, dtype=tf.float32)
print("second call")
f(a)

print("##### uint8 test #####")

b = tf.constant(2, dtype=tf.uint8)
print("first call")
f(b)
b = tf.constant(3, dtype=tf.uint8)
print("second call")
f(b)

The result is what we expected:

##### float32 test #####
first call
Python execution:  Tensor("x:0", shape=(), dtype=float32)
Graph execution:  1
second call
Graph execution:  1.1
##### uint8 test #####
first call
Python execution:  Tensor("x:0", shape=(), dtype=uint8)
Graph execution:  2
second call
Graph execution:  3
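
The output above can be read as a cache lookup: the function name plus the dtype of the argument select a graph, and the Python code is traced again only when that key has not been seen before. The toy model below sketches this idea in plain Python. ToyFunction and its string dtypes are made up for illustration; this is not TensorFlow's implementation, which also takes the shape into account, among other things.

```python
# Toy model only: this is NOT TensorFlow's real cache, just the idea behind it.
class ToyFunction:
    def __init__(self, python_function):
        self.python_function = python_function
        self.graphs = {}      # signature -> "graph" (here just a label)
        self.trace_count = 0

    def __call__(self, value, dtype):
        signature = (self.python_function.__name__, dtype)
        if signature not in self.graphs:
            # First call with this signature: "trace" the Python function.
            self.trace_count += 1
            self.graphs[signature] = "graph#%d" % self.trace_count
        # Later calls with the same signature reuse the stored graph.
        return self.graphs[signature]


def f(x):
    return x


toy_f = ToyFunction(f)
calls = [(1.0, "float32"), (1.1, "float32"), (2, "uint8"), (3, "uint8")]
used = [toy_f(value, dtype) for value, dtype in calls]

print(used)               # ['graph#1', 'graph#1', 'graph#2', 'graph#2']
print(toy_f.trace_count)  # 2
```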

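Every time an input of a different type is passed in, a new graph is created. We can use the tf.autograph module to look at the code that AutoGraph generates for f: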

tf.autograph.to_code(f.python_function)

This returns a string containing the graph-building code for f:

def tf__f(x):
  try:
    with ag__.function_scope('f'):
      do_return = False
      retval_ = None
      with ag__.utils.control_dependency_on_returns(ag__.converted_call(print, None, ag__.ConversionOptions(recursive=True, force_conversion=False, optional_features=ag__.Feature.ALL, internal_convert_user_code=True), ('Python execution: ', x), {})):
        tf_1, x_1 = ag__.utils.alias_tensors(tf, x)
        with ag__.utils.control_dependency_on_returns(ag__.converted_call('print', tf_1, ag__.ConversionOptions(recursive=True, force_conversion=False, optional_features=ag__.Feature.ALL, internal_convert_user_code=True), ('Graph execution: ', x_1), {})):
          x_2 = ag__.utils.alias_tensors(x_1)
          do_return = True
          retval_ = x_1
          return retval_
  except:
    ag__.rewrite_graph_construction_error(ag_source_map__)

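This code is machine generated, so it is hard to read, but there is something interesting in it: the graph code still contains a reference to the Python print call, which is executed only once, when the graph is first created. Reformatted, the relevant part looks like this: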

with ag__.utils.control_dependency_on_returns(
        ag__.converted_call(
            print, None, ag__.ConversionOptions(
                recursive=True,
                force_conversion=False,
                optional_features=ag__.Feature.ALL,
                internal_convert_user_code=True),
            ('Python execution: ', x), {})
        ):

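Here ag__.utils.control_dependency_on_returns creates a tf.control_dependencies context around the value returned by converted_call. This preserves the execution order of the graph nodes by forcing them to run sequentially.

converted_call compiles a function call. It has all the information needed to convert and execute the Python function (here, print), which we can see from its signature (f, owner, options, args, kwargs):

f is the function to call. In the first converted_call it is the function print; in the second one (the tf.print call) it is the string 'print'
owner is the module or object that owns the function. In the first call it is None, because print is a Python built-in; in the second call it is tf_1, an alias of the tf module
options are the conversion options
args are the positional arguments passed to f (print)
kwargs are the keyword arguments passed to f (print)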
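
The role of the five parameters can be sketched with a small stand-in written in plain Python. toy_converted_call is hypothetical and exists only for illustration; it mimics how f and owner are resolved and ignores options, whereas the real converted_call also decides whether and how to convert f.

```python
import math


def toy_converted_call(f, owner, options, args, kwargs):
    """Hypothetical stand-in for ag__.converted_call, for illustration only."""
    if isinstance(f, str):
        # `f` is a name: look it up on its owner (a module or an object).
        f = getattr(owner, f)
    # `options` is ignored in this sketch.
    return f(*args, **kwargs)


# Like the first call in the generated code: a plain callable, no owner.
length = toy_converted_call(len, None, None, ("graph",), {})
# Like the second call: a name resolved on its owner (there tf_1, here math).
root = toy_converted_call("sqrt", math, None, (9.0,), {})
# Keyword arguments travel in the kwargs dict.
ordered = toy_converted_call(sorted, None, None, ([3, 1, 2],), {"reverse": True})

print(length, root, ordered)  # 5 3.0 [3, 2, 1]
```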

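Question

Why is the Python code in the graph executed only once?

Hypothesis:

The author's guess was that Python's print has some side effect that causes it to be dropped while the graph is built. The author then asked the TensorFlow community and got an answer from the developers: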

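In short: when tf.function builds the graph, only the calls that operate on Tensors are kept as graph nodes. In this example only tf.print() takes a Tensor input, so it is kept. Python's print is left out of the graph, so it runs only once, while the function is being traced. After the graph has been built, tf.print keeps running on every call.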
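
The "trace once, then replay" behavior can be imitated in a few lines of plain Python. The sketch below is a toy model under strong simplifications: ToyGraph, add_print and trace are invented names, and a "graph" is just a list of recorded operations. It only shows why a Python side effect appears once while a recorded operation appears on every run.

```python
# Toy model only: a "graph" is a list of recorded print operations.
class ToyGraph:
    def __init__(self):
        self.ops = []

    def add_print(self, prefix, placeholder):
        # Graph op: recorded now, executed on every run.
        self.ops.append((prefix, placeholder))

    def run(self, feed, log):
        for prefix, placeholder in self.ops:
            log.append(prefix + str(feed[placeholder]))


log = []


def trace(graph):
    # Plain Python side effect: happens while tracing and is not recorded.
    log.append("Python execution: x")
    # Stand-in for tf.print: becomes a node of the graph.
    graph.add_print("Graph execution: ", "x")


graph = ToyGraph()
trace(graph)              # tracing happens once
graph.run({"x": 1}, log)  # every call replays only the recorded ops
graph.run({"x": 2}, log)

print(log)
# ['Python execution: x', 'Graph execution: 1', 'Graph execution: 2']
```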

Using Python native types as input

Python has three built-in numeric types, int, float and complex, so we test each of them as input:

def printinfo(x):
  print("Type: ", type(x), " value: ", x)

print("##### int test #####")
print("first call")
a = 1
printinfo(a)
f(a)
print("second call")
b = 2
printinfo(b)
f(b)

print("##### float test #####")
print("first call")
a = 1.0
printinfo(a)
f(a)
print("second call")
b = 2.0
printinfo(b)
f(b)

print("##### complex test #####")
print("first call")
a = complex(1.0, 2.0)
printinfo(a)
f(a)
print("second call")
b = complex(2.0, 1.0)
printinfo(b)
f(b)

Unlike the Tensor case, the result here differs from what we expected:


##### int test #####
first call
Type:  <class 'int'>  value:  1
Python execution:  1
Graph execution:  1

second call
Type:  <class 'int'>  value:  2
Python execution:  2
Graph execution:  2

##### float test #####
first call
Type:  <class 'float'>  value:  1.0
Graph execution:  1
second call
Type:  <class 'float'>  value:  2.0
Graph execution:  2

##### complex test #####
first call
Type:  <class 'complex'>  value:  (1+2j)
Python execution:  (1+2j)
Graph execution:  (1+2j)
second call
Type:  <class 'complex'>  value:  (2+1j)
Python execution:  (2+1j)
Graph execution:  (2+1j)

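We expected one graph per type, but what we actually get is one graph per distinct value:

The first call, f(1), executed the Python code, traced it, created a graph and ran it
The second call, f(2), executed the Python code again, traced it, created a graph and ran it
The first call, f(1.0), did not execute the Python code; it only ran a graph
The second call, f(2.0), did not execute the Python code either; it only ran an already existing graph
The first call, f(1+2j), executed the Python code, traced it, created a graph and ran it
The second call, f(2+1j), executed the Python code, traced it, created a graph and ran it

This is strange!
To check whether every distinct value really gets its own graph, let's run a small experiment: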

ret = f(1.0)
if tf.float32 == ret.dtype:
    print("f(1.0) returns float")
else:
    print("f(1.0) return ", ret)

The result is:

Graph execution:  1
f(1.0) return  tf.Tensor(1, shape=(), dtype=int32)

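This explains it: Python treats 1.0 and 1 as the same number (1 == 1.0, and both have the same hash), so only one graph was created for the two values. f(1.0) reuses the graph traced for the int 1, which is why the returned Tensor has dtype int32.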
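
The equality of 1 and 1.0 is easy to verify in plain Python. The snippet below assumes, for illustration only, that the graph cache behaves like a dictionary keyed by the Python value, which is what the output above suggests; the cache dictionary and its string entries are made up.

```python
# 1 and 1.0 compare equal and hash equal, so they are the same dictionary key.
print(1 == 1.0, hash(1) == hash(1.0))  # True True

cache = {}
cache[1] = "graph traced for the int 1"

# Looking up 1.0 finds the entry stored under 1: no new entry is needed.
hit = cache.get(1.0)
# A different value misses the cache and would trigger a new trace.
miss = cache.get(2.0)

print(hit)   # graph traced for the int 1
print(miss)  # None
```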

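Note: whenever a different Python value is passed in, the function decorated with @tf.function creates a new graph and runs both the Python code and the TensorFlow code again. This defeats the purpose of converting the function to a graph!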

Performance test

The following code is a simple performance test:

import time

import tensorflow as tf

@tf.function
def g(x):
  return x

start = time.time()
for i in tf.range(1000):
  g(i)
end = time.time()

print("tf.Tensor time elapsed: ", (end-start))

start = time.time()
for i in range(1000):
  g(i)
end = time.time()

print("Native type time elapsed: ", (end-start))

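g(x) simply returns its input. The first loop calls it with different tf.Tensor values; the second loop calls it with different Python native values. The running times are: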

tf.Tensor time elapsed:  0.41594886779785156
Native type time elapsed:  5.189513444900513

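Conclusion: always use tf.Tensor!

When the input is a Tensor, AutoGraph can optimize the function and it runs well, because the same graph is reused. When the input is a Python native type, a new graph is created for every distinct value, which is very inefficient!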
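
One way to see the retracing directly, without timing anything, is to count how often the Python body runs. The sketch below relies on the fact established earlier that Python side effects happen only while tracing; trace_log is a made-up helper list, and wrapping each value in tf.constant is one possible way of passing Tensors, not the only one.

```python
import tensorflow as tf

trace_log = []  # filled by a Python side effect, so it grows only when tracing


@tf.function
def g(x):
    trace_log.append("traced")
    return x


# Python ints: every new value is traced again.
for i in range(3):
    g(i)
python_traces = len(trace_log)

# The same values wrapped in Tensors: one trace for the whole loop.
trace_log.clear()
for i in range(3):
    g(tf.constant(i))
tensor_traces = len(trace_log)

print(python_traces, tensor_traces)
```

The first loop traces once per value, while the second loop traces once in total, which matches the gap in the timings above.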

Is AutoGraph really being used?

Here the author ran an experiment: building the graph directly with tf.autograph.to_code(f.python_function). It failed with:

ValueError during conversion: Unable to insert statement into the computation flow: it is not followed by any computation which the statement could gate.

The reason has already been explained above.