TensorFlow 2.0 Study Notes (Chapter 3)

Slicing

Slicing generally uses start:stop:step; a bare : selects all the data along the current dimension.

An ellipsis (...) expands into as many colons as needed, for example:

t = tf.constant([[1,2,3], [4,5,6]])
t[...] == t[:, :]
t[..., 1] == t[:, 1]
t[..., 1, 2] == t[1, 2]
t[1, ...] == t[1, :]

Combining None with the ellipsis adds a dimension; for example, t[None, ...] expands the 2×3 array above into a 1×2×3 array.
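
A quick shape check of the example above (a sketch):

print(t.shape)            # (2, 3)
print(t[None, ...].shape) # (1, 2, 3)
print(t[..., None].shape) # (2, 3, 1)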

Operators
t = tf.constant([[1., 2., 3.], [4., 5., 6.]])
t + 10              # element-wise addition
tf.square(t)        # element-wise square
t @ tf.transpose(t) # matrix multiplication
t.numpy()           # convert to a numpy array
String operations
t = tf.constant("cafe")
tf.strings.length(t)                 # 字符串长度
tf.strings.length(t, unit='UTF8_CHAR')
tf.strings.unicode_decode(t, 'UTF8') # 字符串转换为RaggedTensor
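
For a batch of strings, unicode_decode returns a RaggedTensor, because each string may decode to a different number of code points (a sketch; the example strings are arbitrary):

ts = tf.constant(["cafe", "coffee", "咖啡"])
tf.strings.unicode_decode(ts, "UTF8")
# <tf.RaggedTensor [[99, 97, 102, 101], [99, 111, 102, 102, 101, 101], [21654, 21857]]>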
RaggedTensor

A Tensor whose rows can have different lengths.

r = tf.ragged.constant([[1, 2], [], [3]])
r2 = tf.ragged.constant([[4, 5], [6]])  # a second ragged tensor for the concat example
tf.concat([r, r2], axis=0)              # concatenate along dimension 0
r.to_tensor()                           # convert to a regular Tensor, padding with 0
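
Indexing and padding on the ragged tensor above look roughly like this (output sketched):

print(r[1])           # tf.Tensor([], shape=(0,), dtype=int32)
print(r.to_tensor())  # [[1 2]
                      #  [0 0]
                      #  [3 0]]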
SparseTensor

A sparse Tensor: only the positions and values of the non-zero entries are stored.

s = tf.SparseTensor(indices=[[0, 1], [1, 0], [2, 3]],
                    values=[1., 2., 3.],
                    dense_shape=[3, 4])
tf.sparse.to_dense(s) # convert to an ordinary dense matrix

If the indices of a SparseTensor are not sorted in row-major order, converting it to a dense tensor raises an error.
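
In that case tf.sparse.reorder can sort the indices into canonical order first, roughly like this:

s_unordered = tf.SparseTensor(indices=[[0, 2], [0, 1], [2, 3]],  # indices not in row-major order
                              values=[1., 2., 3.],
                              dense_shape=[3, 4])
s_ordered = tf.sparse.reorder(s_unordered)  # sort the indices
tf.sparse.to_dense(s_ordered)               # now the conversion succeeds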

Variable
v = tf.Variable([[1, 2], [3, 4]])
v.value() # returns a Tensor
v.numpy() # returns a numpy array
v.assign(2 * v) # a Variable can only be assigned through assign, not with =
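
Besides assign there are in-place variants and sliced assignment, for example (a sketch):

v.assign_add([[1, 1], [1, 1]])  # v += ...
v[0, 1].assign(42)              # assign a single element
v[1].assign([7, 8])             # assign a whole row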
Custom loss function
def my_mse(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_pred - y_true))

model.compile(loss=my_mse, ...)
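
A quick sanity check (a sketch, assuming keras has been imported as elsewhere in these notes): the custom loss agrees with the built-in MSE.

y_true = tf.constant([1., 2., 3.])
y_pred = tf.constant([2., 2., 5.])
print(my_mse(y_true, y_pred))                           # 5/3
print(keras.losses.mean_squared_error(y_true, y_pred))  # 5/3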
keras.layers.Lambda
mylayer = keras.layers.Lambda(lambda x: tf.nn.softplus(x))

model.add(mylayer)

This is one way to define a custom Layer. For reference:

# softmax(logits)   = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis)
# softplus(features) = log(exp(features) + 1)
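
Applying the Lambda layer to a few sample values shows the softplus shape (a sketch of the output):

print(mylayer(tf.constant([-10., -5., 0., 5., 10.])))
# approximately [4.54e-05, 6.7e-03, 0.693, 5.007, 10.0]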
Custom Layer
class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = keras.layers.Activation(activation)

    def build(self, input_shape):
        # create the trainable variables with add_weight
        self.kernel = self.add_weight(name="kernel",
                                      shape=(input_shape[1], self.units),
                                      initializer='uniform',    # uniform distribution
                                      trainable=True)
        self.bias = self.add_weight(name="bias",
                                    shape=(self.units,),
                                    initializer='zeros',
                                    trainable=True)
        super().build(input_shape)

    def call(self, inputs, **kwargs):
        return self.activation(inputs @ self.kernel + self.bias)

model.add(MyDenseLayer(30, activation='relu', input_shape=x_train.shape[1:]))
model.add(MyDenseLayer(1))

To create a custom layer, subclass tf.keras.layers.Layer.

  • In the constructor, besides accepting your own arguments, also take **kwargs and pass it on to the parent class.

    Any sub-layers the layer needs should also be created in the constructor.

  • The build method receives input_shape from the parent class; call the parent's build method at the end.

    self.add_weight creates the layer's variables.

  • The call method computes the result, either by calling other layers functionally or with ordinary tensor operations.
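
A quick way to check the layer is wired up correctly (a sketch): build it on a known input shape and inspect the weight shapes.

layer = MyDenseLayer(4, activation='relu')
layer.build(input_shape=(None, 8))
print([w.shape for w in layer.weights])  # kernel (8, 4), bias (4,)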

Converting between TF functions and Python functions
def scaled_elu(z, scale=1.0, alpha=1.0):
    return scale * tf.where(tf.greater_equal(z, 0.),    # tf.where acts like the ?: ternary operator
                            z,
                            alpha * tf.nn.elu(z))

# convert an ordinary Python function into a TF function
scaled_elu_tf = tf.function(scaled_elu)
# recover the original Python function from the TF function
print(scaled_elu_tf.python_function is scaled_elu)  # True

The ELU function is $\mathrm{elu}(x)=\left\{\begin{matrix}e^x-1 & x\le0 \\ x & x>0\end{matrix}\right.$. The code above converts a Python function into a tf.function and back again.

A plain Python function can only run on the CPU, while a TF function can also run on the GPU.

@tf.function
@tf.function
def converge_to_2(n_iters):
    total = tf.constant(0.)
    increment = tf.constant(1.)
    for _ in range(n_iters):   # sums 1 + 1/2 + 1/4 + ..., which converges to 2
        total += increment
        increment /= 2.0
    return total
  • It essentially packages a computation graph as a callable function.

  • A TF function must not create a Variable inside its body, but it can access global Variables or take Variables as arguments, as sketched below.
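
A minimal sketch of the recommended pattern: create the Variable outside and use it inside the TF function.

v = tf.Variable(0.)

@tf.function
def add_21():
    return v.assign_add(21.)  # fine: the Variable was created outside the function

print(add_21())  # 21.0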

TensorSpec

Use a TensorSpec to constrain the inputs of a TF function:

@tf.function(input_signature=[tf.TensorSpec([None], tf.int32, name="x")])
def cube(z):
    return tf.pow(z, 3)

A TensorSpec describes a Tensor with three parameters: shape, dtype and name. The name seems to be unrelated to the argument name z; only the argument's position matters.

tf.TensorSpec(shape, dtype=tf.dtypes.float32, name=None)
# shape: Value convertible to tf.TensorShape. The shape of the tensor.
# the input [None] is converted to tf.TensorShape([None]): a 1-D tensor of unknown length
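
With the signature above, calling cube with a mismatched dtype raises an error instead of retracing (a sketch; the exact exception type may vary between TF versions):

print(cube(tf.constant([1, 2, 3])))    # OK: int32 matches the signature
try:
    cube(tf.constant([1., 2., 3.]))    # float32 violates the signature
except (TypeError, ValueError) as ex:
    print(ex)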
Converting a TF function into a Graph

First convert it into a concrete function,

cube_func_int32 = cube.get_concrete_function()
print(cube_func_int32 is cube.get_concrete_function(tf.TensorSpec([5], tf.int32))) # True
print(cube_func_int32 is cube.get_concrete_function(tf.constant([1, 2, 3])))  # True

then access its graph member,

graph = cube_func_int32.graph
Graph
graph.get_operations()[1].inputs
graph.get_operation_by_name("x")
graph.as_graph_def() # Returns a serialized GraphDef representation of this graph.
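
For the cube example above, the graph contains roughly the following operations (output sketched from one TF 2 run; exact names can vary):

print(graph.get_operations())
# [<tf.Operation 'x' type=Placeholder>,
#  <tf.Operation 'Pow/y' type=Const>,
#  <tf.Operation 'Pow' type=Pow>,
#  <tf.Operation 'Identity' type=Identity>]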

A TensorFlow computation, represented as a dataflow graph.

Graphs are used by tf.functions to represent the function’s computations. Each graph contains a set of tf.Operation objects, which represent units of computation; and tf.Tensor objects, which represent the units of data that flow between operations.

In short: a Graph is a TF computation; each graph contains Operation objects (units of computation) and Tensor objects (the data flowing between them).

Computing gradients
x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)
with tf.GradientTape() as tape: # quantities computed inside the tape are differentiated w.r.t. variables created outside it
    z = x1 + x2

dz_x1 = tape.gradient(z, x1)         # derivative of z with respect to x1
dz_x1x2 = tape.gradient(z, [x1, x2]) # derivatives w.r.t. both variables (a second gradient call needs persistent=True)
tape.gradient([z1, z2], x)           # gradients of two expressions are summed (z1, z2, x stand for other tensors)
print(dz_x1)

Operations are recorded if they are executed within this context manager and at least one of their inputs is being “watched”.

Trainable variables (created by tf.Variable or tf.compat.v1.get_variable, where trainable=True is default in both cases) are automatically watched. Tensors can be manually watched by invoking the watch method on this context manager.

Operations inside the GradientTape context are recorded as long as at least one of their inputs is being watched.

Trainable variables are watched by default; other tensors can be watched manually by calling the tape's watch method.
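
A constant, for instance, is not watched automatically and has to be watched by hand (a sketch):

x = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x)   # without this, the gradient below would be None
    z = x * x
print(tape.gradient(z, x))  # 6.0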

Computing gradients more than once

By default, the resources held by a GradientTape are released as soon as GradientTape.gradient() method is called. To compute multiple gradients over the same computation, create a persistent gradient tape. This allows multiple calls to the gradient() method as resources are released when the tape object is garbage collected. For example:

x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as g:
  g.watch(x)
  y = x * x
  z = y * y
dz_dx = g.gradient(z, x)  # 108.0 (4*x^3 at x = 3)
dy_dx = g.gradient(y, x)  # 6.0
del g  # Drop the reference to the tape
Second-order derivatives
x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)

def g(x1, x2):                 # example function; any differentiable function of x1 and x2 works
    return (x1 + 5) * (x2 ** 2)

with tf.GradientTape(persistent=True) as tape_outer:
    with tf.GradientTape(persistent=True) as tape_inner:
        z = g(x1, x2)
    grads_inner = tape_inner.gradient(z, [x1, x2])
grads_outer = [tape_outer.gradient(grad_inner, [x1, x2]) for grad_inner in grads_inner]
print(grads_outer)
del tape_inner
del tape_outer
Using an Optimizer with gradients
def f(x):                      # example function to minimize; any differentiable function works
    return 3. * x ** 2 + 2. * x - 1.

x = tf.Variable(0.)
optimizer = keras.optimizers.SGD(learning_rate=0.1)
for _ in range(100):
    with tf.GradientTape() as tape:
        z = f(x)
    dz_dx = tape.gradient(z, x)
    optimizer.apply_gradients([(dz_dx, x)]) # update x according to its gradient
print(x)  # approaches the minimizer of f (here -1/3)
Using Metrics
metric = keras.metrics.MeanSquaredError() # mean-squared-error metric
metric([5.], [2.]) # returns (5-2)^2 = 9 and remembers it
metric([0.], [1.]) # returns avg(9, 1) = 5, the running mean so far
metric.result()    # returns the current value of the metric, 5
metric.reset_states() # metrics accumulate previous results, so reset them after every epoch

metrics.Metric: Encapsulates metric logic and state.

Reposted from blog.csdn.net/u010099177/article/details/104664418