TensorFlow Notes (1)


Tags (space-separated): tensorflow notes


SAME and VALID padding in convolutions

1. When performing a convolution, the padding argument can be set to either VALID or SAME. The examples below illustrate the difference (a small helper after the two examples summarizes the corresponding output-size formulas):

  • SAME: with stride 1, the output of the convolution keeps the same height and width as the input tensor. To achieve this, zeros are padded around the border of the input as needed, so that the output tensor ends up the same size as the input tensor.
import tensorflow as tf

input_layer = tf.reshape(tf.linspace(start=1.0, stop=25.0, num=25), [1, 5, 5, 1])
weights = tf.ones(shape=[5, 5, 1, 1])

convolution = tf.nn.conv2d(input_layer, weights, strides=[1, 1, 1, 1], padding='SAME')

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    print(sess.run(convolution))
"""
[[[[  63.]
   [  90.]
   [ 120.]
   [ 102.]
   [  81.]]

  [[ 114.]
   [ 160.]
   [ 210.]
   [ 176.]
   [ 138.]]

  [[ 180.]
   [ 250.]
   [ 325.]
   [ 270.]
   [ 210.]]

  [[ 174.]
   [ 240.]
   [ 310.]
   [ 256.]
   [ 198.]]

  [[ 153.]
   [ 210.]
   [ 270.]
   [ 222.]
   [ 171.]]]]
"""
  • VALID: no zero padding is applied. The filter is only placed where it fits entirely inside the input, and positions that would run past the border are skipped, so the output tensor is never larger than the input tensor.
input_layer = tf.reshape(tf.linspace(start=1.0, stop=25.0, num=25), [1, 5, 5, 1])
weights = tf.ones(shape=[5, 5, 1, 1])

convolution = tf.nn.conv2d(input_layer, weights, strides=[1, 1, 1, 1], padding='VALID')

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    print(sess.run(convolution))
"""
[[[[ 325.]]]]
"""

tf.app.flags in TensorFlow

1. tf.app.flags is used to receive arguments passed on the command line, much like parsing argv yourself.
2. Arguments can be passed as python test.py --learning_rate 0.1 --epoch_num 5 --mode test

import tensorflow as tf

flags = tf.app.flags
flags.DEFINE_float("learning_rate", 0.01, "learning_rate_description")
flags.DEFINE_integer("epoch_num", 10, "Number of epochs to run trainer")
flags.DEFINE_string("mode", "train", "Option mode: train, train_from_scratch, inference")

# Provides the global object that can be used to access flags.
FLAGS = flags.FLAGS


def main(_):
    print(FLAGS.learning_rate)
    print(FLAGS.epoch_num)
    print(FLAGS.mode)


if __name__ == '__main__':
    tf.app.run()

"""
0.01
10
train
"""

Shared variables in TensorFlow

Purpose:

We need shared variables when many parts of a model should share a large set of variables, and we still want to initialize all of them in one place.

Functions:

TensorFlow has two functions for creating variables: tf.get_variable() and tf.Variable(). Beginners usually use the latter. The difference is that tf.get_variable() has a checking mechanism: it checks whether an existing variable with the same name has been marked as shareable (reuse), and if it has not, TensorFlow raises an error as soon as it reaches a second variable with the same name.

v1 = tf.get_variable('variable', shape=[1], initializer=tf.initializers.constant(0))
v2 = tf.get_variable('variable', shape=[1], initializer=tf.initializers.constant(1))

# ValueError: Variable variable already exists, disallowed.
# Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:...
v1 = tf.Variable(0, name='variable')
# name = 'variable:0'
v2 = tf.Variable(1, name='variable')
# name = 'variable_1:0'

As the example shows, tf.Variable() does not raise an error: TensorFlow simply appends a suffix to make duplicate names unique (variable:0, variable_1:0). tf.get_variable() does not do this, which is why the first snippet fails.

Suppose we want to implement a simple convolution-plus-activation function. Using tf.Variable(), the code looks like this:

def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)

    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)


image1 = tf.ones(shape=[1, 255, 255, 32])
image2 = tf.zeros(shape=[1, 255, 255, 32])
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)

When we want to filter two images with the same filter, i.e. with the same parameters, this version instead creates two separate sets of variables, one per call (a quick check follows below). To actually share the parameters, TensorFlow provides the variable scope mechanism.
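As that quick check, counting the variables in the default graph after the two calls (my own addition, appended to the code above, not part of the original post) shows the duplication directly:

# After the two my_image_filter() calls above, the default graph holds
# two independent sets of parameters: 4 variables per call, 8 in total.
print(len(tf.global_variables()))  # 8
for v in tf.global_variables():
    print(v.name)
# conv1_weights:0, conv1_biases:0, conv2_weights:0, conv2_biases:0,
# conv1_weights_1:0, conv1_biases_1:0, conv2_weights_1:0, conv2_biases_1:0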

The variable scope mechanism in TensorFlow consists mainly of two parts:

  • tf.get_variable(<name>, <shape>, <initializer>): creates a variable with the given name, or returns the existing one
  • tf.variable_scope(<scope_name>): specifies the namespace used by tf.get_variable()
def conv_relu(input, kernel_shape, bias_shape):
    weights = tf.get_variable('weights', kernel_shape, initializer=tf.random_normal_initializer())
    biases = tf.get_variable('biases', bias_shape, initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights, strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv+biases)


def my_image_filter(input_images):
    with tf.variable_scope('conv1'):
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope('conv2'):
        return conv_relu(relu1, [5, 5, 32, 32], [32])


image1 = tf.ones(shape=[1, 255, 255, 32])
image2 = tf.zeros(shape=[1, 255, 255, 32])
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# ValueError: Variable conv1/weights already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

If we want to share the variables, we have to say so explicitly with the reuse_variables() method:

def conv_relu(input, kernel_shape, bias_shape):
    weights = tf.get_variable('weights', kernel_shape, initializer=tf.random_normal_initializer())
    biases = tf.get_variable('biases', bias_shape, initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights, strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv+biases)


def my_image_filter(input_images):
    with tf.variable_scope('conv1'):
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope('conv2'):
        return conv_relu(relu1, [5, 5, 32, 32], [32])


image1 = tf.ones(shape=[1, 255, 255, 32])
image2 = tf.zeros(shape=[1, 255, 255, 32])
with tf.variable_scope('image_filters') as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)
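The error message above already suggests an alternative: instead of calling scope.reuse_variables() by hand, the scope can be opened with reuse=tf.AUTO_REUSE, which creates the variables on first use and reuses them on every later call. A minimal sketch, reusing my_image_filter from above:

with tf.variable_scope('image_filters', reuse=tf.AUTO_REUSE):
    result1 = my_image_filter(image1)
    result2 = my_image_filter(image2)  # reuses conv1/weights, conv1/biases, conv2/weights, conv2/biases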

Now let's look at how TensorFlow names variables:

v1 = tf.Variable(initial_value=1.0, name='v')
print(v1.name)
v2 = tf.Variable(initial_value=1.0, name='v')   # duplicate name -> uniquified to v_1
print(v2.name)
with tf.variable_scope('foo'):
    with tf.variable_scope('bar') as scope:
        v = tf.get_variable('v', [1])           # foo/bar/v
        scope.reuse_variables()
        v2 = tf.get_variable('v', [1])          # reused -> same variable, same name as v
        v3 = tf.Variable(initial_value=1.0, name='v')   # tf.Variable never reuses -> foo/bar/v_1
        v4 = tf.Variable(initial_value=1.0, name='v')   # foo/bar/v_2
        print(v.name)
        print(v2.name)
        print(v3.name)
        print(v4.name)

'''
v:0
v_1:0
foo/bar/v:0
foo/bar/v:0
foo/bar/v_1:0
foo/bar/v_2:0
'''

Gradient functions in TensorFlow

The tf.gradients function

def gradients(ys,
              xs,
              grad_ys=None,
              name="gradients",
              colocate_gradients_with_ops=False,
              gate_gradients=False,
              aggregation_method=None,
              stop_gradients=None):
  """Constructs symbolic derivatives of sum of `ys` w.r.t. x in `xs`.

TensorFlow normally first computes the gradients of the loss with respect to all of the model's parameter variables and then applies the updates to those variables, so if we only compute the gradients, the variables themselves are not updated.
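This two-step structure is visible directly in the optimizer API: compute_gradients() only builds the gradient tensors, and nothing changes until apply_gradients() is run. A minimal sketch (the variable w and the loss here are just illustrative):

import tensorflow as tf

w = tf.Variable(3.0)
loss = tf.square(w)

optimizer = tf.train.GradientDescentOptimizer(0.1)
grads_and_vars = optimizer.compute_gradients(loss)    # [(dloss/dw, w)]
train_op = optimizer.apply_gradients(grads_and_vars)  # this op actually updates w

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([g for g, _ in grads_and_vars]))   # [6.0] -- gradient only, w is untouched
    print(sess.run(w))                                # 3.0
    sess.run(train_op)
    print(sess.run(w))                                # 2.4 = 3.0 - 0.1 * 6.0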

The tf.stop_gradient function

This function masks a tensor during gradient computation: the contribution of everything flowing through it is ignored when gradients are computed, so that part of the graph contributes nothing and the variables behind it are not updated.

def stop_gradient(input, name=None):
  r"""Stops gradient computation.

  When executed in a graph, this op outputs its input tensor as-is.

  When building ops to compute gradients, this op prevents the contribution of
  its inputs to be taken into account.  Normally, the gradient generator adds ops
  to a graph to compute the derivatives of a specified 'loss' by recursively
  finding out inputs that contributed to its computation.  If you insert this op
  in the graph it inputs are masked from the gradient generator.  They are not
  taken into account for computing gradients.

  This is useful any time you want to compute a value with TensorFlow but need
  to pretend that the value was a constant. Some examples include:

  *  The *EM* algorithm where the *M-step* should not involve backpropagation
     through the output of the *E-step*.
  *  Contrastive divergence training of Boltzmann machines where, when
     differentiating the energy function, the training must not backpropagate
     through the graph that generated the samples from the model.
  *  Adversarial training, where no backprop should happen through the adversarial
     example generation process.

  Args:
    input: A `Tensor`.
    name: A name for the operation (optional).

  Returns:
    A `Tensor`. Has the same type as `input`.
  """

Code analysis

import tensorflow as tf

a = tf.constant(0.)
b = 2 * a
g = tf.gradients(a + b, [a, b], stop_gradients=[a, b])

a1 = tf.stop_gradient(tf.constant(0.))
b1 = tf.stop_gradient(2 * a1)
g1 = tf.gradients(a1+b1, [a1, b1])

a2 = tf.constant(0.)
b2 = 2 * a2
g2 = tf.gradients(a2+b2, [a2, b2])
with tf.Session() as sess:
    print(sess.run(g))
    # output: [1.0, 1.0] -- a and b are passed in stop_gradients, so each is treated as a constant leaf
    print(sess.run(g1))
    # output: [1.0, 1.0] -- same effect, this time by wrapping the tensors in tf.stop_gradient
    print(sess.run(g2))
    # output: [3.0, 1.0] -- nothing is stopped and b2 = 2 * a2, so d(a2 + b2)/da2 = 1 + 2 = 3

Another example (note that the optimizer here is plain gradient descent, the simplest option):

import tensorflow as tf
import numpy as np

x_data = np.linspace(start=1.0, stop=100.0, num=100)
y_data = np.linspace(start=1.0, stop=100.0, num=100)

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

w = tf.Variable(tf.random_normal([1]))
b = tf.Variable(initial_value=1.0)

# b = tf.stop_gradient(b)
predict_y = w * x + b
loss = tf.reduce_sum(tf.square(predict_y-y))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
# Equivalent two-step form:
# optimizer = tf.train.GradientDescentOptimizer(0.01)
# gradient = optimizer.compute_gradients(loss)
# train_op = optimizer.apply_gradients(gradient)

with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    for _ in range(100):
        for index in range(len(x_data)):
            sess.run(train_op, feed_dict={x: x_data[index], y: y_data[index]})
            print(sess.run([w, b, loss], feed_dict={x: x_data[index], y: y_data[index]}))
"""
.....
[array([ nan], dtype=float32), inf, nan]
[array([ nan], dtype=float32), nan, nan]
[array([ nan], dtype=float32), nan, nan]
[array([ nan], dtype=float32), nan, nan]
[array([ nan], dtype=float32), nan, nan]
[array([ nan], dtype=float32), nan, nan]
[array([ nan], dtype=float32), nan, nan]
[array([ nan], dtype=float32), nan, nan]
"""

Alternatively, switching to a different optimizer (Adam), which adapts the effective step size per parameter, lets the same script converge:

import tensorflow as tf
import numpy as np

x_data = np.linspace(start=1.0, stop=100.0, num=100)
y_data = np.linspace(start=1.0, stop=100.0, num=100)

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

w = tf.Variable(tf.random_normal([1]))
b = tf.Variable(initial_value=1.0)

# b = tf.stop_gradient(b)
predict_y = w * x + b
loss = tf.reduce_sum(tf.square(predict_y-y))
train_op = tf.train.AdamOptimizer(0.01).minimize(loss)
# Equivalent two-step form:
# optimizer = tf.train.AdamOptimizer(0.01)
# gradient = optimizer.compute_gradients(loss)
# train_op = optimizer.apply_gradients(gradient)

with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    for _ in range(100):
        for index in range(len(x_data)):
            sess.run(train_op, feed_dict={x: x_data[index], y: y_data[index]})
            print(sess.run([w, b, loss], feed_dict={x: x_data[index], y: y_data[index]}))
"""
.....
[array([ 1.00018632], dtype=float32), -0.01594656, 1.4347606e-06]
[array([ 1.00016069], dtype=float32), -0.015954774, 1.0142103e-06]
[array([ 1.00016844], dtype=float32), -0.015952379, 1.4901161e-08]
[array([ 1.00017369], dtype=float32), -0.015950751, 3.0174851e-07]
[array([ 1.0001514], dtype=float32), -0.015957674, 2.0354637e-06]
[array([ 1.00017929], dtype=float32), -0.015949152, 2.1012966e-06]
[array([ 1.00014257], dtype=float32), -0.01596031, 3.9651641e-06]
[array([ 1.00018048], dtype=float32), -0.015948951, 3.6964193e-06]
[array([ 1.0001328], dtype=float32), -0.015963148, 7.171242e-06]
"""

A summary of different gradient descent algorithms
