TensorFlow in Detail

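This article reposts material from the following TensorFlow articles:
Roseki in Plain Language: TensorFlow + Hands-on Series (1), Tensor and Flow Explained: https://blog.csdn.net/roseki/article/details/70115369
Roseki in Plain Language: TensorFlow + Hands-on Series (2), Building a Traditional Neural Network from Scratch: https://blog.csdn.net/roseki/article/details/70171684
Roseki in Plain Language: TensorFlow + Hands-on Series (3), Common Loss Functions and Parameter Optimization: https://blog.csdn.net/roseki/article/details/70241091
Roseki in Plain Language: TensorFlow + Hands-on Series (4), Variable Management: https://blog.csdn.net/roseki/article/details/70832143
Roseki in Plain Language: TensorFlow + Hands-on Series (5), MNIST in Practice: https://blog.csdn.net/roseki/article/details/72571459

Contents

TensorFlow functions: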
tf.truncated_normal

tf.random_normal

tf.nn.conv2d

tf.nn.max_pool

tf.reshape

tf.nn.softmax

tf.reduce_sum

tf.reduce_max, tf.reduce_mean

tf.train.Optimizer

tf.train.GradientDescentOptimizer

tf.train.AdagradOptimizer

tf.train.MomentumOptimizer

tf.train.AdamOptimizer
tf.truncated_normal

tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

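Outputs random values from a truncated normal distribution.
The generated values follow a normal distribution with the specified mean and standard deviation, except that any value more than 2 standard deviations from the mean is dropped and re-picked.

For a normal distribution, the area under the curve within the interval (μ-σ, μ+σ) is 68.268949%.
Within (μ-2σ, μ+2σ) the area is 95.449974%.
Within (μ-3σ, μ+3σ) the area is 99.730020%.
The probability that X falls outside (μ-3σ, μ+3σ) is less than three in a thousand. In practice such an event is often treated as one that does not occur, so (μ-3σ, μ+3σ) can be regarded as the effective range of the random variable X. This is known as the "3σ" rule of the normal distribution.
In tf.truncated_normal, a value x that falls outside (μ-2σ, μ+2σ) is drawn again. This guarantees that every generated value lies within two standard deviations of the mean.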

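Parameters:

shape: A 1-D integer tensor or Python list giving the shape of the output tensor.
mean: The mean of the normal distribution.
stddev: The standard deviation of the normal distribution.
dtype: The type of the output.
seed: An integer. Once it is set, the same random numbers are generated on every run.
name: A name for the operation.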
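The re-draw rule described above can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, assuming simple rejection sampling; the helper name truncated_normal_sample is hypothetical and this is not how TensorFlow implements the op internally.

```python
import random

def truncated_normal_sample(mean=0.0, stddev=1.0, rng=random):
    """Draw from N(mean, stddev^2), redrawing until the value
    lies within 2 standard deviations of the mean."""
    while True:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2.0 * stddev:
            return x

# A seeded generator makes the run reproducible, similar in spirit to `seed`.
rng = random.Random(0)
samples = [truncated_normal_sample(mean=1.0, stddev=0.5, rng=rng)
           for _ in range(1000)]

# Every sample lies in [mean - 2*stddev, mean + 2*stddev] = [0.0, 2.0].
print(min(samples), max(samples))
```

Roughly 4.55% of the raw draws are rejected, which matches the 95.449974% figure quoted for the (μ-2σ, μ+2σ) interval.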

tf.random_normal

tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

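Outputs random values from a normal distribution.
Parameters:

shape: A 1-D integer tensor or Python list giving the shape of the output tensor.
mean: The mean of the normal distribution.
stddev: The standard deviation of the normal distribution.
dtype: The type of the output.
seed: An integer. Once it is set, the same random numbers are generated on every run.
name: A name for the operation.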

tf.nn.conv2d

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)

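tf.nn.conv2d is the function that implements convolution in TensorFlow. It is one of the core building blocks of a convolutional neural network.

Apart from name, which sets the name of the operation, five parameters affect the computation:

First parameter, input: the input images to convolve. It must be a 4-D Tensor of shape [batch, in_height, in_width, in_channels], that is, [number of images in a training batch, image height, image width, number of image channels]. Its type must be a floating-point type such as float32 or float64.
Second parameter, filter: the convolution kernel of the CNN. It must be a Tensor of shape [filter_height, filter_width, in_channels, out_channels], that is, [kernel height, kernel width, number of image channels, number of kernels], with the same type as input. Note that its third dimension, in_channels, must equal the fourth dimension of input.
Third parameter, strides: the stride of the convolution along each dimension of input, given as a 1-D vector of length 4.
Fourth parameter, padding: a string, either "SAME" or "VALID", which selects the padding scheme.
Fifth parameter, use_cudnn_on_gpu: a bool that controls whether cuDNN acceleration is used; it defaults to True.

The result is a Tensor, the so-called feature map, whose shape is again of the form [batch, height, width, channels].

Whether padding is 'SAME' or 'VALID', it behaves the same way in conv2d and max_pool.
With padding = 'SAME', the output is not necessarily the same size as the input (it is smaller when the stride is greater than 1), but every input pixel is covered and no border elements are dropped.
With padding = 'VALID', the output is never larger than the input and is smaller whenever the window is larger than 1. It may not cover every input element (that is, some border elements may be dropped).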
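The effect of the two padding modes on the output size can be worked out with a little arithmetic. The sketch below uses the per-dimension formulas commonly given for TensorFlow (ceil(in / stride) for 'SAME', ceil((in - window + 1) / stride) for 'VALID'); the helper name conv_output_size is hypothetical.

```python
import math

def conv_output_size(in_size, window, stride, padding):
    """Output length along one spatial dimension for conv2d / max_pool."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - window + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# 28-pixel side, 5-wide kernel, stride 1
print(conv_output_size(28, 5, 1, "SAME"))   # 28
print(conv_output_size(28, 5, 1, "VALID"))  # 24

# 5-pixel side, 2-wide window, stride 2: 'VALID' drops the last column
print(conv_output_size(5, 2, 2, "SAME"))    # 3
print(conv_output_size(5, 2, 2, "VALID"))   # 2
```

In the last case the 'VALID' windows cover columns 0-1 and 2-3 only, so column 4 is never visited, while 'SAME' pads the border so that it is.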

tf.nn.max_pool

tf.nn.max_pool(value, ksize, strides, padding, name=None)

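Max pooling is the maximum-value pooling operation used in CNNs. Its usage closely resembles that of convolution.

First parameter, value: the input to pool. A pooling layer usually follows a convolutional layer, so the input is typically a feature map, again of shape [batch, height, width, channels].
Second parameter, ksize: the size of the pooling window, a 4-element vector, usually [1, height, width, 1]. We do not want to pool over the batch and channels dimensions, so both are set to 1.
Third parameter, strides: as with convolution, the step of the window along each dimension, usually [1, stride, stride, 1].
Fourth parameter, padding: as with convolution, either 'VALID' or 'SAME'.

Returns a Tensor of the same type, whose shape is again of the form [batch, height, width, channels].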
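To make the window and stride parameters concrete, here is a minimal plain-Python sketch of max pooling over a single image with a single channel, assuming 'VALID' padding. The function max_pool_2d is a hypothetical illustration, not the TensorFlow op, and it ignores the batch and channels dimensions.

```python
def max_pool_2d(image, ksize, stride):
    """Max-pool a 2-D list with a ksize x ksize window ('VALID' padding)."""
    height, width = len(image), len(image[0])
    pooled = []
    for top in range(0, height - ksize + 1, stride):
        row = []
        for left in range(0, width - ksize + 1, stride):
            window = [image[top + i][left + j]
                      for i in range(ksize) for j in range(ksize)]
            row.append(max(window))
        pooled.append(row)
    return pooled

image = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled = max_pool_2d(image, ksize=2, stride=2)
print(pooled)  # [[6, 8], [3, 4]]
```

This corresponds to ksize=[1, 2, 2, 1] and strides=[1, 2, 2, 1] in the signature above: each 2x2 block of the feature map is replaced by its largest value.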
tf.reshape

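tf.reshape(tensor, shape, name=None)

Reshapes tensor into the form given by shape, which is a list of dimension sizes. A dimension of -1 is special: it means we do not have to specify the size of that dimension ourselves, because the function computes it from the total number of elements. The list may contain at most one -1; with more than one, the equation would have multiple solutions.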
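How the -1 dimension is resolved can be shown with plain arithmetic. The following sketch is a hypothetical helper (infer_shape is not a TensorFlow function) that divides the total element count by the product of the known dimensions.

```python
def infer_shape(num_elements, shape):
    """Replace a single -1 in shape with the size implied by num_elements."""
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension may be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    resolved = list(shape)
    if unknown:
        if known == 0 or num_elements % known != 0:
            raise ValueError("shape does not divide the number of elements")
        resolved[unknown[0]] = num_elements // known
    elif known != num_elements:
        raise ValueError("shape does not match the number of elements")
    return resolved

# 100 flattened 28x28 grayscale images reshaped for a convolutional layer
print(infer_shape(100 * 784, [-1, 28, 28, 1]))  # [100, 28, 28, 1]
```

With two unknowns, for example [-1, -1], there is no unique answer, which is why the sketch raises an error in that case.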
tf.nn.softmax

tf.nn.softmax(logits, axis=None, name=None, dim=None)

logits:
A non-empty Tensor.
Must be one of the following types: half, float32, float64.
axis:
The dimension softmax would be performed on.
The default is -1 which indicates the last dimension.
name:
A name for the operation (optional).
dim:
Deprecated alias for axis.

Softmax regression generalizes logistic regression, which predicts the probability of a binary classification, to the probabilities of an n-class problem. From the formula

$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$$

it can be seen that when the number of classes n is 2, softmax regression reduces to logistic regression.

The following few lines show how it is used (TensorFlow 1.x, which provides tf.Session):

# -*- coding: utf-8 -*-
import tensorflow as tf

A = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

with tf.Session() as sess:
    print(sess.run(tf.nn.softmax(A)))

# Output (all values sum to 1):
# [ 0.00426978  0.01160646  0.03154963  0.08576079  0.23312201  0.63369131]
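The claim that softmax reduces to logistic regression for two classes can be checked numerically. This is a small sketch in plain Python; the softmax and sigmoid helpers are written here for illustration and are not TensorFlow calls.

```python
import math

def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic function used by logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))

z1, z2 = 2.0, 0.5
probs = softmax([z1, z2])

# With two classes, the first softmax probability equals sigmoid(z1 - z2).
print(probs[0], sigmoid(z1 - z2))
```

Dividing numerator and denominator of the two-class softmax by exp(z1) gives 1 / (1 + exp(-(z1 - z2))), which is exactly the logistic function applied to the difference of the two logits.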

tf.reduce_sum

tf.reduce_sum(input_tensor, axis=None, keep_dims=False, name=None, reduction_indices=None)

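input_tensor: The input tensor.
axis: The dimension along which to sum. If None, all dimensions are reduced.
keep_dims: Whether to retain the reduced dimensions. With False, the result has one fewer dimension for each reduced axis; with True, the reduced dimensions are kept with length 1.
reduction_indices: Kept for compatibility with older versions; it is the old name for axis and is no longer used.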
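The roles of axis and keep_dims are easiest to see on a small matrix. The sketch below mimics the behaviour for 2-D nested lists only; reduce_sum_2d is a hypothetical helper, not the TensorFlow op.

```python
def reduce_sum_2d(matrix, axis=None, keep_dims=False):
    """Sum a 2-D nested list over all elements, rows (axis=0) or columns (axis=1)."""
    if axis is None:
        total = sum(sum(row) for row in matrix)
        return [[total]] if keep_dims else total
    if axis == 0:
        sums = [sum(column) for column in zip(*matrix)]
        return [sums] if keep_dims else sums
    if axis == 1:
        sums = [sum(row) for row in matrix]
        return [[s] for s in sums] if keep_dims else sums
    raise ValueError("axis must be None, 0 or 1 for a 2-D input")

x = [[1, 1, 1],
     [1, 1, 1]]
print(reduce_sum_2d(x))                          # 6
print(reduce_sum_2d(x, axis=0))                  # [2, 2, 2]
print(reduce_sum_2d(x, axis=1))                  # [3, 3]
print(reduce_sum_2d(x, axis=1, keep_dims=True))  # [[3], [3]]
```

The last call shows what keep_dims preserves: the result is still 2-D, with the summed dimension shrunk to length 1.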
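tf.reduce_max, tf.reduce_mean

Computes the maximum:

tf.reduce_max(input_tensor, reduction_indices=None, keep_dims=False, name=None)

Computes the mean:

tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)

Parameter 1, input_tensor: the tensor to reduce.

Parameter 2, reduction_indices: the dimension along which to reduce.

Parameters 3 and 4 (keep_dims and name) can usually be ignored.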
tf.train.Optimizer

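The base class for optimizers. This class defines the API for adding an operation that trains a model. You will almost never use this class directly; instead you use its subclasses, such as GradientDescentOptimizer, AdagradOptimizer, and MomentumOptimizer.
The sections below cover the methods of GradientDescentOptimizer in some detail and then only the constructors of the other classes, because their remaining methods are much the same.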
tf.train.GradientDescentOptimizer

__init__(learning_rate, use_locking=False, name='GradientDescent')

Purpose: creates a gradient descent optimizer object.
Parameters:

learning_rate: A Tensor or a floating point value. The learning rate to use.
use_locking: If True, use locks for update operations.
name: Optional name; defaults to "GradientDescent".

compute_gradients(loss, var_list=None, gate_gradients=GATE_OP, aggregation_method=None, colocate_gradients_with_ops=False, grad_loss=None)

Purpose: computes the gradients of the loss with respect to the variables in var_list. It returns a list of (gradient, variable) pairs, where each gradient is the gradient for the corresponding variable. This is the first part of minimize().
Parameters:

loss: The value to minimize.
var_list: Defaults to the variables collected under GraphKeys.TRAINABLE_VARIABLES.
gate_gradients: How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
aggregation_method: Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
colocate_gradients_with_ops: If True, try colocating gradients with the corresponding op.
grad_loss: Optional. A Tensor holding the gradient computed for loss.

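apply_gradients(grads_and_vars, global_step=None, name=None)

Purpose: applies the gradients to the variables. In effect, it updates each variable according to the gradient descent rule. This is the second part of minimize(). It returns an operation that applies the gradients.
Parameters:

grads_and_vars: The list of (gradient, variable) pairs returned by compute_gradients().
global_step: Optional Variable to increment by one after the variables have been updated.
name: Optional name.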

get_name()
 
minimize(loss, global_step=None, var_list=None, gate_gradients=GATE_OP, aggregation_method=None, colocate_gradients_with_ops=False, name=None, grad_loss=None)

Purpose: a very commonly used method.
It reduces loss by updating the variables in var_list, combining compute_gradients() and apply_gradients() described above.
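The relationship between the three methods can be sketched in plain Python on a one-variable loss. The functions below only borrow the names of the TensorFlow methods for readability; they are hypothetical stand-ins with simplified signatures, and the loss (w - 3)^2 is an assumption chosen so that the gradient can be written by hand.

```python
def compute_gradients(w):
    """Return [(gradient, variable)] for loss(w) = (w - 3)^2."""
    return [(2.0 * (w - 3.0), w)]

def apply_gradients(grads_and_vars, learning_rate):
    """Plain gradient descent: variable - learning_rate * gradient."""
    return [var - learning_rate * grad for grad, var in grads_and_vars]

def minimize(w, learning_rate):
    """One training step: compute the gradients, then apply them."""
    return apply_gradients(compute_gradients(w), learning_rate)[0]

w = 0.0
for _ in range(100):
    w = minimize(w, learning_rate=0.1)

print(w)  # approaches the minimizer w = 3
```

Calling the two halves separately is useful when the gradients need processing (for example clipping) between the compute and apply steps; otherwise minimize() is the convenient one-call form.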
tf.train.AdagradOptimizer

tf.train.AdagradOptimizer.__init__(learning_rate, initial_accumulator_value=0.1, use_locking=False, name='Adagrad')

learning_rate: A Tensor or a floating point value. The learning rate.
initial_accumulator_value: A floating point value. Starting value for the accumulators, must be positive.
use_locking: If True use locks for update operations.
name: Optional name prefix for the operations created when applying gradients. Defaults to "Adagrad".
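To show what the accumulator and initial_accumulator_value do, here is a minimal plain-Python sketch of an Adagrad-style step. It assumes the commonly described rule (accumulate squared gradients, then divide the step by the square root of the accumulator) and a hand-written loss (w - 3)^2; adagrad_step is a hypothetical helper, not the TensorFlow implementation.

```python
import math

def adagrad_step(var, accum, grad, learning_rate):
    """One Adagrad-style update for a single scalar variable."""
    accum = accum + grad * grad                         # running sum of squared gradients
    var = var - learning_rate * grad / math.sqrt(accum)
    return var, accum

w, accum = 0.0, 0.1   # 0.1 plays the role of initial_accumulator_value
for _ in range(500):
    grad = 2.0 * (w - 3.0)                              # gradient of (w - 3)^2
    w, accum = adagrad_step(w, accum, grad, learning_rate=0.5)

print(w)  # approaches the minimizer w = 3
```

Because the accumulator only grows, the effective step size shrinks over time, which is why the starting value must be positive: it keeps the first division well defined.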

tf.train.MomentumOptimizer

tf.train.MomentumOptimizer.__init__(learning_rate, momentum, use_locking=False, name='Momentum', use_nesterov=False)

Parameters:

learning_rate: A Tensor or a floating point value. The learning rate.
momentum: A Tensor or a floating point value. The momentum.
use_locking: If True use locks for update operations.
name: Optional name prefix for the operations created when applying gradients. Defaults to "Momentum".
use_nesterov: If True use Nesterov Momentum.
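The role of the momentum parameter can be illustrated with a plain-Python sketch of the classical (non-Nesterov) momentum update, assuming the rule accum = momentum * accum + grad followed by var -= learning_rate * accum. The helper momentum_step and the loss (w - 3)^2 are hypothetical choices for illustration.

```python
def momentum_step(var, accum, grad, learning_rate, momentum):
    """One classical momentum update for a single scalar variable."""
    accum = momentum * accum + grad      # decaying sum of past gradients
    var = var - learning_rate * accum
    return var, accum

w, accum = 0.0, 0.0
for _ in range(500):
    grad = 2.0 * (w - 3.0)               # gradient of (w - 3)^2
    w, accum = momentum_step(w, accum, grad,
                             learning_rate=0.1, momentum=0.9)

print(w)  # approaches the minimizer w = 3
```

With momentum=0 the accumulator equals the current gradient and the update is ordinary gradient descent; larger values let past gradients keep pushing the variable in the same direction.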

tf.train.AdamOptimizer

tf.train.AdamOptimizer.__init__(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, use_locking=False, name='Adam')

An optimizer that implements the Adam algorithm.

Parameters:

learning_rate: A Tensor or a floating point value. The learning rate.
beta1: A float value or a constant float tensor. The exponential decay rate for the 1st moment estimates.
beta2: A float value or a constant float tensor. The exponential decay rate for the 2nd moment estimates.
epsilon: A small constant for numerical stability.
use_locking: If True use locks for update operations.
name: Optional name for the operations created when applying gradients. Defaults to "Adam".
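How beta1, beta2 and epsilon enter the update can be sketched in plain Python. The step below assumes the usual formulation of Adam with a bias-corrected learning rate; adam_step is a hypothetical helper for a single scalar variable, and the loss (w - 3)^2 is chosen only so the gradient can be written by hand.

```python
import math

def adam_step(var, m, v, grad, t, learning_rate=0.001,
              beta1=0.9, beta2=0.999, epsilon=1e-08):
    """One Adam-style update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad           # 1st moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad    # 2nd moment estimate
    lr_t = learning_rate * math.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)
    var = var - lr_t * m / (math.sqrt(v) + epsilon)
    return var, m, v

# First step: the move is close to learning_rate, whatever the gradient size.
first, _, _ = adam_step(0.0, 0.0, 0.0, grad=-6.0, t=1)
print(first)

w, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    grad = 2.0 * (w - 3.0)                         # gradient of (w - 3)^2
    w, m, v = adam_step(w, m, v, grad, t, learning_rate=0.1)

print(w)  # moves toward the minimizer w = 3
```

The first print illustrates a well-known property of this rule: dividing by the square root of the second moment normalizes the step, so learning_rate roughly sets the step length rather than merely scaling the gradient.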
