Deep Learning with TensorFlow: Computation Graphs, Sessions, and Tensors

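1. Computation model: the computation graph

TensorFlow is a programming system that expresses computation as a computation graph. Every computation is a node in the graph, and the edges between nodes describe the dependencies between computations. A program therefore generally has two phases:

1. Define the computation graph

2. Execute the computation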
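
The two phases can be illustrated without TensorFlow at all. The following is a minimal pure-Python sketch of deferred evaluation. The names `Node`, `constant`, `add`, and `run` are hypothetical and only mimic the idea; this is not how TensorFlow is implemented.

```python
class Node:
    """A node in a toy computation graph: an operation plus its input nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op            # "const" or "add"
        self.inputs = inputs    # edges: the nodes this node depends on
        self.value = value      # only set for constant nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    # Building the node records the dependency but computes nothing yet.
    return Node("add", inputs=(a, b))


def run(node):
    """Phase 2: evaluate a node by first evaluating the nodes it depends on."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        left, right = (run(n) for n in node.inputs)
        return [x + y for x, y in zip(left, right)]
    raise ValueError("unknown op: " + node.op)


# Phase 1: define the graph.
a = constant([1.0, 2.0])
b = constant([2.0, 3.0])
result = add(a, b)

# Phase 2: execute it.
print(run(result))  # [3.0, 5.0]
```

Note that `result` holds no value of its own after phase 1; the sum only exists once `run` walks the dependencies.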

tf.Graph creates a new computation graph. Tensors and operations in different graphs are not shared:

import tensorflow as tf

graph_1 = tf.Graph()   # create a graph
with graph_1.as_default():
    # define variable v in graph_1, initialized to zeros
    v = tf.get_variable(name='v', initializer=tf.zeros_initializer(), shape=[1])

graph_2 = tf.Graph()    # create a second graph
with graph_2.as_default():
    # define a variable with the same name in graph_2, initialized to ones
    v = tf.get_variable(name='v', initializer=tf.ones_initializer(), shape=[1])


# read the value of v in graph_1
with tf.Session(graph=graph_1) as sess:
    tf.global_variables_initializer().run()
    with tf.variable_scope("", reuse=True):
        print(sess.run(tf.get_variable('v')))   # v from graph_1 (zeros)

# read the value of v in graph_2
with tf.Session(graph=graph_2) as sess:
    tf.global_variables_initializer().run()
    with tf.variable_scope("", reuse=True):
        print(sess.run(tf.get_variable('v')))   # v from graph_2 (ones)

Besides isolating tensors and computations, a graph can also specify the device on which its computations run:

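with graph_1.as_default(), graph_1.device('/gpu:0'):
    a = tf.constant([1.0, 2.0], name='a')
    b = tf.constant([2.0, 3.0], name='b')
    res = a + b   # this addition is assigned to the first GPU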

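Within a graph, resources can be organized into collections. tf.add_to_collection() adds a resource to a named collection, so different kinds of resources can be managed separately. TensorFlow maintains several collections automatically:
1. tf.GraphKeys.GLOBAL_VARIABLES: all variables; used when persisting a model

2. tf.GraphKeys.TRAINABLE_VARIABLES: trainable variables

3. tf.GraphKeys.SUMMARIES: tensors used for logging

4. tf.GraphKeys.QUEUE_RUNNERS: queue runners that handle input

5. tf.GraphKeys.MOVING_AVERAGE_VARIABLES: variables that track moving averages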

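2. Data model: tensors

Tensors are how TensorFlow manages data, but a tensor does not actually store values. It is a reference to the computation that produces those values. A tensor holds three main attributes: name, shape, and type (dtype).

If no type is specified, TensorFlow picks a default (for example, int32 for numbers written without a decimal point and float32 for numbers written with one). To avoid type-mismatch errors, it is better to specify the type explicitly. Supported types include: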

int8, int16, int32, int64, uint8

float32, float64

bool

complex64

complex128

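3. Execution model: sessions

A session owns and manages all the resources a TensorFlow program uses at run time. Once all computation is finished, the session must be closed so the system can reclaim those resources. Using the session through a Python context manager is strongly recommended, because it prevents a resource leak when the program exits on an exception before the session is closed.

A session can be configured through tf.ConfigProto():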

graph_1 = tf.Graph()   # create a graph
with graph_1.as_default():
    # define variable v in graph_1, initialized to zeros
    v = tf.get_variable(name='v', initializer=tf.zeros_initializer(), shape=[1])

# configure the session
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
# read the value of v in graph_1, using the configuration above
with tf.Session(graph=graph_1, config=config) as sess:
    tf.global_variables_initializer().run()
    with tf.variable_scope("", reuse=True):
        print(sess.run(tf.get_variable('v')))

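tf.ConfigProto() can configure the number of parallel threads, the GPU allocation policy, operation timeouts, and other parameters. Two parameters are used most often.
allow_soft_placement: when this parameter is True, the following conditions are checked:

1. The operation cannot run on a GPU

2. No GPU is available

3. The operation's inputs include a reference to a result computed on the CPU

If one or more of these conditions hold, an operation assigned to a GPU is run on the CPU instead.

Different GPUs and driver versions support slightly different sets of operations. With this parameter set to True, an operation that the GPU cannot execute is moved to the CPU instead of raising an error, which makes the code more portable.

log_device_placement: when this parameter is True, the log records which device each node is placed on, which helps with debugging.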
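
The fallback rule for allow_soft_placement can be written out as a small decision function. This is only a toy model of the rule described above, in plain Python; `choose_device` and its parameter names are hypothetical and are not part of TensorFlow.

```python
def choose_device(requested, gpu_available=True, op_has_gpu_kernel=True,
                  input_pinned_to_cpu=False, allow_soft_placement=False):
    """Return the device an operation ends up on (toy model, not TensorFlow code)."""
    if not requested.startswith("/gpu"):
        return requested  # nothing to decide for CPU placements

    # The three conditions under which the GPU placement cannot be honored.
    blocked = (not op_has_gpu_kernel) or (not gpu_available) or input_pinned_to_cpu
    if not blocked:
        return requested
    if allow_soft_placement:
        return "/cpu:0"   # fall back to the CPU instead of failing
    raise RuntimeError("cannot place operation on " + requested)


print(choose_device("/gpu:0"))                                  # /gpu:0
print(choose_device("/gpu:0", gpu_available=False,
                    allow_soft_placement=True))                 # /cpu:0
```

With allow_soft_placement left at False, the second call would raise instead of falling back, which is the portability problem the parameter is meant to avoid.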

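Neural-network parameters and TensorFlow variables:

A variable, tf.Variable(), stores and updates the parameters of a neural network.

Ways to generate a variable's initial value:
tf.random_normal: random numbers from a normal distribution; parameters (mean, stddev, dtype)

tf.truncated_normal: normal distribution, but any value more than two standard deviations from the mean is redrawn; parameters (mean, stddev, dtype)

tf.random_uniform: uniform distribution; parameters (minval, maxval, dtype)

tf.random_gamma: gamma distribution; parameters (shape parameter alpha, inverse-scale parameter beta, dtype)

tf.zeros, tf.ones, tf.fill (every element set to a given number), tf.constant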
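
To make the "redraw if more than two standard deviations away" rule of the truncated normal concrete, here is a minimal pure-Python sketch using the standard `random` module. The helper name `truncated_normal` is chosen for illustration; it mimics the rule as described above and is not TensorFlow's implementation.

```python
import random


def truncated_normal(mean=0.0, stddev=1.0, rng=random):
    """Draw from N(mean, stddev^2), redrawing while the value lies more than
    two standard deviations from the mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [truncated_normal(0.0, 1.0, rng) for _ in range(1000)]
print(all(abs(s) <= 2.0 for s in samples))  # True
```

Keeping initial weights away from the tails in this way avoids starting a network with a few unusually large parameters.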

A forward-propagation example:

import tensorflow as tf

graph_1 = tf.Graph()   # create a graph
with graph_1.as_default():
    # define the weights and the input in graph_1
    w1 = tf.Variable(tf.random_normal(shape=[2, 3], stddev=1, dtype=tf.float32, seed=1))
    w2 = tf.Variable(tf.random_normal(shape=[3, 1], stddev=1, dtype=tf.float32, seed=1))
    x = tf.constant(value=[[0.7, 0.9]], dtype=tf.float32)

    # forward propagation
    a = tf.matmul(x, w1)
    y = tf.matmul(a, w2)

# configure the session
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
# run the forward pass in graph_1
with tf.Session(graph=graph_1, config=config) as sess:
    # sess.run(w1.initializer)   # variables can also be initialized one by one
    # sess.run(w2.initializer)
    init_op = tf.global_variables_initializer()   # initializes all variables and handles dependencies between them
    sess.run(init_op)
    print(sess.run(y))
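
The same forward pass can be traced by hand to check the shapes. The sketch below uses plain Python lists and a small `matmul` helper written for this illustration. The weight values are assumed for the example; they are not the values tf.random_normal would draw.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Illustrative weights (assumed values).
w1 = [[0.1, 0.2, 0.3],
      [0.4, 0.5, 0.6]]        # shape [2, 3]
w2 = [[1.0], [2.0], [3.0]]   # shape [3, 1]
x = [[0.7, 0.9]]             # shape [1, 2]

a = matmul(x, w1)   # hidden layer, shape [1, 3]
y = matmul(a, w2)   # output, shape [1, 1]
print(len(a), len(a[0]), len(y), len(y[0]))  # 1 3 1 1
```

A [1, 2] input multiplied by a [2, 3] weight matrix gives a [1, 3] hidden layer, and multiplying that by a [3, 1] matrix gives a single output value per input row.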

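Backpropagation:

Backpropagation is the most commonly used method for optimizing neural networks. Training proceeds iteratively: each step takes a batch of training data, runs forward propagation to get predictions, and then updates the parameters through backpropagation so that the predictions move closer to the true labels.

placeholder: defines a slot in the graph that serves as the entry point for training data; the actual values are supplied when the graph is run: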

import tensorflow as tf

graph_1 = tf.Graph()   # create a graph
with graph_1.as_default():
    # define the weights in graph_1
    w1 = tf.Variable(tf.random_normal(shape=[2, 3], stddev=1, dtype=tf.float32, seed=1))
    w2 = tf.Variable(tf.random_normal(shape=[3, 1], stddev=1, dtype=tf.float32, seed=1))
    x = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='input')   # the row count may be None; it is taken from the data that is fed in

    # forward propagation
    a = tf.matmul(x, w1)
    y = tf.matmul(a, w2)

# configure the session
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
# run the forward pass in graph_1 on three input rows
with tf.Session(graph=graph_1, config=config) as sess:
    # sess.run(w1.initializer)   # variables can also be initialized one by one
    # sess.run(w2.initializer)
    init_op = tf.global_variables_initializer()   # initialize all variables
    sess.run(init_op)
    feed_dict = {x: [[0.7, 0.9], [0.1, 0.4], [0.6, 0.8]]}
    print(sess.run(y, feed_dict=feed_dict))

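Loss function:

A brief introduction to loss functions:

After the results for a batch have been computed, a loss function is needed to measure the gap between the current predictions and the true values. An optimization algorithm then adjusts the network parameters to shrink that gap. Cross-entropy is one example of such a loss function.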
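
As a concrete reference for the cross-entropy mentioned above, here is a minimal pure-Python sketch of mean binary cross-entropy with clipped predictions. The function name `binary_cross_entropy` and the `eps` default are illustrative choices, not part of TensorFlow.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-10):
    """Mean of -[t*log(p) + (1-t)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p_pos = min(max(p, eps), 1.0)          # clipped p, safe for log
        p_neg = min(max(1.0 - p, eps), 1.0)    # clipped 1 - p, safe for log
        total += t * math.log(p_pos) + (1.0 - t) * math.log(p_neg)
    return -total / len(y_true)


good = binary_cross_entropy([1, 0], [0.9, 0.1])   # confident and correct
bad = binary_cross_entropy([1, 0], [0.1, 0.9])    # confident and wrong
print(good < bad)  # True
```

The clipping matters when a prediction is exactly 0 or 1: without it, `math.log(0)` would raise an error, and the TensorFlow equivalent would produce an infinite loss.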

Three optimizers commonly used in TensorFlow:
1. tf.train.GradientDescentOptimizer: gradient descent

2. tf.train.AdamOptimizer

3. tf.train.MomentumOptimizer
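
To show what the first and third optimizers do at each step, here is a minimal pure-Python sketch of the plain gradient-descent and momentum update rules on a toy loss f(w) = (w - 3)^2. The toy loss, the function names, and the hyperparameter values are assumptions made for illustration. Adam is left out to keep the sketch short.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def gradient_descent(w, learning_rate=0.1, steps=500):
    for _ in range(steps):
        w -= learning_rate * grad(w)          # step against the gradient
    return w


def momentum(w, learning_rate=0.1, decay=0.9, steps=500):
    velocity = 0.0
    for _ in range(steps):
        velocity = decay * velocity + grad(w)  # accumulate past gradients
        w -= learning_rate * velocity
    return w


print(abs(gradient_descent(0.0) - 3.0) < 1e-6)  # True
print(abs(momentum(0.0) - 3.0) < 1e-6)          # True
```

Both rules reach the minimum at w = 3 here. They differ in that momentum keeps a running accumulation of earlier gradients, so each step depends on the history as well as the current slope.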

Complete program:

import tensorflow as tf
from numpy.random import RandomState
import numpy as np


graph_1 = tf.Graph()   # create a graph
with graph_1.as_default():
    # define the weights in graph_1
    w1 = tf.Variable(tf.random_normal(shape=[2, 3], stddev=1, dtype=tf.float32, seed=1))
    w2 = tf.Variable(tf.random_normal(shape=[3, 1], stddev=1, dtype=tf.float32, seed=1))

    x = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='x-input')   # the row count may be None; it is taken from the data that is fed in
    y_ = tf.placeholder(dtype=tf.float32, shape=[None, 1], name='y-input')

    # forward propagation
    a = tf.matmul(x, w1)
    y = tf.matmul(a, w2)

    # define the loss function and the backpropagation step
    y = tf.sigmoid(y)
    # tf.clip_by_value(t, clip_value_min, clip_value_max) keeps t in a safe range, so log(0) never occurs
    cross_entropy = -tf.reduce_mean(
        y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0))
        + (1 - y_) * tf.log(tf.clip_by_value(1 - y, 1e-10, 1.0)))
    train_step = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cross_entropy)    # optimization step that minimizes the loss


if __name__ == "__main__":
    batch_size = 8
    rdm = RandomState(1)
    data_size = 128
    X = rdm.rand(data_size, 2)    # training data
    Y = [int(x1 + x2 < 1) for (x1, x2) in X]   # label is 1 when x1 + x2 < 1, otherwise 0
    Y = np.array(Y)

    print(type(X))
    print(type(Y))
    # configure the session
    config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=False)
    # train the network defined in graph_1
    with tf.Session(graph=graph_1, config=config) as sess:
        init_op = tf.global_variables_initializer()   # initialize all variables
        sess.run(init_op)

        print("Weights before training")
        print(sess.run(w1))
        print(sess.run(w2))

        steps = 1000
        for i in range(steps):
            # pick the next batch, wrapping around at the end of the data
            start = (i * batch_size) % data_size
            end = min(start + batch_size, data_size)
            sess.run(train_step, feed_dict={x: X[start: end], y_: Y[start: end].reshape([-1, 1])})
            # every 50 steps, compute the cross-entropy over the whole data set
            if i % 50 == 0:
                cross_entropy_value = sess.run(cross_entropy, feed_dict={x: X, y_: Y.reshape([-1, 1])})
                print("The loss is {}".format(cross_entropy_value))
        print("Weights after training")
        print(sess.run(w1))
        print(sess.run(w2))
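
The batch-selection arithmetic in the training loop is easy to misread, so here it is in isolation. This pure-Python sketch reuses the program's `batch_size` and `data_size` values; the helper name `batch_bounds` is introduced only for this illustration.

```python
batch_size = 8
data_size = 128


def batch_bounds(step):
    """Return the [start, end) slice of the data used at a given training step."""
    start = (step * batch_size) % data_size
    end = min(start + batch_size, data_size)
    return start, end


for step in (0, 1, 15, 16, 17):
    print(step, batch_bounds(step))
# 0 (0, 8)
# 1 (8, 16)
# 15 (120, 128)
# 16 (0, 8)
# 17 (8, 16)
```

With 128 samples and batches of 8, the loop passes over the whole data set every 16 steps, so 1000 steps amount to 62.5 passes.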