(Windows 10 edition) Study notes on "TensorFlow: 实战Google深度学习框架" (TensorFlow: Google Deep Learning Framework in Practice), part 3

A recap of what I learned about neural networks:

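Viewed as a whole, a neural network is a computation framework: some inputs --> some hidden layers (computation) --> some outputs (prediction). What we are after is the computational model for the "hidden layers (computation)" part. We can express this model with a family of existing functions, the simplest being the linear function y=Wx+b. But a composition of linear functions is still a linear function, so it is helpless against problems that are not linearly separable. The remedy is to add nonlinear functions, that is, activation functions. TensorFlow provides 7 activation functions, 3 of which are commonly used.

[Figure: the three commonly used activation functions; image not included]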
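To make the "linear on linear is still linear" point concrete, here is a minimal sketch in plain Python with scalar weights. The parameter values are made up for illustration, and the three activations shown (ReLU, sigmoid, tanh) are my assumption about which ones the missing figure illustrated.

```python
import math

# Three widely used activation functions (assumed to be the "common three").
def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    return math.tanh(z)

def two_linear_layers(x, w1, b1, w2, b2):
    # Layer 2 applied to the output of layer 1, no activation in between.
    return w2 * (w1 * x + b1) + b2

def collapsed_layer(x, w1, b1, w2, b2):
    # The same mapping written as ONE linear function: W = w2*w1, b = w2*b1 + b2.
    return (w2 * w1) * x + (w2 * b1 + b2)

def two_layers_with_relu(x, w1, b1, w2, b2):
    # Inserting a nonlinearity between the layers prevents the collapse.
    return w2 * relu(w1 * x + b1) + b2

# Hypothetical parameters, chosen only for illustration.
params = dict(w1=2.0, b1=-1.0, w2=3.0, b2=0.5)
xs = [-2.0, 0.0, 1.0, 3.0]

linear_out = [two_linear_layers(x, **params) for x in xs]
collapsed_out = [collapsed_layer(x, **params) for x in xs]
relu_out = [two_layers_with_relu(x, **params) for x in xs]

print(linear_out)     # identical to collapsed_out
print(collapsed_out)
print(relu_out)       # differs wherever the hidden value is negative
```

The two stacked linear layers and the single collapsed layer agree on every input, while the ReLU version departs from them for inputs whose hidden value is negative, which is exactly the extra expressive power the activation buys.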

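How do we decide whether a given set of parameters (weights + biases) is reasonable? We measure it with an error function: the smaller the error, the better.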
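As a small illustration of scoring parameters with an error function, the sketch below compares two candidate (w, b) pairs using mean squared error. The data and both candidates are hypothetical; MSE is just one possible choice of error function.

```python
def predict(x, w, b):
    # A one-input linear model, y = w*x + b.
    return w * x + b

def mean_squared_error(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Two made-up candidate parameter sets (w, b).
candidates = {"good": (2.0, 1.0), "poor": (1.0, 0.0)}

errors = {
    name: mean_squared_error([predict(x, w, b) for x in xs], ys)
    for name, (w, b) in candidates.items()
}
print(errors)
```

The candidate that matches the data-generating rule gets an error of zero, and the other gets a clearly larger one, so the error function ranks them the way we would want.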

Classification and regression are the two main categories of supervised learning.

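Backpropagation lets us adjust parameters such as w and b by some method (gradient descent, stochastic gradient descent). After the parameters are updated, another forward pass should give a smaller value of the error function. Training the network (repeated forward and backward passes) approaches an ideal minimum error, and that gives us the computational model we wanted.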
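The forward pass / backward pass / update cycle can be sketched without any framework. This is a minimal example of plain (full-batch) gradient descent on a one-input linear model with hand-derived gradients; the data, learning rate, and step count are assumptions picked for illustration, not values from the book.

```python
# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def forward(w, b):
    # Forward pass: compute predictions for every sample.
    return [w * x + b for x in xs]

def mse(preds):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def gradients(w, b):
    # Backward pass: derivatives of the MSE with respect to w and b.
    n = len(xs)
    preds = forward(w, b)
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
    return grad_w, grad_b

w, b = 0.0, 0.0
learning_rate = 0.05
losses = []
for step in range(2000):
    losses.append(mse(forward(w, b)))   # error before this update
    grad_w, grad_b = gradients(w, b)
    w -= learning_rate * grad_w         # move against the gradient
    b -= learning_rate * grad_b

print(w, b, losses[0], losses[-1])
```

With these settings the recorded error shrinks from its starting value toward zero and the parameters approach w = 2, b = 1. In TensorFlow the gradient computation is done automatically by the optimizer's `minimize` call.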

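The process of approaching the minimum error is called convergence. If training converges quickly, reaches a small error, and the network performs well on the test set, the network is well designed. Few iterations and little computation make convergence fast. A small error means high prediction accuracy. Good performance on the test set indicates that the features were extracted accurately and cover the data well.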

The quality of a neural network model, and the objective that optimization aims for, are both defined by the loss function.

loss = tf.reduce_sum(tf.where(tf.greater(y, y_), (y - y_)*loss_more, (y_ - y)*loss_less))
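The expression above can be read element by element: where the prediction exceeds the target, the gap is multiplied by `loss_more`; otherwise it is multiplied by `loss_less`; then everything is summed. Here is a plain-Python sketch of that reading. The function name and the sample values are hypothetical, and it mirrors the formula only in spirit, not TensorFlow's actual implementation.

```python
def asymmetric_loss(preds, targets, loss_more, loss_less):
    """Sum of per-sample costs, charging over- and under-prediction differently."""
    total = 0.0
    for y, y_true in zip(preds, targets):
        if y > y_true:
            total += (y - y_true) * loss_more   # predicted too much
        else:
            total += (y_true - y) * loss_less   # predicted too little (or exact)
    return total

# Made-up values: every prediction is off by 0.5, once too high and once too low.
targets = [1.0, 1.0, 1.0]
over = [1.5, 1.5, 1.5]
under = [0.5, 0.5, 0.5]

loss_over = asymmetric_loss(over, targets, loss_more=1, loss_less=10)
loss_under = asymmetric_loss(under, targets, loss_more=1, loss_less=10)
print(loss_over, loss_under)
```

With `loss_less=10` and `loss_more=1`, the same absolute error costs ten times as much when it is an under-prediction, so an optimizer minimizing this loss is pushed toward predicting on the high side.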

An example of a custom loss function:

# Uses the TensorFlow 1.x API (tf.placeholder, tf.Session)
import tensorflow as tf
from numpy.random import RandomState

batch_size = 8
# Two input nodes
x = tf.placeholder(tf.float32, shape=(None, 2), name='x-input')
# A regression problem usually has a single output node
y_ = tf.placeholder(tf.float32, shape=(None, 1), name='y-input')

# Forward pass of a single-layer network: here just a weighted sum
w1 = tf.Variable(tf.random_normal([2, 1], stddev=1, seed=1))
y = tf.matmul(x, w1)

# Costs of over-predicting (loss_more) and under-predicting (loss_less)
loss_less = 10
loss_more = 1
loss = tf.reduce_sum(tf.where(tf.greater(y, y_),
                              (y - y_) * loss_more,
                              (y_ - y) * loss_less))
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)

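# Generate a simulated dataset from random numbers
rdm = RandomState(1)
dataset_size = 128
X = rdm.rand(dataset_size, 2)
# The regression target is the sum of the two inputs plus a random term.
# The random term adds unpredictable noise; without it, comparing loss functions
# would mean little, because every loss is lowest when the prediction is exactly right.
# Noise is usually a small quantity with mean 0, so here it is a random number
# in the range [-0.05, 0.05).
Y = [[x1 + x2 + rdm.rand() / 10.0 - 0.05] for (x1, x2) in X]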

# Train the network
with tf.Session() as sess:
    # The book uses init_op = tf.initialize_all_variables()
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 5000
    for i in range(STEPS):
        start = (i * batch_size) % dataset_size
        end = (i * batch_size) % dataset_size + batch_size
        # The book uses end = min(start + batch_size, dataset_size)
        sess.run(train_step,
                 feed_dict={x: X[start:end], y_: Y[start:end]})
        if i % 1000 == 0:
            print("after %d training step(s), w1 is:" % (i))
            print(sess.run(w1), "\n")
    print("[loss_less = 10 loss_more = 1] final w1 is :\n", sess.run(w1))
    #print(sess.run(w1))

Output: [screenshot not included]

# Use MSE as the loss function
loss = tf.losses.mean_squared_error(labels=y_, predictions=y)
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 5000
    for i in range(STEPS):
        start = (i * batch_size) % 128
        end = (i * batch_size) % 128 + batch_size
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
        if i % 1000 == 0:
            print("After %d training step(s), w1 is: " % (i))
            print(sess.run(w1), "\n")
    # Must stay inside the `with` block: the session is closed once it exits
    print("[losses.mean_squared_error]Final w1 is: \n", sess.run(w1))

Output: [screenshot not included]

# Redefine the loss so that over-predicting costs more;
# the model should then lean toward predicting on the low side
loss_less = 1
loss_more = 10
loss = tf.reduce_sum(tf.where(tf.greater(y, y_), (y - y_) * loss_more, (y_ - y) * loss_less))
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 5000
    for i in range(STEPS):
        start = (i * batch_size) % 128
        end = (i * batch_size) % 128 + batch_size
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
        if i % 1000 == 0:
            print("After %d training step(s), w1 is: " % (i))
            print(sess.run(w1), "\n")
    print("[loss_less=1 loss_more=10] Final w1 is: \n", sess.run(w1))

Output: [screenshot not included]
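The three runs differ only in their loss function, so it helps to see in isolation how the loss alone shifts the best prediction. The sketch below is a deliberately simplified stand-in, not the network above: it searches for the best single constant prediction over five made-up noisy targets, using a grid of candidates instead of an optimizer. All names and numbers are hypothetical.

```python
# Hypothetical toy data: five noisy observations of a true value of 1.0,
# with noise that is symmetric around zero.
targets = [0.96, 0.98, 1.00, 1.02, 1.04]

def asymmetric_loss(pred, loss_more, loss_less):
    """Summed cost of using one constant prediction for every target."""
    total = 0.0
    for t in targets:
        if pred > t:
            total += (pred - t) * loss_more   # over-prediction
        else:
            total += (t - pred) * loss_less   # under-prediction
    return total

def mse_loss(pred):
    return sum((pred - t) ** 2 for t in targets) / len(targets)

# Candidate predictions 0.90, 0.91, ..., 1.10.
candidates = [k / 100 for k in range(90, 111)]

best_when_under_costly = min(
    candidates, key=lambda p: asymmetric_loss(p, loss_more=1, loss_less=10))
best_with_mse = min(candidates, key=mse_loss)
best_when_over_costly = min(
    candidates, key=lambda p: asymmetric_loss(p, loss_more=10, loss_less=1))

print(best_when_under_costly, best_with_mse, best_when_over_costly)
```

On this toy data the best prediction sits above the true value when under-predicting is expensive, at the mean under MSE, and below the true value when over-predicting is expensive. The trained `w1` values in the three TensorFlow runs would be expected to follow the same ordering, though the exact numbers depend on the data and the optimizer.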
