TensorFlow Study Notes + Code (3): Model Optimization

1 Cost Functions

1.1 Overview of Different Cost Functions

1.1.1 Quadratic Cost Function

(1). The gradients of w and b are proportional to the gradient of the activation function: the larger the activation function's gradient, the faster w and b are adjusted, and the faster training converges.

1.1.2 Cross-Entropy Cost Function

(1). The gradients of w and b are proportional to the error between the output value and the target value (sign included): the larger the error, the larger the gradients of the weights and biases, and the faster the parameters are trained.
(2). If the output neurons are linear, the quadratic cost function is a reasonable choice; if the activation function is an S-shaped function such as sigmoid or tanh, the cross-entropy cost function should be used (see the gradient sketch below).
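A minimal sketch of the reason (my own notation, not from the original notes): for a single sigmoid neuron with a = \sigma(z), z = wx + b,

C_{quad} = \frac{1}{2}(a - y)^2, \qquad \frac{\partial C_{quad}}{\partial w} = (a - y)\,\sigma'(z)\,x
C_{ce} = -[\,y \ln a + (1 - y)\ln(1 - a)\,], \qquad \frac{\partial C_{ce}}{\partial w} = (a - y)\,x

The \sigma'(z) factor in the quadratic case slows learning wherever the neuron saturates; it cancels in the cross-entropy case, so the update is driven only by the error (a - y).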

1.1.3 Log-Likelihood Cost Function

(1). The log-likelihood cost function is commonly used as the cost function for softmax regression. If the output-layer neurons use the sigmoid function, the cross-entropy cost function can be used; in deep learning, however, the more common practice is to make softmax the last layer, in which case the usual cost function is the log-likelihood cost function.
(2). The combination of the log-likelihood cost function with softmax is very similar to the combination of cross-entropy with sigmoid. In the binary-classification case, the log-likelihood cost function reduces to the form of the cross-entropy cost function.
(3). In TensorFlow (a short usage sketch follows below):
tf.nn.sigmoid_cross_entropy_with_logits() gives the cross-entropy used together with sigmoid.
tf.nn.softmax_cross_entropy_with_logits() gives the cross-entropy used together with softmax.
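A minimal usage sketch (the tensor values here are illustrative assumptions, not from the original program); both ops expect raw, unscaled logits and apply softmax/sigmoid internally:

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])   # raw scores, no softmax applied yet
labels = tf.constant([[1.0, 0.0, 0.0]])   # one-hot target

# softmax is applied inside the op, so pass the raw logits, not softmax(logits)
softmax_loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# for independent binary targets, sigmoid is applied inside the op
sigmoid_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run([softmax_loss, sigmoid_loss]))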

1.2 Program

import tensorflow as tf 
from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data",one_hot=True)

# Size of each batch
batch_size = 100
# Number of batches in total
n_batch = mnist.train.num_examples // batch_size # integer division

# Define placeholders
x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])

# Build a simple neural network
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
prediction = tf.nn.softmax(tf.matmul(x,W) + b)

# Define the cost function
# loss = tf.reduce_mean(tf.square(y - prediction)) # (quadratic)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=prediction))
# Note: strictly speaking this op expects raw logits (tf.matmul(x,W) + b);
# passing the already-softmaxed `prediction` still runs, but is not the intended usage


# Use gradient descent to minimize the cost function
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

# Initialize variables
init = tf.global_variables_initializer()

# Define how accuracy is computed
correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(prediction,1))

# Compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(21):
        for batch in range(n_batch):
            batch_xs,batch_ys = mnist.train.next_batch(batch_size)
           
            sess.run(train_step,feed_dict = {x:batch_xs,y:batch_ys})
        
        # Evaluate accuracy on the test set
        acc = sess.run(accuracy,feed_dict = {x:mnist.test.images,y:mnist.test.labels})
        print("Iter " + str(epoch) + ".Testing Accuracy " + str(acc))


# Output (quadratic cost function):
'''
Iter 0.Testing Accuracy 0.8316
Iter 1.Testing Accuracy 0.871
Iter 2.Testing Accuracy 0.8819
Iter 3.Testing Accuracy 0.889
Iter 4.Testing Accuracy 0.8943
Iter 5.Testing Accuracy 0.8972
Iter 6.Testing Accuracy 0.9007
Iter 7.Testing Accuracy 0.9024
Iter 8.Testing Accuracy 0.9028
Iter 9.Testing Accuracy 0.9043
Iter 10.Testing Accuracy 0.9064
Iter 11.Testing Accuracy 0.9075
Iter 12.Testing Accuracy 0.9086
Iter 13.Testing Accuracy 0.9091
Iter 14.Testing Accuracy 0.9103
Iter 15.Testing Accuracy 0.911
Iter 16.Testing Accuracy 0.9112
Iter 17.Testing Accuracy 0.9126
Iter 18.Testing Accuracy 0.9128
Iter 19.Testing Accuracy 0.9128
Iter 20.Testing Accuracy 0.9142
'''
## Accuracy took 7 epochs to reach 90% and 15 epochs to reach 91%

# Output (cross-entropy cost function):
'''
Iter 0.Testing Accuracy 0.825
Iter 1.Testing Accuracy 0.8767
Iter 2.Testing Accuracy 0.8999
Iter 3.Testing Accuracy 0.905   
Iter 4.Testing Accuracy 0.9084
Iter 5.Testing Accuracy 0.9099
Iter 6.Testing Accuracy 0.9117
Iter 7.Testing Accuracy 0.9139
Iter 8.Testing Accuracy 0.9151
Iter 9.Testing Accuracy 0.9163
Iter 10.Testing Accuracy 0.9174
Iter 11.Testing Accuracy 0.918
Iter 12.Testing Accuracy 0.917
Iter 13.Testing Accuracy 0.9195
Iter 14.Testing Accuracy 0.9198
Iter 15.Testing Accuracy 0.92
Iter 16.Testing Accuracy 0.9208
Iter 17.Testing Accuracy 0.9217
Iter 18.Testing Accuracy 0.9204
Iter 19.Testing Accuracy 0.9214
Iter 20.Testing Accuracy 0.9223
'''
## Accuracy reached 90% in only 4 epochs and 91% in only 7 epochs
## which shows the benefit of the cross-entropy cost

2 Overfitting

2.1 Methods for Preventing Overfitting

(1). Enlarge the dataset.
(2). Add a regularization term to the cost function (see the sketch after this list):
With the regularization term appended to the loss function, the regularization term is also optimized while the network is trained. If a weight being optimized is already close to 0, it can be treated as 0 during optimization; and when all weights connected to a neuron are 0, that neuron can be regarded as nonexistent, which reduces the complexity of the network.
(3). Dropout:
During training only a fraction of the neurons are used; at inference time all neurons are activated again.
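A minimal sketch of item (2), adding an L2 weight penalty to the data term (the helper name, the `weights` list, and the coefficient 5e-4 are my own illustrative choices, not part of the original program):

import tensorflow as tf

def l2_regularized_loss(labels, logits, weights, lam=5e-4):
    # data term: ordinary softmax cross-entropy
    data_term = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))
    # regularization term: sum of L2 norms of all weight matrices
    l2_term = tf.add_n([tf.nn.l2_loss(w) for w in weights])
    return data_term + lam * l2_term

Pushing weights toward 0 this way is what lets whole neurons effectively drop out of the network, as described above.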

2.2 Program (Dropout)

import tensorflow as tf 
from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data",one_hot=True)

# Size of each batch
batch_size = 100
# Number of batches in total
n_batch = mnist.train.num_examples // batch_size # integer division

# Define placeholders
x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])
keep_prob = tf.placeholder(tf.float32)


# Build a simple neural network
W1 = tf.Variable(tf.truncated_normal([784,1000],stddev=0.1)) 
# Initialize the weights from a truncated normal distribution (stddev = 0.1)
b1 = tf.Variable(tf.zeros([1000])+0.1)
L1 = tf.nn.tanh(tf.matmul(x,W1)+b1)
L1_drop = tf.nn.dropout(L1,keep_prob)
# In dropout(), the first argument L1 is the output of a layer; keep_prob is the probability that each neuron is kept active

W2 = tf.Variable(tf.truncated_normal([1000,1000],stddev=0.1)) 
b2 = tf.Variable(tf.zeros([1000])+0.1)
L2 = tf.nn.tanh(tf.matmul(L1_drop,W2)+b2)
L2_drop = tf.nn.dropout(L2,keep_prob)

W3 = tf.Variable(tf.truncated_normal([1000,10],stddev=0.1)) 
b3 = tf.Variable(tf.zeros([10])+0.1)

prediction = tf.nn.softmax(tf.matmul(L2_drop,W3) + b3)

# Define the cost function
# loss = tf.reduce_mean(tf.square(y - prediction)) # (quadratic)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=prediction))


# Use gradient descent to minimize the cost function
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

# Initialize variables
init = tf.global_variables_initializer()

# Define how accuracy is computed
correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(prediction,1))

# Compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(50):
        for batch in range(n_batch):
            batch_xs,batch_ys = mnist.train.next_batch(batch_size)
           
            sess.run(train_step,feed_dict = {x:batch_xs,y:batch_ys,keep_prob:0.7})
        
        # Evaluate accuracy
        test_acc = sess.run(accuracy,feed_dict = {x:mnist.test.images,y:mnist.test.labels,keep_prob:0.7})
        train_acc = sess.run(accuracy,feed_dict = {x:mnist.train.images,y:mnist.train.labels,keep_prob:0.7})
        print("Iter " + str(epoch) + ".Testing Accuracy " + str(test_acc) + ".Training Accuracy " + str(train_acc))

# Output (keep_prob = 1.0, i.e. 100% of neurons active during training)
'''
Iter 0.Testing Accuracy 0.934.Training Accuracy 0.9389455
Iter 1.Testing Accuracy 0.9488.Training Accuracy 0.9559818
Iter 2.Testing Accuracy 0.9557.Training Accuracy 0.9660182
Iter 3.Testing Accuracy 0.959.Training Accuracy 0.9710182
Iter 4.Testing Accuracy 0.961.Training Accuracy 0.97523636
Iter 5.Testing Accuracy 0.9661.Training Accuracy 0.97934544
Iter 6.Testing Accuracy 0.9659.Training Accuracy 0.98145455
Iter 7.Testing Accuracy 0.9687.Training Accuracy 0.9834727
Iter 8.Testing Accuracy 0.9695.Training Accuracy 0.9849455
Iter 9.Testing Accuracy 0.9701.Training Accuracy 0.9859273
Iter 10.Testing Accuracy 0.9704.Training Accuracy 0.9870727
Iter 11.Testing Accuracy 0.9722.Training Accuracy 0.98807275
Iter 12.Testing Accuracy 0.9726.Training Accuracy 0.98865455
Iter 13.Testing Accuracy 0.9728.Training Accuracy 0.9892909      
Iter 14.Testing Accuracy 0.9741.Training Accuracy 0.98981816     
Iter 15.Testing Accuracy 0.9739.Training Accuracy 0.9902727      
Iter 16.Testing Accuracy 0.9745.Training Accuracy 0.99061817     
Iter 17.Testing Accuracy 0.9738.Training Accuracy 0.99105453     
Iter 18.Testing Accuracy 0.9743.Training Accuracy 0.9913091      
Iter 19.Testing Accuracy 0.9747.Training Accuracy 0.9915818      
Iter 20.Testing Accuracy 0.9749.Training Accuracy 0.9917455 
'''# Evaluated on the original training data, accuracy exceeds 99%


# Output (keep_prob = 0.7, i.e. 70% of neurons active during training)
'''
Iter 0.Testing Accuracy 0.8999.Training Accuracy 0.8943091
Iter 1.Testing Accuracy 0.9133.Training Accuracy 0.9114182
Iter 2.Testing Accuracy 0.9248.Training Accuracy 0.921
Iter 3.Testing Accuracy 0.9275.Training Accuracy 0.92603636
Iter 4.Testing Accuracy 0.9299.Training Accuracy 0.9328182
Iter 5.Testing Accuracy 0.9352.Training Accuracy 0.93603635
Iter 6.Testing Accuracy 0.9362.Training Accuracy 0.9398
Iter 7.Testing Accuracy 0.9399.Training Accuracy 0.9414545
Iter 8.Testing Accuracy 0.9408.Training Accuracy 0.9453273
Iter 9.Testing Accuracy 0.9455.Training Accuracy 0.94692725
Iter 10.Testing Accuracy 0.9454.Training Accuracy 0.9490909
Iter 11.Testing Accuracy 0.949.Training Accuracy 0.9498
Iter 12.Testing Accuracy 0.9472.Training Accuracy 0.9533455
Iter 13.Testing Accuracy 0.9483.Training Accuracy 0.9550545
Iter 15.Testing Accuracy 0.9508.Training Accuracy 0.9573454
Iter 16.Testing Accuracy 0.9516.Training Accuracy 0.9579273
Iter 17.Testing Accuracy 0.9544.Training Accuracy 0.95985454
Iter 18.Testing Accuracy 0.9537.Training Accuracy 0.9596182
Iter 19.Testing Accuracy 0.9531.Training Accuracy 0.96236366
Iter 20.Testing Accuracy 0.958.Training Accuracy 0.96352726
'''

ps :
(1). Lowering the second argument of dropout() from 1.0 to 0.7 slows convergence, but raises the ceiling of what the optimization can eventually reach.
(2). Dropout is most effective when training a relatively complex network on a relatively small amount of data.
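
One related convention worth noting (not something the program above does, since it also computes the test accuracy with keep_prob 0.7): dropout is usually switched off at evaluation time by feeding keep_prob = 1.0, roughly like this:

# train with dropout active, evaluate with all neurons active
sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys, keep_prob: 0.7})
acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels, keep_prob: 1.0})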

3 Optimizers

3.1 Comparison of Optimizers

(1). Standard (full-batch) gradient descent:
First compute the accumulated error over all samples, then update the weights based on that total error.
ps : with a large number of samples this takes a long time.
(2). Stochastic gradient descent:
Randomly draw a single sample to compute the error, then update the weights.
ps : this can introduce a lot of noise, and the update direction is not always correct.
(3). Mini-batch gradient descent:
Mini-batch gradient descent is a compromise between the two methods above: pick one batch from the full sample set (for example, with 10000 samples in total, randomly select 100 samples as one batch), compute the total error over that batch, and then update the weights based on it (see the sketch below).
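
A minimal NumPy sketch of the three update schemes (the `grad` function, the data arrays X and Y, and all names are placeholders I introduce purely for illustration):

import numpy as np

# grad(w, xb, yb) is assumed to return the gradient of the loss on the given batch
def full_batch_step(w, X, Y, lr, grad):
    return w - lr * grad(w, X, Y)                      # one update per pass over all samples

def sgd_step(w, X, Y, lr, grad):
    i = np.random.randint(len(X))                      # one randomly chosen sample
    return w - lr * grad(w, X[i:i+1], Y[i:i+1])

def mini_batch_step(w, X, Y, lr, grad, batch_size=100):
    idx = np.random.choice(len(X), batch_size, replace=False)
    return w - lr * grad(w, X[idx], Y[idx])            # compromise: a small random batch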

3.2 Program

ps : comparing tf.train.GradientDescentOptimizer with tf.train.AdamOptimizer

import tensorflow as tf 
from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data",one_hot=True)

# Size of each batch
batch_size = 100
# Number of batches in total
n_batch = mnist.train.num_examples // batch_size # integer division

# Define placeholders
x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])

# Build a simple neural network
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
prediction = tf.nn.softmax(tf.matmul(x,W) + b)

# Define the cost function (quadratic)
loss = tf.reduce_mean(tf.square(y - prediction))
# Use an optimizer to minimize the cost function
# train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)
train_step = tf.train.AdamOptimizer(1e-2).minimize(loss)


# Initialize variables
init = tf.global_variables_initializer()

# Define how accuracy is computed
correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(prediction,1))


# Compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))


with tf.Session() as sess:
    sess.run(init)
    for epoch in range(21):
        for batch in range(n_batch):
            batch_xs,batch_ys = mnist.train.next_batch(batch_size)
          
            sess.run(train_step,feed_dict = {x:batch_xs,y:batch_ys})
        
        # Evaluate accuracy on the test set
        acc = sess.run(accuracy,feed_dict = {x:mnist.test.images,y:mnist.test.labels})
        print("Iter " + str(epoch) + ".Testing Accuracy " + str(acc))



# Output (using tf.train.GradientDescentOptimizer(0.2) as the optimizer)
'''
Iter 0.Testing Accuracy 0.8305
Iter 1.Testing Accuracy 0.8711
Iter 2.Testing Accuracy 0.8808
Iter 3.Testing Accuracy 0.8877
Iter 4.Testing Accuracy 0.8939
Iter 5.Testing Accuracy 0.8962
Iter 6.Testing Accuracy 0.8992
Iter 7.Testing Accuracy 0.9024
Iter 8.Testing Accuracy 0.9041
Iter 9.Testing Accuracy 0.9052
Iter 10.Testing Accuracy 0.9062
Iter 11.Testing Accuracy 0.9063
Iter 12.Testing Accuracy 0.9087
Iter 13.Testing Accuracy 0.909
Iter 14.Testing Accuracy 0.9103
Iter 15.Testing Accuracy 0.9103
Iter 16.Testing Accuracy 0.9117
Iter 17.Testing Accuracy 0.9118
Iter 18.Testing Accuracy 0.913
Iter 19.Testing Accuracy 0.9132
Iter 20.Testing Accuracy 0.9137
'''

# Output (using tf.train.AdamOptimizer(1e-2) as the optimizer)
'''
Iter 0.Testing Accuracy 0.9191
Iter 1.Testing Accuracy 0.922
Iter 2.Testing Accuracy 0.9234
Iter 3.Testing Accuracy 0.9249
Iter 4.Testing Accuracy 0.929
Iter 5.Testing Accuracy 0.9296
Iter 6.Testing Accuracy 0.9297
Iter 7.Testing Accuracy 0.9271
Iter 8.Testing Accuracy 0.9305
Iter 9.Testing Accuracy 0.9265
Iter 10.Testing Accuracy 0.9321
Iter 11.Testing Accuracy 0.9284
Iter 12.Testing Accuracy 0.9292
Iter 13.Testing Accuracy 0.9235
Iter 14.Testing Accuracy 0.927
Iter 15.Testing Accuracy 0.9269
Iter 16.Testing Accuracy 0.928
Iter 17.Testing Accuracy 0.928
Iter 18.Testing Accuracy 0.9285
Iter 19.Testing Accuracy 0.9289
Iter 20.Testing Accuracy 0.9284
'''

ps :
tf.train.AdamOptimizer() is an optimization algorithm with an adaptive learning rate, and Adam differs from stochastic gradient descent. SGD keeps a single learning rate for updating all parameters, and that rate does not change during training, whereas Adam computes first-moment and second-moment estimates of the gradients to give each parameter its own adaptive learning rate, which makes the optimization considerably smarter.
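
A minimal sketch of one Adam update for a single parameter (my own notation; the hyperparameters are the usual defaults, not values taken from the program above):

import numpy as np

def adam_step(w, g, m, v, t, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    # m, v: running first- and second-moment estimates of the gradient g
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)                 # bias correction, t = step count starting at 1
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter effective step size
    return w, m, v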

4 Learning Rate

ps : this builds on the previous program, manually updating (decaying) the learning rate.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data",one_hot=True)

# Size of each batch
batch_size = 100
# Number of batches in total
n_batch = mnist.train.num_examples // batch_size

# Define placeholders
x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])
keep_prob=tf.placeholder(tf.float32)
# Define the learning rate
lr = tf.Variable(0.001, dtype=tf.float32)

# Build a simple neural network
W1 = tf.Variable(tf.truncated_normal([784,500],stddev=0.1))
b1 = tf.Variable(tf.zeros([500])+0.1)
L1 = tf.nn.tanh(tf.matmul(x,W1)+b1)
L1_drop = tf.nn.dropout(L1,keep_prob) 

W2 = tf.Variable(tf.truncated_normal([500,300],stddev=0.1))
b2 = tf.Variable(tf.zeros([300])+0.1)
L2 = tf.nn.tanh(tf.matmul(L1_drop,W2)+b2)
L2_drop = tf.nn.dropout(L2,keep_prob) 

W3 = tf.Variable(tf.truncated_normal([300,10],stddev=0.1))
b3 = tf.Variable(tf.zeros([10])+0.1)
prediction = tf.nn.softmax(tf.matmul(L2_drop,W3)+b3)

# Cross-entropy cost function
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=prediction))
# Train (use AdamOptimizer as the optimizer and minimize the cost function)
train_step = tf.train.AdamOptimizer(lr).minimize(loss)

# Initialize variables
init = tf.global_variables_initializer()

# Store the results in a list of booleans
correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(prediction,1)) # argmax returns the index of the largest value in a 1-D tensor
# Compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(51):
        # Update the learning rate
        sess.run(tf.assign(lr, 0.001 * (0.95 ** epoch)))
        # tf.assign(A, new_value) sets the value of A to new_value
        # The practical effect is to gradually shrink the learning rate, avoiding the case where the network fails to converge because the rate is too large
        for batch in range(n_batch):
            batch_xs,batch_ys =  mnist.train.next_batch(batch_size)
            sess.run(train_step,feed_dict={x:batch_xs,y:batch_ys,keep_prob:1.0})
        
        learning_rate = sess.run(lr)
        acc = sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels,keep_prob:1.0})
        print ("Iter " + str(epoch) + ", Testing Accuracy= " + str(acc) + ", Learning Rate= " + str(learning_rate))


# Output
'''
Iter 0, Testing Accuracy= 0.9496, Learning Rate= 0.001
Iter 1, Testing Accuracy= 0.9631, Learning Rate= 0.00095
Iter 2, Testing Accuracy= 0.9655, Learning Rate= 0.0009025
Iter 3, Testing Accuracy= 0.9706, Learning Rate= 0.000857375
Iter 4, Testing Accuracy= 0.9727, Learning Rate= 0.00081450626
Iter 5, Testing Accuracy= 0.9734, Learning Rate= 0.0007737809
Iter 6, Testing Accuracy= 0.974, Learning Rate= 0.0007350919
Iter 7, Testing Accuracy= 0.9759, Learning Rate= 0.0006983373
Iter 8, Testing Accuracy= 0.9775, Learning Rate= 0.0006634204
Iter 9, Testing Accuracy= 0.9777, Learning Rate= 0.0006302494
Iter 10, Testing Accuracy= 0.9777, Learning Rate= 0.0005987369
Iter 11, Testing Accuracy= 0.978, Learning Rate= 0.0005688001
Iter 12, Testing Accuracy= 0.9765, Learning Rate= 0.0005403601
Iter 13, Testing Accuracy= 0.977, Learning Rate= 0.0005133421
Iter 14, Testing Accuracy= 0.9785, Learning Rate= 0.000487675
Iter 15, Testing Accuracy= 0.9784, Learning Rate= 0.00046329122
Iter 16, Testing Accuracy= 0.9783, Learning Rate= 0.00044012666
Iter 17, Testing Accuracy= 0.981, Learning Rate= 0.00041812033
Iter 18, Testing Accuracy= 0.9785, Learning Rate= 0.00039721432
Iter 19, Testing Accuracy= 0.9786, Learning Rate= 0.0003773536
Iter 20, Testing Accuracy= 0.9804, Learning Rate= 0.00035848594
Iter 21, Testing Accuracy= 0.9797, Learning Rate= 0.00034056162
Iter 22, Testing Accuracy= 0.9804, Learning Rate= 0.00032353355
Iter 23, Testing Accuracy= 0.981, Learning Rate= 0.00030735688
Iter 24, Testing Accuracy= 0.9814, Learning Rate= 0.000291989
Iter 25, Testing Accuracy= 0.9812, Learning Rate= 0.00027738957
Iter 26, Testing Accuracy= 0.9813, Learning Rate= 0.0002635201
Iter 27, Testing Accuracy= 0.9815, Learning Rate= 0.00025034408
Iter 28, Testing Accuracy= 0.9818, Learning Rate= 0.00023782688
Iter 29, Testing Accuracy= 0.982, Learning Rate= 0.00022593554
Iter 30, Testing Accuracy= 0.9819, Learning Rate= 0.00021463877
Iter 31, Testing Accuracy= 0.9819, Learning Rate= 0.00020390682
Iter 32, Testing Accuracy= 0.9813, Learning Rate= 0.00019371149
Iter 33, Testing Accuracy= 0.9819, Learning Rate= 0.0001840259
Iter 34, Testing Accuracy= 0.9803, Learning Rate= 0.00017482461
Iter 35, Testing Accuracy= 0.9815, Learning Rate= 0.00016608338
Iter 36, Testing Accuracy= 0.9814, Learning Rate= 0.00015777921
Iter 37, Testing Accuracy= 0.9808, Learning Rate= 0.00014989026
Iter 38, Testing Accuracy= 0.9818, Learning Rate= 0.00014239574
Iter 39, Testing Accuracy= 0.9818, Learning Rate= 0.00013527596
Iter 40, Testing Accuracy= 0.9814, Learning Rate= 0.00012851215
Iter 41, Testing Accuracy= 0.9809, Learning Rate= 0.00012208655
Iter 42, Testing Accuracy= 0.981, Learning Rate= 0.00011598222
Iter 43, Testing Accuracy= 0.9807, Learning Rate= 0.00011018311  
Iter 44, Testing Accuracy= 0.9808, Learning Rate= 0.000104673956 
Iter 45, Testing Accuracy= 0.9809, Learning Rate= 9.944026e-05   
Iter 46, Testing Accuracy= 0.981, Learning Rate= 9.446825e-05    
Iter 47, Testing Accuracy= 0.981, Learning Rate= 8.974483e-05    
Iter 48, Testing Accuracy= 0.9818, Learning Rate= 8.525759e-05   
Iter 49, Testing Accuracy= 0.9811, Learning Rate= 8.099471e-05   
Iter 50, Testing Accuracy= 0.9813, Learning Rate= 7.6944976e-05
'''

ps :
In practice the model has essentially converged by around epoch 25, while by epoch 50 the learning rate has shrunk to the order of 10^(-5). Judging by the accuracy, this result is also clearly better than the earlier runs where the learning rate was not adjusted.
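
As an alternative to calling tf.assign inside the training loop, the same kind of schedule can be built into the graph with tf.train.exponential_decay; a minimal sketch that mirrors the 0.001 * 0.95**epoch schedule above (this is my own rewrite, not part of the original program):

global_step = tf.Variable(0, trainable=False)
lr = tf.train.exponential_decay(learning_rate=0.001, global_step=global_step,
                                decay_steps=n_batch, decay_rate=0.95, staircase=True)
# passing global_step to minimize() increments it automatically on every training step,
# so the rate drops by a factor of 0.95 once per epoch (every n_batch steps)
train_step = tf.train.AdamOptimizer(lr).minimize(loss, global_step=global_step)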
