TensorFlow study notes: debugging the MNIST handwriting training set, fixing the accuracy stuck at 0.1, and raising the accuracy above 98%

Study notes: why the accuracy falls to 0.1 after a hidden layer is added to the MNIST handwriting network, and how to fix it

Improve accuracy: add a hidden layer

The first approach is to add hidden layers, which increases the non-linearity of the model. I added an intermediate layer of 100 neurons here. The training goal is to push the accuracy above 95%, which is not a demanding target, and it turns out that adding one layer is enough.

# Build the network: 784-100-10
w_L1=tf.Variable(tf.zeros([784,100]))
b_L1=tf.Variable(tf.zeros([1,100]))
wx_plus_b_L1=tf.matmul(x,w_L1)+b_L1
l1=tf.nn.relu(wx_plus_b_L1)


# Output layer
w_L2=tf.Variable(tf.zeros([100,10]))
b_L2=tf.Variable(tf.zeros([10]))
prediction=tf.nn.softmax(tf.matmul(l1,w_L2)+b_L2)

After adding the layer this way, however, the accuracy stays at 0.1135 no matter how the activation function or the number of training epochs is adjusted. The cause is the initialization of the first layer's weights: with every weight set to zero, all 100 hidden neurons compute the same output and receive the same gradient, so the hidden layer never learns anything useful. Changing the weights to random values drawn from a normal distribution solves the problem.

Change the weights to a truncated normal distribution:

w_L1=tf.Variable(tf.truncated_normal([784,100],stddev=0.1))
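
For reference, here is a minimal sketch of the corrected initialization for both layers. The notes only show the change for the first layer; initializing the second layer's weights the same way and giving the biases a small positive offset is my own assumption, and x is the 784-dimensional input placeholder from the full script.

# Small random weights break the symmetry of the all-zero initialization
w_L1 = tf.Variable(tf.truncated_normal([784, 100], stddev=0.1))
b_L1 = tf.Variable(tf.zeros([1, 100]) + 0.1)   # small positive bias works well with ReLU
l1 = tf.nn.relu(tf.matmul(x, w_L1) + b_L1)

w_L2 = tf.Variable(tf.truncated_normal([100, 10], stddev=0.1))
b_L2 = tf.Variable(tf.zeros([10]) + 0.1)
prediction = tf.nn.softmax(tf.matmul(l1, w_L2) + b_L2)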

Accuracy after 20 epochs of training:
Iter 20, Test Accuracy 0.9314
Accuracy after 30 epochs of training:
Iter 30, Test Accuracy 0.9399
Continuing to increase the number of epochs should bring the accuracy to about 95%.

Improve accuracy: reduce the batch size of each training step

If batch_size is reduced from 100 to 50, the accuracy increases noticeably. Accuracy after 30 epochs with batch_size=50:
Iter 30, Test Accuracy 0.9567

What if batch_size is increased instead?
batch_size=200
Iter 30, Test Accuracy 0.9236
The accuracy drops, so in this range the smaller the batch_size, the higher the accuracy. This is probably related to the number of optimization steps: the smaller each batch is, the more iterations it takes to loop over all the training data, so the weights are updated more often and the optimization goes further.
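
To make that concrete, here is how the number of batches per epoch is usually derived from batch_size in the standard MNIST tutorial setup; the names batch_size and n_batch are assumed to be defined this way in the full script.

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

batch_size = 50
n_batch = mnist.train.num_examples // batch_size
# With 55000 training images: batch_size=100 -> 550 updates per epoch,
# batch_size=50 -> 1100 updates per epoch, batch_size=200 -> 275 updates per epoch.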

Improve accuracy: use cross entropy

When the output activation is the sigmoid (or softmax) function, using cross entropy as the loss usually gives a better training result than a quadratic cost. The cross-entropy function is used in the loss like this:

loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=prediction))

Note what is passed to the logits argument here. The prediction value is:

prediction=tf.nn.softmax(tf.matmul(l1,w_L2)+b_L2)

If this softmax output is passed in as the logits, the accuracy after 30 epochs of training is:

Iter 30, Test Accuracy 0.9642

But what is the effect of skipping the softmax and passing in the raw linear output directly?

prediction=tf.matmul(l1,w_L2)+b_L2

Iter 0, Test Accuracy 0.9336

The accuracy is already about 0.93 after the first epoch. After 10 epochs of training:

Iter 9, Test Accuracy 0.9747

The accuracy jumps straight to about 0.97 and stays around 0.97 for the rest of training.

Therefore, for cross entropy, the best results come from passing in the raw logits without applying softmax first. The usual explanation is that the cross-entropy op already computes softmax on the logits internally, so applying softmax yourself means it is computed twice. Given how much faster training converges when the raw output of the hidden-to-output layer is passed in without softmax, that explanation seems reliable.
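
Putting that together, here is a minimal sketch of the recommended arrangement; variable names follow the snippets above, and the separate probs tensor is my own addition for cases where actual probabilities are needed.

# Keep the network output as raw logits
logits = tf.matmul(l1, w_L2) + b_L2

# softmax_cross_entropy_with_logits applies softmax to the logits internally,
# so the raw logits are passed straight in
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))

# Apply softmax only where probabilities are actually needed; for accuracy,
# argmax over the raw logits picks the same class
probs = tf.nn.softmax(logits)
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(logits, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))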

Change optimizer and learning rate

First, plain stochastic gradient descent converges relatively slowly: it takes many iterations to reach the lowest point, and on a saddle-shaped surface it cannot escape the flat region around the saddle point. Several optimizers are worth recommending here: Adadelta, Adagrad, and NAG. Adadelta and Adagrad converge the fastest, and NAG, which is momentum with a look-ahead correction, is also acceptably fast.
All of them are easy to call in TensorFlow:

# Pick one of the following:
train_step=tf.train.AdadeltaOptimizer(1e-3).minimize(loss)
train_step=tf.train.AdagradOptimizer(1e-3).minimize(loss)
train_step=tf.train.AdamOptimizer(1e-3).minimize(loss)
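
Note that the third line above is actually Adam rather than NAG. For the NAG optimizer mentioned in the text, a minimal sketch in TensorFlow 1.x uses MomentumOptimizer with use_nesterov=True; the 0.9 momentum value is a typical choice, not taken from the original notes.

# NAG = momentum with Nesterov look-ahead
train_step = tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9,
                                        use_nesterov=True).minimize(loss)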

Using AdamOptimizer with an initial learning rate of 0.001, and multiplying the learning rate by 0.95 after each epoch, about 30 epochs of training bring the accuracy above 98%.

lr=tf.Variable(0.001,dtype=tf.float32)

Second half of the code:

train_step=tf.train.AdamOptimizer(lr).minimize(loss)

# Initialize variables
init=tf.global_variables_initializer()

# Compute the accuracy
with tf.name_scope('accuracy'):
    with tf.name_scope('correct_prediction'):
        correct_prediction=tf.equal(tf.argmax(y,1),tf.argmax(prediction,1))
    with tf.name_scope('accuracy'):
        accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
        tf.summary.scalar('accuracy',accuracy)

# Merge all summaries
merged=tf.summary.merge_all()

# Training
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(51):
        # Decay the learning rate: multiply by 0.95 after every epoch
        sess.run(tf.assign(lr,0.001*0.95**epoch))
        for batch in range(n_batch):
            batch_xs,batch_ys=mnist.train.next_batch(batch_size)
            sess.run(train_step,feed_dict={x:batch_xs,y:batch_ys})
        learning_rate=sess.run(lr)
        test_acc=sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels})
        print('Iter '+str(epoch)+', Test Accuracy '+str(test_acc)+', Learning rate='+str(learning_rate))
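
As an aside, the same per-epoch decay schedule can also be expressed with TensorFlow's built-in tf.train.exponential_decay instead of the manual tf.assign above; this is an alternative sketch, not what the notes above actually use.

# Alternative: let TensorFlow decay the learning rate once per epoch
global_step = tf.Variable(0, trainable=False)
lr = tf.train.exponential_decay(0.001, global_step,
                                decay_steps=n_batch, decay_rate=0.95,
                                staircase=True)
train_step = tf.train.AdamOptimizer(lr).minimize(loss, global_step=global_step)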

Summary

Increasing the number of neurons in a single layer (for example, to 1000), or adding an intermediate layer, improves the accuracy. But to push it above 98% (note that the initialization of the weights and biases has to be changed; a truncated normal distribution is used here), the loss function and the optimizer also have to be adjusted. Switching the loss function to cross entropy improves the accuracy, and finally, using a better optimizer together with an adaptive learning rate (one that keeps shrinking in the later iterations) lets the loss descend as close as possible to the lowest point without overshooting the optimum and oscillating.

Source: blog.csdn.net/hu_hao/article/details/95535859