Classic practice: MNIST neural network

MNIST handwritten digit recognition is the classic introductory neural network task, often called the "Hello World" of the deep learning community.

This post reproduces it in Python with the TensorFlow framework and adds detailed notes, which I hope will serve as a useful reference.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets(r"D:\ClassStudy\ImageProcessing\MNIST_DATA", one_hot=True)

batch_size = 100              # batch size of 100; with 55,000 training samples that is 550 batches per epoch
learning_rate = 0.8
learning_rate_decay = 0.999
max_steps = 30000             # maximum number of training steps
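# Note: 30,000 steps of 100 samples each is 3,000,000 samples, i.e. roughly 54.5 passes (epochs)
# over the 55,000 training images.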

training_step = tf.Variable(0, trainable=False)  # counts completed training steps (one batch = one step); marked non-trainable

def hidden_layer(input_tensor, weights1, biases1, weights2, biases2, layer_name):
    '''
    Forward propagation: compute the hidden layer with the ReLU activation function,
    then the output layer (logits).
    '''
    layer1 = tf.nn.relu(tf.matmul(input_tensor, weights1) + biases1)
    return tf.matmul(layer1, weights2) + biases2
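# ReLU is simply relu(z) = max(0, z), applied element-wise; the output layer returns raw logits
# (no softmax here), since the softmax is folded into the loss function further below.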

x = tf.placeholder(tf.float32, [None, 784], name='x-input')
y_ = tf.placeholder(tf.float32, [None, 10], name='y-output')

# hidden-layer weight parameters: a 784 x 500 array, 392,000 parameters in total;
# 500 is an empirical value and can be adjusted in practice
weights1 = tf.Variable(tf.truncated_normal([784, 500], stddev=0.1))
biases1 = tf.Variable(tf.constant(0.1, shape=[500]))

# output-layer weight parameters: a 500 x 10 array, 5,000 parameters in total;
# the 500 rows match the hidden-layer width, and the 10 columns match the ten digit classes 0-9
weights2 = tf.Variable(tf.truncated_normal([500, 10], stddev=0.1))
biases2 = tf.Variable(tf.constant(0.1, shape=[10]))

# forward propagation through the network; y holds a 10-dimensional logit vector for each example
y = hidden_layer(x, weights1, biases1, weights2, biases2, 'y')
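# Shape check: x is [None, 784], the hidden layer is [None, 500], and y is [None, 10];
# tf.argmax(y, 1) would give the predicted digit for each example.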

'''
To improve the final model's performance on the test data, the network is trained with
stochastic gradient descent, and TensorFlow's moving-average mechanism is applied to the
variables; this is often called a moving-average model.
'''
# The moving-average class is created with tf.train.ExponentialMovingAverage(); it takes a
# decay rate that controls how quickly the model updates.
# The mechanism maintains a shadow variable for every tracked variable; the shadow variable is
# initialized to the variable's initial value and is updated by a fixed rule whenever the
# variable changes.
# The decay rate determines how fast the moving-average model updates; it is usually set close
# to 1, and the larger it is, the more stable the model becomes.
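# Per the TF 1.x documentation, each update applies
#     shadow_variable = decay * shadow_variable + (1 - decay) * variable,
# and because a step counter (num_updates) is passed in below, the decay actually used is
#     min(decay, (1 + num_updates) / (10 + num_updates)),
# which keeps the averages responsive early in training.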
averages_class = tf.train.ExponentialMovingAverage(0.99, training_step)
# apply() tells the moving-average class which variables to maintain moving averages for
averages_op = averages_class.apply(tf.trainable_variables())
# The average() method of the moving-average class looks up the shadow variable for the
# variable passed to it.
# Here y is computed a second time, using the moving averages; keep in mind that the moving
# averages are only shadow variables.
average_y = hidden_layer(x, averages_class.average(weights1),
                         averages_class.average(biases1),
                         averages_class.average(weights2),
                         averages_class.average(biases2), 'average_y')

# compute the cross-entropy loss; this function applies when each sample belongs to exactly
# one class, which is exactly the situation in this task
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
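# sparse_softmax_cross_entropy_with_logits applies softmax to the logits and then computes the
# cross entropy -log(softmax(y)[label]) for each example; tf.argmax(y_, 1) converts the one-hot
# labels back to integer class indices, which is the sparse format this function expects.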

# after the cross entropy, compute the L2 regularization of the weights and add it to the
# cross-entropy loss to form the total loss
regularizer = tf.contrib.layers.l2_regularizer(0.0001)
regularization = regularizer(weights1) + regularizer(weights2)
# total loss
loss = tf.reduce_mean(cross_entropy) + regularization
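# As far as I recall, tf.contrib.layers.l2_regularizer(scale) evaluates to
# scale * tf.nn.l2_loss(w) = scale * sum(w ** 2) / 2, so the total loss is the mean cross
# entropy plus a small penalty on the magnitude of the weight matrices (biases are not regularized).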

# with the total loss defined, an optimizer is needed; the simplest choice is stochastic
# gradient descent with an exponentially decaying learning rate, and minimize() specifies
# the objective to be minimized
learning_rate = tf.train.exponential_decay(learning_rate, training_step,
                                           mnist.train.num_examples / batch_size,
                                           learning_rate_decay)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=training_step)
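# tf.train.exponential_decay computes
#     decayed_lr = learning_rate * learning_rate_decay ** (training_step / decay_steps),
# where decay_steps = mnist.train.num_examples / batch_size = 550, so the learning rate decays
# by a factor of 0.999 over each epoch; passing global_step to minimize() also makes the
# optimizer increment training_step automatically after every batch.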

# during training, each step must both update the network parameters through back-propagation
# and update every parameter's moving average; control_dependencies() lets both operations be
# run as one
with tf.control_dependencies([train_step, averages_op]):
    train_op = tf.no_op(name='train')
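# An equivalent, arguably more direct way to bundle the two updates is tf.group:
#     train_op = tf.group(train_step, averages_op)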

# check whether the forward-propagation results of the moving-average model are correct
# equal() compares the two tensors element by element,
# returning True where they are equal and False otherwise
correct_prediction = tf.equal(tf.argmax(average_y, 1), tf.argmax(y_, 1))

# cast(x, dtype, name) converts the boolean results to float32;
# averaging the resulting float32 values gives the model's accuracy on this batch of data
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


'''
With everything above in place, we can create a session and start training.
'''
with tf.Session() as sess:
    # initialize the parameters
    tf.global_variables_initializer().run()
    # prepare the validation data
    validate_feed = {x: mnist.validation.images, y_: mnist.validation.labels}
    # prepare the test data
    test_feed = {x: mnist.test.images, y_: mnist.test.labels}
    # training loop: run for max_steps steps (rounds), one batch per step
    for i in range(max_steps):
        if i % 1000 == 0:
            # evaluate the moving-average model on the validation data;
            # validate_accuracy is multiplied by 100 to print a percentage
            validate_accuracy = sess.run(accuracy, feed_dict=validate_feed)
            print('After %d training step(s), validation accuracy '
                  'using average model is %g%%' % (i, validate_accuracy * 100))
        # train.next_batch() reads batch_size examples from the training data as one training batch
        xs, ys = mnist.train.next_batch(batch_size=100)
        sess.run(train_op, feed_dict={x: xs, y_: ys})
    # finally evaluate accuracy on the test set; again multiplied by 100 to print a percentage
    test_accuracy = sess.run(accuracy, feed_dict=test_feed)
    print('After %d training step(s), test accuracy using average '
          'model is %g%%' % (max_steps, test_accuracy * 100))

Output:

After 0 training step(s), validation accuracy using average model is 7.4%
After 1000 training step(s), validation accuracy using average model is 97.82%
After 2000 training step(s), validation accuracy using average model is 98.1%
After 3000 training step(s), validation accuracy using average model is 98.36%
After 4000 training step(s), validation accuracy using average model is 98.38%
After 5000 training step(s), validation accuracy using average model is 98.48%
After 6000 training step(s), validation accuracy using average model is 98.36%
After 7000 training step(s), validation accuracy using average model is 98.5%
After 8000 training step(s), validation accuracy using average model is 98.4%
After 9000 training step(s), validation accuracy using average model is 98.52%
After 10000 training step(s), validation accuracy using average model is 98.5%
After 11000 training step(s), validation accuracy using average model is 98.6%
After 12000 training step(s), validation accuracy using average model is 98.48%
After 13000 training step(s), validation accuracy using average model is 98.56%
After 14000 training step(s), validation accuracy using average model is 98.54%
After 15000 training step(s), validation accuracy using average model is 98.6%
After 16000 training step(s), validation accuracy using average model is 98.6%
After 17000 training step(s), validation accuracy using average model is 98.62%
After 18000 training step(s), validation accuracy using average model is 98.56%
After 19000 training step(s), validation accuracy using average model is 98.66%
After 20000 training step(s), validation accuracy using average model is 98.6%
After 21000 training step(s), validation accuracy using average model is 98.7%
After 22000 training step(s), validation accuracy using average model is 98.6%
After 23000 training step(s), validation accuracy using average model is 98.54%
After 24000 training step(s), validation accuracy using average model is 98.6%
After 25000 training step(s), validation accuracy using average model is 98.64%
After 26000 training step(s), validation accuracy using average model is 98.64%
After 27000 training step(s), validation accuracy using average model is 98.6%
After 28000 training step(s), validation accuracy using average model is 98.56%
After 29000 training step(s), validation accuracy using average model is 98.52%
After 30000 training step(s), test accuracy using average model is 98.4%

  
