Optimizers:
tf.train.GradientDescentOptimizer
tf.train.AdadeltaOptimizer
tf.train.AdagradOptimizer
tf.train.AdagradDAOptimizer
tf.train.MomentumOptimizer
tf.train.AdamOptimizer
tf.train.FtrlOptimizer
tf.train.ProximalGradientDescentOptimizer
tf.train.ProximalAdagradOptimizer
tf.train.RMSPropOptimizer
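All of these classes share the same interface: construct the optimizer with its hyperparameters, then call minimize(loss) to get a training op. A minimal sketch on a toy loss (the learning rates here are illustrative, not tuned):

import tensorflow as tf

# toy scalar loss so the sketch is self-contained
w = tf.Variable(3.0)
loss = tf.square(w - 1.0)

# the optimizers are interchangeable behind the same construct-then-minimize interface
sgd_step = tf.train.GradientDescentOptimizer(learning_rate=0.2).minimize(loss)
momentum_step = tf.train.MomentumOptimizer(learning_rate=0.2, momentum=0.9).minimize(loss)
adam_step = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(adam_step)  # swap in sgd_step or momentum_step to compare
    print(sess.run(loss))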
Comparison of various optimizers:
Standard gradient descent:
Standard gradient descent first computes the summed error over all of the samples, then updates the weights once based on that total error.
Stochastic gradient descent:
Stochastic gradient descent randomly draws a single sample, computes its error, and updates the weights immediately.
Mini-batch gradient descent:
Mini-batch gradient descent is the compromise between the two: select a batch from the total samples (for example, out of 10,000 samples, randomly draw 100 as one batch), compute the total error over that batch, and update the weights based on it. The three update rules are sketched below.
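To make the difference concrete, here is a minimal NumPy sketch of one update step under each strategy; the data, linear model, and learning rate are invented purely for illustration:

import numpy as np

# hypothetical data: 10,000 samples of a linear regression problem
X = np.random.randn(10000, 784)
y_true = np.random.randn(10000)
w = np.zeros(784)
lr = 0.01

def grad(X_sub, y_sub, w):
    # gradient of the mean squared error over the given subset of samples
    return 2.0 * X_sub.T @ (X_sub @ w - y_sub) / len(y_sub)

# standard (full-batch) gradient descent: all 10,000 samples per update
w -= lr * grad(X, y_true, w)

# stochastic gradient descent: one randomly drawn sample per update
i = np.random.randint(len(y_true))
w -= lr * grad(X[i:i+1], y_true[i:i+1], w)

# mini-batch gradient descent: e.g. 100 randomly drawn samples per update
idx = np.random.choice(len(y_true), 100, replace=False)
w -= lr * grad(X[idx], y_true[idx], w)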
Code:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# load the dataset into MNISt_data/ under the current path
mnist = input_data.read_data_sets("MNISt_data", one_hot=True)
Output:
Extracting MNISt_data/train-images-idx3-ubyte.gz
Extracting MNISt_data/train-labels-idx1-ubyte.gz
Extracting MNISt_data/t10k-images-idx3-ubyte.gz
Extracting MNISt_data/t10k-labels-idx1-ubyte.gz
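Before training, it can help to confirm what the loader returned. With the default split, this reader yields 55,000 training and 10,000 test examples, each image flattened to 784 floats with one-hot labels:

print(mnist.train.num_examples)   # 55000
print(mnist.test.num_examples)    # 10000
print(mnist.train.images.shape)   # (55000, 784)
print(mnist.train.labels.shape)   # (55000, 10)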
Code:
# size of each batch (samples are fed in as a matrix)
batch_size = 100
# how many batches the training set contains
n_batch = mnist.train.num_examples // batch_size

# define two placeholders (28 x 28 = 784 pixels per image)
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])

# create a simple neural network:
# 784 inputs, no hidden layer, an output layer of 10 neurons
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([1, 10]))
logits = tf.matmul(x, W) + b
prediction = tf.nn.softmax(logits)

# cross-entropy loss; note that softmax_cross_entropy_with_logits
# expects the raw logits, not the softmax output
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))

# choose an optimizer
#train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)
train_step = tf.train.AdamOptimizer(1e-2).minimize(loss)

# initialize variables
init = tf.global_variables_initializer()

# the result is stored in a boolean list:
# argmax returns the index of the largest value in a one-dimensional tensor,
# so each entry is True when the predicted class matches the label
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
# accuracy: cast the booleans to floats and take the mean
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(init)
    # 21 epochs in total
    for epoch in range(21):
        # n_batch batches per epoch
        for batch in range(n_batch):
            # fetch one batch
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})
        # test accuracy after each epoch
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print("Iter" + str(epoch) + ", Testing Accuracy" + str(acc))
Output:
Iter0, Testing Accuracy0.9199
Iter1, Testing Accuracy0.9229
Iter2, Testing Accuracy0.9284
Iter3, Testing Accuracy0.9293
Iter4, Testing Accuracy0.9263
Iter5, Testing Accuracy0.9299
Iter6, Testing Accuracy0.9311
Iter7, Testing Accuracy0.9306
Iter8, Testing Accuracy0.9303
Iter9, Testing Accuracy0.9303
Iter10, Testing Accuracy0.9303
Iter11, Testing Accuracy0.9291
Iter12, Testing Accuracy0.9325
Iter13, Testing Accuracy0.9316
Iter14, Testing Accuracy0.9338
Iter15, Testing Accuracy0.9282
Iter16, Testing Accuracy0.9306
Iter17, Testing Accuracy0.9331
Iter18, Testing Accuracy0.9315
Iter19, Testing Accuracy0.9276
Iter20, Testing Accuracy0.9323
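Since the section is nominally a comparison, the same graph can also be retrained once per optimizer and the final test accuracies printed side by side. A sketch, reusing x, y, loss, accuracy, n_batch, and batch_size from the code above; the learning rates are illustrative, and each run re-initializes the weights:

optimizers = {
    "SGD": tf.train.GradientDescentOptimizer(0.2),
    "Momentum": tf.train.MomentumOptimizer(0.2, momentum=0.9),
    "Adam": tf.train.AdamOptimizer(1e-2),
}
for name, opt in optimizers.items():
    # each minimize() call adds a new train op (and any slot variables) to the graph
    train_step = opt.minimize(loss)
    # create the initializer after minimize() so the slot variables are covered
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)  # fresh weights for every optimizer
        for epoch in range(21):
            for _ in range(n_batch):
                batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print(name + ": Testing Accuracy " + str(acc))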