① Softmax regression: MNIST handwritten digit recognition

Softmax is used very widely in machine learning; it is simple to compute and remarkably effective.

Suppose there are three numbers a, b, and c, with a > b > c.
If you take max, the result is simply a.
If you take softmax, then softmax(a) > softmax(b) > softmax(c); softmax assigns a probability to every option rather than picking just one.
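For example, here is a minimal numeric sketch in plain Python (the scores 3.0, 1.0, and 0.2 are made up purely for illustration):

import math

# Hypothetical scores with a > b > c
scores = {"a": 3.0, "b": 1.0, "c": 0.2}

# max picks only the largest option
print(max(scores, key=scores.get))   # 'a'

# softmax turns every score into a probability, and the probabilities sum to 1
total = sum(math.exp(v) for v in scores.values())
probs = {k: math.exp(v) / total for k, v in scores.items()}
print(probs)   # roughly {'a': 0.84, 'b': 0.11, 'c': 0.05}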
MNIST handwritten digit recognition is a classic application of the softmax regression model. Softmax assigns probabilities to the different classes, and even when we later train more refined models, the final step still uses softmax to produce those probabilities.
Below we implement it with TensorFlow.
import tensorflow as tf

#Get the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("F:/MNIST/data/", one_hot=True)
#Use the more convenient InteractiveSession class.
# It lets you build code more flexibly: you can insert operations into the computational graph even while a graph is running.
# Without InteractiveSession, you must build the entire computational graph before starting the session, and only then launch it.
sess = tf.InteractiveSession()

#x and y_ are not specific values; they are placeholders that you feed with concrete values when TensorFlow runs a computation.
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
#A Variable is a value in the TensorFlow computation graph that can be read and modified during computation. In machine learning applications, model parameters are usually stored as Variables.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
#Calculate the softmax probability value of each category
y = tf.nn.softmax(tf.matmul(x,W) + b)
# The loss function is the cross-entropy between the target class and the predicted class.
#tf.reduce_sum sums the cross-entropy of every image in the minibatch, so the value we compute covers the entire minibatch (a short numeric sketch follows this code).
cross_entropy = -tf.reduce_sum(y_*tf.log(y))

# Initialize all variables before training
init = tf.global_variables_initializer()
sess.run(init)
#Gradient descent minimizes the cross-entropy with a learning rate of 0.01.
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
for i in range(2000):
  batch = mnist.train.next_batch(50)
  #feed_dict can replace any tensor in the computation graph, not just placeholders
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})
#tf.argmax returns the index of the largest entry of a tensor along a given dimension.
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
#Cast the booleans to floats (1.0 for correct, 0.0 for incorrect), then take the mean to get the accuracy.
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
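To see why this cross-entropy is a reasonable loss, here is a minimal sketch in plain NumPy (the one-hot label and the two predictions are made up for illustration) of how -sum(y_ * log(y)) behaves:

import numpy as np

y_true = np.array([0., 0., 1.])             # one-hot label: class 2
y_good = np.array([0.05, 0.05, 0.90])       # confident, correct prediction
y_bad  = np.array([0.70, 0.20, 0.10])       # confident, wrong prediction

print(-np.sum(y_true * np.log(y_good)))     # ~0.105, small loss
print(-np.sum(y_true * np.log(y_bad)))      # ~2.303, large loss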

The final accuracy on the test set is about 0.92.
Here are some explanations of the above code from the official TensorFlow documentation:

① TensorFlow relies on an efficient C++ backend for its computation, and the connection to this backend is called a session. Here we use the more convenient InteractiveSession class: it lets us build code more flexibly, since we can insert operations into the computational graph even while a graph is running.
Without InteractiveSession, you need to build the entire computational graph before starting the Session, and only then launch it.
import tensorflow as tf
sess = tf.InteractiveSession()
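For contrast, here is a minimal sketch of the non-interactive pattern (the placeholder and variable shapes are simply repeated from the model above; the commented-out run call is illustrative):

import tensorflow as tf

# Build the entire graph first...
x = tf.placeholder("float", shape=[None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

# ...then launch it inside a Session.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # sess.run(y, feed_dict={x: some_batch_of_images})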

② We multiply the flattened image x by the weight matrix W, add the bias b, and then compute the softmax probability of each class:
y = tf.nn.softmax(tf.matmul(x,W) + b)
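As a quick shape check (a plain NumPy sketch; the batch size of 50 is only for illustration), tf.matmul(x, W) maps a [batch, 784] input to [batch, 10] scores, b broadcasts across the batch, and softmax then normalizes each row into a probability distribution over the 10 digits:

import numpy as np

x_batch = np.zeros((50, 784))     # 50 flattened 28x28 images
W_np = np.zeros((784, 10))
b_np = np.zeros(10)

logits = x_batch @ W_np + b_np    # shape (50, 10): one score per class per image
print(logits.shape)               # (50, 10)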
