Deep Learning Notes 1---MNIST Machine Learning

Recently I have been reading He Zhiyuan's book "21 Projects to Play Deep Learning". I found it good, so I am taking notes on it.

MNIST handwritten digit recognition is a standard introductory tutorial for deep learning. This article has two parts: softmax regression, and classification with a two-layer convolutional network.

Softmax regression

Read MNIST data

# Import input_data from tensorflow.examples.tutorials.mnist, a helper module that TensorFlow 1.x ships for its MNIST tutorials
from tensorflow.examples.tutorials.mnist import input_data
# Read MNIST data from MNIST_data/. The data is downloaded automatically if it does not exist
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Check the size of the training data
print(mnist.train.images.shape)  # (55000, 784)
print(mnist.train.labels.shape)  # (55000, 10)

# Check the size of the validation data
print(mnist.validation.images.shape)  # (5000, 784)
print(mnist.validation.labels.shape)  # (5000, 10)

# View the size of the test data
print(mnist.test.images.shape)  # (10000, 784)
print(mnist.test.labels.shape)  # (10000, 10)

# print out the vector representation of the 0th image
print(mnist.train.images[0, :])

# print out the label of the 0th image
print(mnist.train.labels[0, :])

The original MNIST dataset has 60,000 training images and 10,000 test images. TensorFlow splits the 60,000 into 55,000 training images and 5,000 validation images, so the mnist object has three parts: mnist.train is the training set, mnist.validation is the validation set, and mnist.test is the test set.

Save MNIST data as images

#coding: utf-8
from tensorflow.examples.tutorials.mnist import input_data
import scipy.misc
import os

# Read the MNIST dataset. If it does not exist, it is downloaded automatically.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# We save the original images in the MNIST_data/raw/ folder
# If this folder does not exist, create it
save_dir = 'MNIST_data/raw/'
if not os.path.exists(save_dir):
    os.makedirs(save_dir)

# save the first 20 images
for i in range(20):
    # Please note that mnist.train.images[i, :] represents the i-th image (the serial number starts from 0)
    image_array = mnist.train.images[i, :]
    # The MNIST image in TensorFlow is a 784-dimensional vector, and we restore it to a 28x28-dimensional image.
    image_array = image_array.reshape(28, 28)
    # The format of the saved file is mnist_train_0.jpg, mnist_train_1.jpg, ... ,mnist_train_19.jpg
    filename = save_dir + 'mnist_train_%d.jpg' % i
    # Save image_array as an image
    # First use scipy.misc.toimage to convert it to an image, then call save on the result.
    # Note: scipy.misc.toimage was removed in SciPy 1.2, so this line needs an older SciPy.
    scipy.misc.toimage(image_array, cmin=0.0, cmax=1.0).save(filename)

print('Please check: %s ' % save_dir)

Each image is 28×28 pixels, so a single sample is a 784-dimensional vector.
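
The relationship between the 28×28 image and the 784-dimensional vector can be checked without downloading anything. The following is a minimal sketch that assumes NumPy is available; the array `image` is a hypothetical stand-in for one MNIST image, not real data.

```python
import numpy as np

# Hypothetical stand-in for one MNIST image: a 28x28 array of values in [0, 1].
image = np.arange(28 * 28, dtype=np.float32).reshape(28, 28) / (28 * 28 - 1)

# Flatten row by row into a 784-dimensional vector, the form mnist.train.images uses.
flat = image.reshape(-1)

# Restore the 28x28 layout, as image_array.reshape(28, 28) does in the script above.
restored = flat.reshape(28, 28)

# With row-major flattening, pixel (row r, column c) sits at index r * 28 + c.
r, c = 3, 5
print(flat.shape, restored.shape)
print(flat[r * 28 + c] == image[r, c])
```

Flattening and reshaping only change how the same 784 numbers are indexed; no pixel values are altered.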

One-hot representation of image labels

# coding: utf-8
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
# Read the MNIST dataset. If it does not exist, it is downloaded automatically.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Look at the labels of the first 20 training images
for i in range(20):
    # Get one-hot representation, like (0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
    one_hot_label = mnist.train.labels[i, :]
    # np.argmax recovers the original label directly,
    # because exactly one entry is 1 and all the others are 0
    label = np.argmax(one_hot_label)
    print('mnist_train_%d.jpg label: %d' % (i, label))

The so-called one-hot representation is a "one-bit-effective encoding". What does that mean? In this program the image labels are 0~9, and we use a 10-dimensional vector to represent these 10 categories, with each category occupying its own position. Exactly one position of a one-hot vector is 1, and all the others are 0. For example, 0 is represented as (1,0,0,0,0,0,0,0,0,0), 1 as (0,1,0,0,0,0,0,0,0,0), and 9 as (0,0,0,0,0,0,0,0,0,1). In general, N categories can be represented by N-dimensional vectors.
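
The encoding can also be built by hand. The following is a minimal sketch assuming NumPy; the helper `to_one_hot` is a hypothetical name introduced here for illustration and is not part of TensorFlow or the book's code.

```python
import numpy as np

def to_one_hot(labels, num_classes):
    """Convert integer labels to one-hot rows (hypothetical helper)."""
    labels = np.asarray(labels)
    one_hot = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set exactly one entry per row: column index = label value.
    one_hot[np.arange(labels.size), labels] = 1.0
    return one_hot

labels = np.array([0, 1, 9])
encoded = to_one_hot(labels, 10)

# np.argmax along each row recovers the original integer labels.
decoded = np.argmax(encoded, axis=1)

print(encoded)
print(decoded)
```

Encoding followed by `np.argmax` returns the original labels, which is why the script above can print integer labels from one-hot vectors.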

Softmax regression model

# coding:utf-8
# Import TensorFlow.
# "import tensorflow as tf" is the conventional way to import TensorFlow; it is worth remembering.
import tensorflow as tf
# Import the MNIST tutorial module
from tensorflow.examples.tutorials.mnist import input_data
# As before, read in the MNIST data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Create x, where x is a placeholder representing the image to be recognized
x = tf.placeholder(tf.float32, [None, 784])

# W is a parameter of the Softmax model; it maps a 784-dimensional input to a 10-dimensional output
# In TensorFlow, model parameters are represented by tf.Variable
W = tf.Variable(tf.zeros([784, 10]))
# b is another parameter of the Softmax model, which we generally call "bias".
b = tf.Variable(tf.zeros([10]))

# y = softmax(xW + b), y represents the output of the model
y = tf.nn.softmax(tf.matmul(x, W) + b)

# y_ is the actual image label, again represented as a placeholder.
y_ = tf.placeholder(tf.float32, [None, 10])

# At this point, we have two important Tensors: y and y_.
# y is the output of the model, y_ is the actual image label, don't forget that y_ is the one-hot representation
# Next we will construct the loss based on y and y_

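# Construct the cross-entropy loss from y and y_
# Note: tf.reduce_sum has no axis here, so it sums over the whole batch and the outer tf.reduce_mean acts on a scalar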
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y)))

# With the loss, we can use stochastic gradient descent to optimize the model's parameters (W and b)
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

# Create a Session. The optimization step train_step can only be run in a Session.
sess = tf.InteractiveSession()
# All variables must be initialized (and have memory allocated) before running.
tf.global_variables_initializer().run()
print('start training...')

# Do 2000 steps of gradient descent
for _ in range(2000):
    # Take 100 training samples from mnist.train
    # batch_xs is the image data of shape (100, 784), batch_ys is the actual label of shape (100, 10)
    # batch_xs, batch_ys corresponds to two placeholders x and y_
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Run train_step in Session, pass in the value of the placeholder at runtime
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Compare the predicted labels with the actual labels
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
# Calculate the prediction accuracy; correct_prediction and accuracy are both Tensors
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Run the Tensor in the Session to get its value
# Here we get the accuracy of the final model on the test set
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))  # 0.9185

Softmax regression is a linear multi-class classifier. It is derived from the logistic regression model, which is a two-class model; both models convert the score for each category into a reasonable probability value. For example, suppose a sample can belong to one of three categories and the model gives them the scores (a, b, c). The softmax function converts these scores to (e^a/(e^a+e^b+e^c), e^b/(e^a+e^b+e^c), e^c/(e^a+e^b+e^c)). That is, the probability that the sample belongs to the first category is e^a/(e^a+e^b+e^c), the probability that it belongs to the second category is e^b/(e^a+e^b+e^c), and the probability that it belongs to the third category is e^c/(e^a+e^b+e^c). The three probabilities sum exactly to 1.
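
The formula can be tried on concrete numbers. The following is a minimal sketch assuming NumPy; the scores are made up for illustration. Subtracting the maximum score before exponentiating is a common numerical-stability step that is not in the notes; it cancels in the ratio and does not change the result.

```python
import numpy as np

def softmax(scores):
    """Convert raw scores to probabilities along the last axis."""
    scores = np.asarray(scores, dtype=np.float64)
    # Shift by the max so np.exp cannot overflow; the shift cancels in the ratio.
    shifted = scores - np.max(scores, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Hypothetical scores (a, b, c) for three categories.
scores = np.array([1.0, 2.0, 3.0])
probs = softmax(scores)

print(probs)        # one probability per category
print(probs.sum())  # the probabilities sum to 1 (up to floating-point rounding)
```

The category with the largest score gets the largest probability, so taking the argmax of the scores or of the probabilities gives the same prediction.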
Takeaways: 1) tf.placeholder() creates placeholders, which usually hold sample data and labels and can be thought of as formal parameters that receive their values at run time.
2) tf.Variable() creates variables that store the parameters of the model.
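
The cross-entropy loss built from y and y_ in the listing above can also be reproduced by hand. The following is a minimal sketch assuming NumPy; the two-sample batch is made up for illustration. It shows that summing over the whole batch, as the listing does, differs from the per-example mean only by a factor of the batch size.

```python
import numpy as np

# Hypothetical mini-batch of 2 samples and 3 classes:
# y holds model output probabilities, y_ holds one-hot labels.
y = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
y_ = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])

# As in the listing: sum -y_ * log(y) over every element of the batch.
batch_sum = -np.sum(y_ * np.log(y))

# Alternative: sum per example (axis=1), then average over the batch.
per_example = -np.sum(y_ * np.log(y), axis=1)
batch_mean = np.mean(per_example)

print(batch_sum, batch_mean)
```

Because the labels are one-hot, each example contributes only the negative log of the probability assigned to its true class. Using the batch sum instead of the mean scales the gradient by the batch size, which acts like a larger effective learning rate.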

Two-layer convolutional network classification

# coding: utf-8
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data


def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')


if __name__ == '__main__':
    # read data
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
    # x is the placeholder for the training image, y_ is the placeholder for the label of the training image
    x = tf.placeholder(tf.float32, [None, 784])
    y_ = tf.placeholder(tf.float32, [None, 10])

    # Restore a single image from a 784-dimensional vector to a 28x28 matrix image
    x_image = tf.reshape(x, [-1, 28, 28, 1])

    # first convolutional layer
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    # Second convolutional layer
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    # Fully connected layer, the output is a 1024-dimensional vector
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    # With Dropout, keep_prob is a placeholder, 0.5 for training and 1 for testing
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Convert the 1024-dimensional vector into 10-dimensional, corresponding to 10 categories
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

    # Instead of applying Softmax first and then computing the cross entropy, we compute both in one step with tf.nn.softmax_cross_entropy_with_logits
    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
    # Also define train_step
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

    # define the accuracy of the test
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # Create the Session and initialize the variables
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())

    # train 20000 steps
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        # Report accuracy on the current training batch every 100 steps
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0], y_: batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g" % (i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

    # report the accuracy on the test set after training
    print("test accuracy %g" % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

Four functions are defined. weight_variable returns a variable of the given shape, initialized from a truncated normal distribution with standard deviation 0.1, and bias_variable returns a variable of the given shape whose initial value is 0.1; these two functions are used to create the convolution kernels and biases. conv2d applies a convolution with stride 1 and 'SAME' padding, and max_pool_2x2 applies 2x2 max pooling with stride 2.
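
The 7 * 7 * 64 in the fully connected layer of the listing above follows from how 'SAME' padding and strides change the image size. The following is a minimal sketch in plain Python; the helper name `same_padding_output` is hypothetical, and it relies on the rule that with 'SAME' padding the spatial output size is ceil(input size / stride).

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a 'SAME'-padded conv or pool: ceil(size / stride)."""
    return math.ceil(size / stride)

size = 28          # input images are 28x28
trace = [size]
# Strides of conv1, pool1, conv2, pool2 in the listing above.
for stride in (1, 2, 1, 2):
    size = same_padding_output(size, stride)
    trace.append(size)

# The second convolutional layer has 64 output channels.
flat_features = size * size * 64

print(trace)          # [28, 28, 14, 14, 7]
print(flat_features)  # 3136
```

The convolutions keep the 28x28 size and each pooling layer halves it, so the feature map entering the fully connected layer is 7x7 with 64 channels.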
