Deep Learning Notes 1 --- MNIST Machine Learning

Recently I have been reading He Zhiyuan's book "21 Projects to Play Deep Learning". I found it well worth the time, so I am taking notes as I go.

MNIST handwritten digit recognition is the standard getting-started project for deep learning. This article has two parts: one is softmax regression, and the other is classification with a two-layer convolutional network.

  • softmax regression

  1. Read MNIST data
  1. # Import the input_data module from tensorflow.examples.tutorials.mnist. This is a helper that TensorFlow ships for the MNIST tutorial
    from tensorflow.examples.tutorials.mnist import input_data
    # Read MNIST data from MNIST_data/. If the data does not exist, this call downloads it automatically
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
    
    # Check the size of the training data
    print(mnist.train.images.shape)  # (55000, 784)
    print(mnist.train.labels.shape)  # (55000, 10)
    
    # Check the size of the validation data
    print(mnist.validation.images.shape)  # (5000, 784)
    print(mnist.validation.labels.shape)  # (5000, 10)
    
    # View the size of the test data
    print(mnist.test.images.shape)  # (10000, 784)
    print(mnist.test.labels.shape)  # (10000, 10)
    
    # print out the vector representation of the 0th image
    print(mnist.train.images[0, :])
    
    # print out the label of the 0th image
    print(mnist.train.labels[0, :])

    The original MNIST training set contains 60,000 images, which TensorFlow splits into 55,000 training images and 5,000 validation images. The mnist object is therefore divided into three parts: mnist.train is the training set, mnist.validation is the validation set, and mnist.test is the test set.
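
    As a quick check of those sizes, each split exposes a num_examples property; this is a minimal sketch using the same mnist object as above:

        for name, split in [('train', mnist.train),
                            ('validation', mnist.validation),
                            ('test', mnist.test)]:
            # num_examples reports how many images the split contains
            print('%s: %d examples' % (name, split.num_examples))
        # Expected output: train: 55000, validation: 5000, test: 10000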

  2. Save MNIST data as images

    1. #coding: utf-8
      from tensorflow.examples.tutorials.mnist import input_data
      import scipy.misc
      import os
      
      # Read the MNIST dataset. If it does not exist, it will be downloaded automatically.
      mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
      
      # We save the original image in the MNIST_data/raw/ folder
      # If this folder does not exist, it will be automatically created
      save_dir = 'MNIST_data/raw/'
      if os.path.exists(save_dir) is False:
          os.makedirs(save_dir)
      
      # save the first 20 images
      for i in range(20):
          # Please note that mnist.train.images[i, :] represents the i-th image (the serial number starts from 0)
          image_array = mnist.train.images[i, :]
          # In TensorFlow an MNIST image is stored as a 784-dimensional vector; we reshape it back into a 28x28 image.
          image_array = image_array.reshape(28, 28)
          # The format of the saved file is mnist_train_0.jpg, mnist_train_1.jpg, ... ,mnist_train_19.jpg
          filename = save_dir + 'mnist_train_%d.jpg' % i
          # save image_array as image
          # First use scipy.misc.toimage to convert to an image, and then call save to save it directly.
          scipy.misc.toimage(image_array, cmin=0.0, cmax=1.0).save(filename)
      
      print('Please check: %s ' % save_dir)

    Each image is 28×28 pixels, so a single sample is a 784-dimensional vector.
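
    Note that scipy.misc.toimage has been removed from recent SciPy releases. If it is not available, a roughly equivalent save can be done with Pillow; the sketch below handles only the first image and assumes the same mnist object and the MNIST_data/raw/ folder created above:

        import numpy as np
        from PIL import Image

        image_array = mnist.train.images[0, :].reshape(28, 28)       # 784 -> 28x28, values in [0, 1]
        img = Image.fromarray((image_array * 255).astype(np.uint8))  # scale to 8-bit grayscale
        img.save('MNIST_data/raw/mnist_train_0.png')                 # PNG avoids JPEG compression artifacts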

  3. Image tag one-hot representation

    # coding: utf-8
    from tensorflow.examples.tutorials.mnist import input_data
    import numpy as np
    # Read the mnist dataset. If it does not exist, it will be downloaded automatically.
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
    
    # Look at the labels of the first 20 training images
    for i in range(20):
        # Get one-hot representation, like (0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
        one_hot_label = mnist.train.labels[i, :]
        # Through np.argmax we can directly get the original label
        # Because only 1 bit is 1, the others are 0
        label = np.argmax(one_hot_label)
        print('mnist_train_%d.jpg label: %d' % (i, label))

    The so-called one-hot representation means "one-bit-effective encoding": at any time exactly one bit is 1 and all the others are 0. In this program the image labels are the digits 0-9, but we represent these 10 classes with a 10-dimensional vector in which each class occupies its own position. For example, 0 is one-hot encoded as (1,0,0,0,0,0,0,0,0,0), 1 as (0,1,0,0,0,0,0,0,0,0), and 9 as (0,0,0,0,0,0,0,0,0,1). In general, N categories can be represented by an N-dimensional vector.
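
    A minimal numpy sketch of the round trip between a plain label and its one-hot form (the variable names here are illustrative):

        import numpy as np

        num_classes = 10
        label = 3                              # a plain class index in 0..9
        one_hot = np.eye(num_classes)[label]   # -> [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
        recovered = int(np.argmax(one_hot))    # back to 3, because only one bit is 1
        print(one_hot, recovered)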
  4. softmax regression

  1. # coding:utf-8
    # Import TensorFlow.
    # Writing "import tensorflow as tf" is the conventional way to import it, so remember this line.
    import tensorflow as tf
    # Import the MNIST tutorial module
    from tensorflow.examples.tutorials.mnist import input_data
    # As before, read in the MNIST data
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
    
    # Create x, where x is a placeholder representing the image to be recognized
    x = tf.placeholder(tf.float32, [None, 784])
    
    # W is the parameter of the Softmax model, which converts a 784-dimensional input to a 10-dimensional output
    # In TensorFlow, the parameters of variables are represented by tf.Variable
    W = tf.Variable(tf.zeros([784, 10]))
    # b is another parameter of the Softmax model, which we generally call "bias".
    b = tf.Variable(tf.zeros([10]))
    
    # y=softmax(Wx + b), y represents the output of the model
    y = tf.nn.softmax(tf.matmul(x, W) + b)
    
    # y_ is the actual image label, again represented as a placeholder.
    y_ = tf.placeholder(tf.float32, [None, 10])
    
    # At this point, we have two important Tensors: y and y_.
    # y is the output of the model, y_ is the actual image label, don't forget that y_ is the one-hot representation
    # Next we will construct the loss based on y and y_
    
    # Construct cross entropy loss according to y, y_
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y)))
    
    # With the loss, we can use stochastic gradient descent to optimize the model's parameters (W and b)
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
    
    # Create a Session. The optimization step train_step can only be run in a Session.
    sess = tf.InteractiveSession()
    # All variables must be initialized and memory allocated before running.
    tf.global_variables_initializer().run()
    print('start training...')
    
    # Run 2000 steps of gradient descent
    for _ in range(2000):
        # Take 100 training data in mnist.train
        # batch_xs is the image data of shape (100, 784), batch_ys is the actual label of shape (100, 10)
        # batch_xs, batch_ys corresponds to two placeholders x and y_
        batch_xs, batch_ys = mnist.train.next_batch(100)
        # Run train_step in Session, pass in the value of the placeholder at runtime
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    
    # correct prediction result
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    # Calculate the prediction accuracy, they are all Tensors
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # Run the Tensor in the Session to get its value
    # Here we get the final model's accuracy on the test set
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))  # 0.9185
    
    Softmax is a linear multi-class classifier derived from the logistic regression model; logistic regression is the corresponding two-class model. Both models turn the score of each class into a reasonable probability value. For example, if a sample's scores for three classes are (a, b, c), the softmax function converts them to (e^a/(e^a+e^b+e^c), e^b/(e^a+e^b+e^c), e^c/(e^a+e^b+e^c)). That is, the probability of the first class is e^a/(e^a+e^b+e^c), of the second class e^b/(e^a+e^b+e^c), and of the third class e^c/(e^a+e^b+e^c), and these sum exactly to 1.
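
    A small numpy sketch of the formula above, using three illustrative scores (a, b, c) = (2.0, 1.0, 0.1); the names are mine, not the book's:

        import numpy as np

        scores = np.array([2.0, 1.0, 0.1])      # (a, b, c)
        exp_scores = np.exp(scores)             # (e^a, e^b, e^c)
        probs = exp_scores / exp_scores.sum()   # each term divided by e^a + e^b + e^c
        print(probs, probs.sum())               # three probabilities whose sum is exactly 1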
    Experience: 1) tf.placeholder() creates placeholders, which usually hold the sample data and labels; they can be thought of as formal parameters whose values are fed in at run time.
              2) tf.Variable() creates variables that store the model's parameters (here W and b); a minimal sketch of the distinction follows below.
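
    The sketch below is illustrative (the names x, w, y are not from the book): the placeholder receives its value through feed_dict at run time, while the Variable keeps state between sess.run calls and is what an optimizer updates.

        import tensorflow as tf
        import numpy as np

        x = tf.placeholder(tf.float32, [None, 3])    # formal parameter: value supplied via feed_dict
        w = tf.Variable(tf.zeros([3, 1]))             # model parameter: stored by TensorFlow between runs
        y = tf.matmul(x, w)

        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            print(sess.run(y, feed_dict={x: np.ones((2, 3), dtype=np.float32)}))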

  • Two-layer convolutional network classification
    # coding: utf-8
    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data
    
    
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)
    
    
    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)
    
    
    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    
    
    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1], padding='SAME')
    
    
    if __name__ == '__main__':
        # read data
        mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
        # x is the placeholder for the training image, y_ is the placeholder for the label of the training image
        x = tf.placeholder(tf.float32, [None, 784])
        y_ = tf.placeholder(tf.float32, [None, 10])
    
        # Restore a single image from a 784-dimensional vector to a 28x28 matrix image
        x_image = tf.reshape(x, [-1, 28, 28, 1])
    
        # first convolutional layer
        W_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
        h_pool1 = max_pool_2x2(h_conv1)
    
        # Second convolutional layer
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
        h_pool2 = max_pool_2x2(h_conv2)
    
        # Fully connected layer, the output is a 1024-dimensional vector
        W_fc1 = weight_variable([7 * 7 * 64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        # With Dropout, keep_prob is a placeholder, 0.5 for training and 1 for testing
        keep_prob = tf.placeholder(tf.float32)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    
        # Convert the 1024-dimensional vector into 10-dimensional, corresponding to 10 categories
        W_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])
        y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
    
        # Instead of applying Softmax first and then computing the cross entropy, we use tf.nn.softmax_cross_entropy_with_logits to compute it in one step
        cross_entropy = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
        # Also define train_step
        train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    
        # define the accuracy of the test
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
        # Create Session and variable initialization
        sess = tf.InteractiveSession()
        sess.run(tf.global_variables_initializer())
    
        # train 20000 steps
        for i in range(20000):
            batch = mnist.train.next_batch(50)
            # Report the accuracy on the current training batch every 100 steps
            if i % 100 == 0:
                train_accuracy = accuracy.eval(feed_dict={
                    x: batch[0], y_: batch[1], keep_prob: 1.0})
                print("step %d, training accuracy %g" % (i, train_accuracy))
            train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    
        # report the accuracy on the test set after training
        print("test accuracy %g" % accuracy.eval(feed_dict={
            x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

Four helper functions are defined: weight_variable returns a variable of the given shape, initialized from a truncated normal distribution (stddev 0.1); bias_variable returns a variable of the given shape with every element initialized to 0.1. These two are used to create the convolution kernels and biases. conv2d performs a 2-D convolution with stride 1 and SAME padding, and max_pool_2x2 performs 2x2 max pooling, which halves the spatial size.
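The 7 * 7 * 64 in the fully connected layer comes from the shape arithmetic of these stages: each SAME-padded convolution keeps the 28x28 spatial size, each 2x2 max pooling halves it (28 -> 14 -> 7), and the second convolution outputs 64 channels. A minimal sketch of that arithmetic (not part of the book's code):

    # Trace the spatial size through the two conv + pool stages.
    size = 28
    for _ in range(2):
        # a 5x5 convolution with SAME padding and stride 1 keeps the size;
        # the 2x2 max pooling with stride 2 halves it
        size = size // 2
    channels = 64
    print(size, size, channels, size * size * channels)   # 7 7 64 3136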

