Deep learning with grandpa 5: Simple use of a convolutional network for MNIST digit recognition

1. Foreword

Previously we used TensorFlow and a fully connected network to predict temperatures, debugging and modifying the online examples along the way until the predictions looked reasonable. In this article we move up to a CNN (convolutional network). Predicting temperature with a CNN would be overkill, though, so this article tackles handwritten digit recognition instead.

Handwritten digit recognition is a very classic classification problem and a must-do for beginners. Its barrier to entry is much lower than cat-vs-dog recognition (cat and dog pictures are large and demand far more computing resources).

2. Data preparation

Since handwritten digit recognition is such a classic task, TensorFlow ships with the MNIST data set, so we don't need to hunt for training data ourselves.

The following simple code downloads the MNIST data set automatically. You can see that the data set consists of 28x28-pixel images: the training set has 60,000 images, the test set has 10,000, and each has a matching number of labels. Here x is the image and y is the digit it depicts.
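The post's original snippet isn't reproduced in this scrape, so here is a minimal sketch of the standard way to load it with tf.keras (the variable names are my own):

```python
import tensorflow as tf

# Downloads MNIST on first run, then loads it from the local cache.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

print(x_train.shape)  # (60000, 28, 28): 60,000 training images of 28x28 pixels
print(x_test.shape)   # (10000, 28, 28): 10,000 test images
print(y_train.shape)  # (60000,): one label per training image
```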

We output the x data as pictures, and convert the y data to one-hot format before using it. One-hot encoding is essential for classification tasks; without it the network has a hard time converging. If you are interested, you can try it yourself.
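With Keras the one-hot conversion is a single call; a sketch assuming the labels loaded above:

```python
from tensorflow.keras.utils import to_categorical

# Label 5 becomes [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]: one "bit" per digit class.
y_train_onehot = to_categorical(y_train, num_classes=10)
y_test_onehot = to_categorical(y_test, num_classes=10)
```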

If we look at the numerical output of an image, we find that the picture of a 5 is really just a matrix of numbers: positions with value 0 are pure black, positions with value 255 are pure white, and values such as 10, 20, or 30 are shades of gray in between.
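To see this yourself, you can plot an image and print its raw matrix; a sketch using matplotlib and the x_train from the loading snippet (the post's own plotting code isn't shown):

```python
import matplotlib.pyplot as plt

# The first training image happens to be a handwritten 5.
plt.imshow(x_train[0], cmap="gray")
plt.show()

# The same image as its underlying 28x28 matrix of 0-255 grayscale values.
print(x_train[0])
```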

3. Building the network

The code below builds a simple CNN, without even using an activation function.
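The original code block was lost in this scrape, so the following is a minimal sketch of such a network; the layer sizes (32 filters, 3x3 kernels, 2x2 pooling) are assumptions of mine, not necessarily the author's values:

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    # Convolution: 32 kernels of size 3x3; "same" padding keeps the 28x28 size.
    layers.Conv2D(filters=32, kernel_size=(3, 3), padding="same",
                  input_shape=(28, 28, 1)),
    # Pooling halves the spatial size to 14x14.
    layers.MaxPooling2D(pool_size=(2, 2)),
    # Flatten to one dimension, then a fully connected layer with 10 outputs,
    # one per digit class. Note: no activation function anywhere.
    layers.Flatten(),
    layers.Dense(10),
])
model.summary()
```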

Compared with the fully connected layer:

1) The convolution layer is layers.Conv2D, whereas the fully connected layer is layers.Dense.

2) The parameters are quite different; a fully connected layer does not have nearly as many to set. We will introduce them in detail later. The main Conv2D parameters are:

  1. filters: the number of convolution kernels
  2. kernel_size: the size of each kernel
  3. padding: how the image edges are padded (see the sketch after this list)
  4. input_shape: the expected input dimensions

3) There is a pooling layer.

4) A fully connected layer is still needed at the end to produce the one-dimensional result.

5) The input is the two-dimensional image array itself, not a flattened one-dimensional array.
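As promised in the list, the effect of the padding parameter can be checked directly; a small sketch, independent of the model:

```python
import tensorflow as tf
from tensorflow.keras import layers

# "same" pads the edges so the output stays 28x28;
# "valid" applies no padding, so a 3x3 kernel shrinks the output to 26x26.
same = layers.Conv2D(32, (3, 3), padding="same")
valid = layers.Conv2D(32, (3, 3), padding="valid")
print(same(tf.zeros((1, 28, 28, 1))).shape)   # (1, 28, 28, 32)
print(valid(tf.zeros((1, 28, 28, 1))).shape)  # (1, 26, 26, 32)
```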

4. Training

The training parameters are no different from before. After training for 100 epochs, convergence is quite good and there is no obvious over-fitting.
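The author says the setup matches the temperature article, whose details aren't shown here; the sketch below assumes adam with an MSE loss against the one-hot labels (consistent with the raw, un-softmaxed scores in the next section), so treat those choices as guesses:

```python
# Add the channel dimension Conv2D expects and scale pixels to the 0-1 range.
x_train_cnn = x_train.reshape(-1, 28, 28, 1) / 255.0
x_test_cnn = x_test.reshape(-1, 28, 28, 1) / 255.0

model.compile(optimizer="adam", loss="mse",
              metrics=["categorical_accuracy"])

# 100 epochs, with the test set watched for signs of over-fitting.
history = model.fit(x_train_cnn, y_train_onehot, epochs=100,
                    validation_data=(x_test_cnn, y_test_onehot))
```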

5. Prediction results

Here I output the network's recognition of the first five numbers. Since y is in one-hot format, the output gives one value per "bit" (one per digit class), and each value describes how strongly the network rates the image as that digit.
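A sketch of producing those outputs, reusing the names from the earlier snippets:

```python
import numpy as np

# Raw output scores for the first five test images: one row of 10 values each.
scores = model.predict(x_test_cnn[:5])
print(scores)

# The predicted digit is the index of the highest score in each row.
# (The first five MNIST test labels are 7, 2, 1, 0, 4.)
print(np.argmax(scores, axis=1))
```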

Take the first number as an example:

[ 0.04167002  0.05634148 -0.03675765 -0.06590495  0.06823181 -0.0774302
  0.08698683  0.93804836 -0.06439071  0.16399711]

The meaning is that the score for the digit 0 is 0.04167002, for 1 it is 0.05634148, and for 7 it is 0.93804836. Since the network has no softmax, these are raw scores rather than true probabilities, which is why some are negative; we simply take the position with the highest score, which is 7.

6. Conclusion

Since we already laid the groundwork in the earlier temperature-prediction articles, this post only walks through the code briefly; we will go into more detail later.
