Task and Model Analysis
In Convolutional Neural Network (CNN) Fundamentals, we learned about the problems CNNs address and how they work. In this section, we will build a CNN model to recognize MNIST handwritten digits. We use the following strategy to build the CNN model:
- The input shape is 28 x 28 x 1, and the convolution kernel size is 3 x 3 x 1
  - Note that the spatial size of the kernel can be changed, but its number of channels cannot: it must equal the number of input channels
- Use 10 convolution kernels
- After the input image passes through the convolutional layer, apply a pooling layer:
  - The output image size is halved
- Flatten the output obtained after pooling
- Connect the flattened layer to a hidden layer with 512 units
- Finally, connect the hidden layer to the output layer, which has 10 units, one per class (the digits 0-9)
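The layer sizes implied by this strategy can be checked with a little arithmetic. The sketch below is a minimal illustration, assuming a stride of 1 and no padding for the convolution and a 2 x 2 pooling window; the helper names `conv_output_size` and `pool_output_size` are hypothetical and are not part of Keras.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial size after a convolution (assumes no dilation)."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def pool_output_size(input_size, pool_size):
    """Spatial size after non-overlapping pooling (stride equals pool size)."""
    return input_size // pool_size


num_kernels = 10
conv_size = conv_output_size(28, 3)             # width/height after the 3 x 3 convolution
pooled_size = pool_output_size(conv_size, 2)    # width/height after 2 x 2 pooling
flat_units = pooled_size * pooled_size * num_kernels  # length of the flattened vector

print(conv_size, pooled_size, flat_units)
```

Under these assumptions the convolution reduces each side from 28 to 26, pooling halves it to 13, and flattening 13 x 13 x 10 gives 1690 values.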
After building the model, we generate a mean image of the digit 1, shift it by 1 pixel, and test the performance of the CNN model on it; in an earlier section, we saw that a fully connected neural network could not correctly predict the class of this mean image.
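To make "mean image" and "one-pixel shift" concrete before using real data, here is a small NumPy sketch on made-up 4 x 4 arrays. The toy `images` array is purely hypothetical, and the shift fills the vacated right-hand column with zeros, which is an assumption of this sketch rather than something the article specifies.

```python
import numpy as np

# Hypothetical toy data: three 4 x 4 "images" that share the same label
images = np.array([
    [[0, 1, 0, 0]] * 4,
    [[0, 1, 1, 0]] * 4,
    [[0, 0, 1, 0]] * 4,
], dtype='float32')

# The mean image is the pixel-wise average over all samples
mean_image = images.mean(axis=0)

# Shift one pixel to the left; the rightmost column becomes zero
shifted = np.zeros_like(mean_image)
shifted[:, :-1] = mean_image[:, 1:]

print(mean_image[0])
print(shifted[0])
```

Every pixel keeps its value but moves one column to the left, which is exactly the kind of change a fully connected network is sensitive to.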
CNN model construction and training
Next, we use Keras to implement the CNN architecture defined above and learn how to apply a CNN model to the MNIST data.
- Load and preprocess data:
from keras.layers import Dense, Conv2D, MaxPool2D, Flatten
from keras.models import Sequential
from keras.datasets import mnist
from keras.utils import to_categorical
import numpy as np

# Load the MNIST training and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Add a trailing channel axis: (samples, 28, 28) -> (samples, 28, 28, 1)
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1)
# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
# One-hot encode the labels
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
num_classes = y_test.shape[1]
The preprocessing steps are exactly the same as those we used in Building Deep Feedforward Neural Networks.
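For readers who want to see what these preprocessing steps do without downloading MNIST, the sketch below reproduces them in plain NumPy on random stand-in data. The `images` and `labels` arrays are hypothetical, and the one-hot encoding uses `np.eye` with a fixed `num_classes = 10` as an assumed equivalent of `to_categorical` for this case.

```python
import numpy as np

# Hypothetical stand-in for a few MNIST samples: 4 grayscale 28 x 28 images
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9, 1])

# Add a trailing channel axis and scale pixel values to [0, 1]
x = images.reshape(images.shape[0], 28, 28, 1).astype('float32') / 255.

# One-hot encode: row i of the identity matrix represents class i
num_classes = 10
y = np.eye(num_classes, dtype='float32')[labels]

print(x.shape, x.dtype, y.shape)
print(y[0])
```

Each label becomes a length-10 vector with a single 1 at the index of its digit, which is the target format `categorical_crossentropy` expects.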
- Build and compile the model:
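model = Sequential()
# 10 convolution kernels of size 3 x 3 applied to the 28 x 28 x 1 input
model.add(Conv2D(10, (3, 3), input_shape=(28, 28, 1), activation='relu'))
# 2 x 2 max pooling halves the width and height
model.add(MaxPool2D(pool_size=(2, 2)))
# Flatten the pooled feature maps into a vector
model.add(Flatten())
# Hidden layer, followed by the softmax output layer
model.add(Dense(512, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])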
A brief summary of the model defined in the preceding code can be printed with:
model.summary()
The model summary output is as follows:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 10)        100
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 10)        0
_________________________________________________________________
flatten (Flatten)            (None, 1690)              0
_________________________________________________________________
dense (Dense)                (None, 512)               865792
_________________________________________________________________
dense_1 (Dense)              (None, 10)                5130
=================================================================
Total params: 871,022
Trainable params: 871,022
Non-trainable params: 0
_________________________________________________________________
The convolutional layer has 100 parameters: it contains 10 convolution kernels of size 3 x 3 x 1, which gives 90 weight parameters plus 10 bias terms (1 per kernel), for a total of 100 parameters. The max pooling layer has no parameters, because it only computes the maximum value within each 2 x 2 pooling window. It can be seen that using a CNN model can greatly reduce the number of network parameters.
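The parameter counts in the summary can be reproduced by hand. The following sketch assumes every layer uses a bias term; `conv2d_params` and `dense_params` are hypothetical helper names introduced only for this check.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter in a 2-D convolutional layer."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit in a fully connected layer."""
    return (in_units + 1) * out_units


conv = conv2d_params(3, 3, 1, 10)          # convolutional layer
hidden = dense_params(13 * 13 * 10, 512)   # flattened vector -> hidden layer
output = dense_params(512, 10)             # hidden layer -> output layer
total = conv + hidden + output

print(conv, hidden, output, total)
```

The totals match the summary above, and they also show that almost all of the parameters sit in the dense layer that follows the flattening step, not in the convolutional layer.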
- Finally, fit the model:
model.fit(x_train, y_train,
          validation_data=(x_test, y_test),
          epochs=10,
          batch_size=1024,
          verbose=1)
After 10 epochs of training, the model above reaches an accuracy of about 98%.
- Next, generate a mean image of the digit 1 and shift it by 1 pixel:
# Select all test images whose label is 1
x_test1 = x_test[y_test[:, 1] == 1]
# Build the mean image from all images labeled 1
pic = np.zeros((x_test.shape[1], x_test.shape[2]))
for i in range(x_test1.shape[0]):
    pic = pic + x_test1[i, :, :, 0]
pic = pic / x_test1.shape[0]
# Shift the mean image one pixel to the left
# (columns 0-20 take the value of their right-hand neighbor)
for i in range(pic.shape[0]):
    if i < 21:
        pic[:, i] = pic[:, i+1]
# Predict the class of the shifted image
p = model.predict(pic.reshape(1, x_test.shape[1], x_test.shape[2], 1))
print(p)
c = np.argmax(p)
print('CNN prediction:', c)
The resulting model output is as follows:
[[1.3430370e-05 9.9989212e-01 2.0535077e-05 2.6301734e-07 4.3278211e-05
5.9122913e-06 1.5874346e-05 6.2533190e-06 2.0079199e-06 4.1732173e-07]]
CNN prediction: 1
We can see that the CNN architecture correctly predicts the class of the shifted image as 1.
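One commonly cited reason a CNN copes with small translations is that max pooling keeps only the largest activation in each window, so an activation that moves within a window leaves the pooled output unchanged. The toy sketch below illustrates this on a hand-made 4 x 4 feature map. `max_pool_2x2` and `shift_left` are hypothetical helpers written for this illustration, not the Keras implementation, and the tolerance only holds when the shift does not cross a window boundary.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """Non-overlapping 2 x 2 max pooling on a 2-D array with even height and width."""
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def shift_left(image, pixels=1):
    """Shift a 2-D array left, filling the vacated right-hand columns with zeros."""
    shifted = np.zeros_like(image)
    shifted[:, :image.shape[1] - pixels] = image[:, pixels:]
    return shifted


# Activation at column 1 moves to column 0: it stays inside the same pooling window
inside = np.zeros((4, 4))
inside[1, 1] = 1.0
same = np.array_equal(max_pool_2x2(inside), max_pool_2x2(shift_left(inside)))

# Activation at column 2 moves to column 1: it crosses into the neighboring window
crossing = np.zeros((4, 4))
crossing[1, 2] = 1.0
changed = not np.array_equal(max_pool_2x2(crossing), max_pool_2x2(shift_left(crossing)))

print(same, changed)
```

In the first case the pooled maps are identical; in the second they differ. Pooling therefore gives only partial tolerance to shifts, and a trained network also relies on the same kernels being applied at every position.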
Related Links
Detailed explanation of the basic concepts of convolutional neural networks