Recognizing the EMNIST dataset by building a convolutional neural network (CNN) with TensorFlow and Keras

The EMNIST dataset contains handwritten letters and digits and uses the same data format as MNIST. See "The EMNIST Dataset | NIST" for details.

1. Importing the required modules:
import tensorflow as tf
import mnist  # the standalone 'mnist' package, used below for parse_idx
from tensorflow.keras import datasets, layers, models
import numpy as np
import matplotlib.pyplot as plt
import gzip, os

Note that the versions of tensorflow, keras, and numpy must be compatible with one another; if they are not, the imports fail. The Python version should not be too new; 3.6 to 3.7 works best. If your Python version does not fit, you can install Anaconda and create a virtual environment with python=3.6.5 in the Anaconda Prompt.

A known working combination is tensorflow=2.3.1, numpy=1.19.5, keras=2.4.3.
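
A quick sanity check (a minimal sketch, not part of the original post) only prints the installed versions so you can compare them against the combination above:

# Print the installed versions to verify they match a known working combination
import tensorflow as tf
import numpy as np
import keras

print("tensorflow:", tf.__version__)
print("numpy:", np.__version__)
print("keras:", keras.__version__)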

2.1 Loading and visualizing the dataset

It is best to use relative paths; the paths below need to be adjusted to your own file layout.

The EMNIST dataset can be downloaded from the official website.

def load_mnist(path):
    # Alternative loader for a .npz archive (not used below); mind the slashes in the path
    f = np.load(path)
    x_train, y_train = f['x_train'], f['y_train']
    x_test, y_test = f['x_test'], f['y_test']
    f.close()
    return (x_train, y_train), (x_test, y_test)

def mnist_parse_file(fname):
    # Parse an IDX file, transparently handling .gz compression
    fopen = gzip.open if os.path.splitext(fname)[1] == '.gz' else open
    with fopen(fname, 'rb') as fd:
        return mnist.parse_idx(fd)

train_images = mnist_parse_file(".\\Dataset\\emnist-letters-train-images-idx3-ubyte.gz")
train_labels = mnist_parse_file(".\\Dataset\\emnist-letters-train-labels-idx1-ubyte.gz")
test_images = mnist_parse_file(".\\Dataset\\emnist-letters-test-images-idx3-ubyte.gz")
test_labels = mnist_parse_file(".\\Dataset\\emnist-letters-test-labels-idx1-ubyte.gz")

Display image 6 of the training set:
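
The original post does not include the plotting code; a minimal sketch with matplotlib (assuming "image 6" means index 5) could look like this:

# Show the sixth training image and its label (index 5 is an assumption)
plt.imshow(train_images[5], cmap='gray')
plt.title("label: {}".format(train_labels[5]))
plt.show()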

[Image: the sixth image of the training set]

2.2 Neural network model

First, check the sizes of the training and test sets to prepare for the next steps:

# Check the size of each set
print(len(train_images), len(train_labels), len(test_images), len(test_labels))
print(test_images[0].shape)

[Output: dataset sizes and the shape of a single image]

The training set contains 124800 images and the test set contains 20800 images.

Next, construct the neural network model

# Initialize a sequential model (plain neural network)
model = models.Sequential()

# First hidden layer, 92 neurons
model.add(layers.Dense(92, input_shape=[784]))
# Second hidden layer, 92 neurons, relu activation
model.add(layers.Dense(92, activation='relu'))
# Third hidden layer, 92 neurons
model.add(layers.Dense(92, activation='relu'))
# Output layer, corresponding to a-z
model.add(layers.Dense(27, activation='softmax'))

# Another way to create the model
#model = tf.keras.models.Sequential([
#  tf.keras.layers.Dense(128, input_shape=[784]),
#  tf.keras.layers.Dense(40, activation='relu'),
#  tf.keras.layers.Dense(10, activation='softmax')
#])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

[Output: model.summary() for the fully connected network]

One question deserves attention here: how do you determine the number of hidden layers and the number of neurons in each layer?

Here is an empirical formula seen on Stack Overflow:

[Image: empirical formula for the number of hidden neurons]
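
The formula in the image is not reproduced in the text; the rule of thumb most often quoted on Stack Overflow (an assumption about which formula the author means) is

N_h = N_s / (alpha * (N_i + N_o))

where N_s is the number of training samples, N_i the number of input neurons, N_o the number of output neurons, and alpha a scaling factor usually chosen between 2 and 10.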

According to this calculation the number of neurons should not exceed 157, and around 75 is preferable. After repeated experiments, however, 92 neurons gave the best results.

relu is used as the activation function, which gives better results.

The output layer needs 27 nodes, because the labels go up to 26; with only 26 nodes the accepted label range would be [0, 26), which excludes 26.
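
A quick way to confirm the label range (a minimal sketch, not in the original post; for EMNIST letters the labels are expected to run from 1 to 26):

print(train_labels.min(), train_labels.max())  # expected output: 1 26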

The number of epochs can be set to 20; training for more epochs does not improve the accuracy.

# Flatten each 28x28 image into a 784-dimensional vector
x_train = train_images.reshape(-1, 784)
x_test = test_images.reshape(-1, 784)
# Scale pixel values to [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0

history = model.fit(x_train, train_labels, epochs=20, validation_data=(x_test, test_labels))
model.save("emnist_ann.model")

Dividing x_train and x_test by 255 normalizes the pixel values to the range [0, 1], which makes training more stable.

The execution results are as follows:

[Output: training log for 20 epochs of the fully connected network]

The final accuracy on the test set is 89.09%.
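
Since model.fit returns a History object, the training and validation accuracy can also be plotted. This sketch is not part of the original post:

# Plot training vs. validation accuracy over the epochs
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()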

2.3 Convolutional Neural Networks

Building a Convolutional Neural Network Model

# Initialize a sequential model (convolutional neural network)
model = models.Sequential()

# First convolutional layer: 128 3x3 kernels, relu activation.
# input_shape is 28x28 (the size of each EMNIST image); the trailing 1 is the
# number of color channels (the images are grayscale, so 1).
model.add(layers.Conv2D(128, (3, 3), activation='relu', input_shape=(28, 28, 1)))
# Another convolutional layer with 64 3x3 kernels; the input shape is inferred from here on
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Max pooling with a 2x2 window
model.add(layers.MaxPooling2D((2, 2)))
# Convolutional layer with 32 3x3 kernels
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.Conv2D(16, (3, 3), activation='relu'))
# Max pooling with a 2x2 window
model.add(layers.MaxPooling2D((2, 2)))
# Convolutional layer with 8 3x3 kernels
model.add(layers.Conv2D(8, (3, 3), activation='relu'))
# Flatten the feature maps; this is the input to the fully connected layers
model.add(layers.Flatten())
# Fully connected layer with 128 neurons, relu activation
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dropout(0.3))  # dropout to reduce overfitting
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dropout(0.3))  # dropout to reduce overfitting
# Output layer, corresponding to a-z
model.add(layers.Dense(27))

model.summary()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])


# The only fit parameters worth explaining are epochs and batch_size:
# epochs is how many full passes over the training set are made,
# and batch_size is how many samples are used for each gradient update.
history = model.fit(train_images.reshape(124800, 28, 28, 1), train_labels, epochs=20, batch_size=32,
                    validation_data=(test_images.reshape(20800, 28, 28, 1), test_labels))
print(history)
model.save("emnist_cnn.model")

The number of neurons in the fully connected layers can be increased moderately to improve accuracy.

[Output: training log for 20 epochs of the convolutional network]

On the test set the prediction accuracy is 91.65%. Compared with the plain neural network, the convolutional neural network achieves higher accuracy, but the model is also more complex.
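
As a possible follow-up (not shown in the original post), the saved model can be reloaded and used to classify a single test image. A minimal sketch:

# Reload the saved CNN and classify one test image
loaded = tf.keras.models.load_model("emnist_cnn.model")
sample = test_images[0].reshape(1, 28, 28, 1)  # same preprocessing as during training (no scaling)
logits = loaded.predict(sample)
print("predicted label:", np.argmax(logits, axis=1)[0], "true label:", test_labels[0])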

Origin blog.csdn.net/qq_56864896/article/details/127787215