Keras Deep Learning - The Effect of Batch Size on the Accuracy of Neural Network Models


The effect of batch size on model accuracy

When training the original neural network, we used a batch size of 64 for every model we built. In this section, we study the effect of changing the batch size on accuracy. To explore this, let's compare two cases:

  • batch size of 4096
  • batch size of 64

When the batch size is large, fewer weight updates occur in each epoch. When the batch size is small, many weight updates occur per epoch: every epoch must traverse all the training data in the dataset, so if each batch computes the loss over less data, more batches are needed to cover the entire dataset, and each batch triggers one weight update. Therefore, the smaller the batch size, the better the accuracy of a model trained for the same number of epochs. However, you should also make sure the batch size is not so small that it causes overfitting.

In the previous model, we trained with a batch size of 64. In this section, we keep the same model architecture and modify only the batch size used during training, to compare the impact of different batch sizes on model performance. Preprocess the dataset and fit the model:

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

num_pixels = x_train.shape[1] * x_train.shape[2]
x_train = x_train.reshape(-1, num_pixels).astype('float32')
x_test = x_test.reshape(-1, num_pixels).astype('float32')
x_train = x_train / 255.
x_test = x_test / 255.

y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
model = Sequential()
model.add(Dense(1000, input_dim=num_pixels, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])

history = model.fit(x_train, y_train,
                    validation_data=(x_test, y_test),
                    epochs=50,
                    batch_size=4096,
                    verbose=1)

The only change to the code is the batch_size parameter. Plot the training and test accuracy and loss across epochs (the plotting code is exactly the same as that used when training the original neural network):
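For reference, the kind of plotting code referred to above can be sketched as follows. The helper name `plot_history` and the two-panel layout are assumptions for illustration; the metric keys (`acc`, `val_acc`, `loss`, `val_loss`) follow from compiling the model with `metrics=['acc']` in older Keras versions, as in the code above.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, safe for scripts
import matplotlib.pyplot as plt

def plot_history(hist_dict):
    """Plot train/test accuracy and loss over epochs.

    hist_dict is the `history.history` dict returned by model.fit,
    containing per-epoch lists under 'acc', 'val_acc', 'loss', 'val_loss'.
    """
    epochs = range(1, len(hist_dict['acc']) + 1)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    # Left panel: accuracy
    ax1.plot(epochs, hist_dict['acc'], label='train accuracy')
    ax1.plot(epochs, hist_dict['val_acc'], label='test accuracy')
    ax1.set_xlabel('epoch')
    ax1.set_ylabel('accuracy')
    ax1.legend()
    # Right panel: loss
    ax2.plot(epochs, hist_dict['loss'], label='train loss')
    ax2.plot(epochs, hist_dict['val_loss'], label='test loss')
    ax2.set_xlabel('epoch')
    ax2.set_ylabel('loss')
    ax2.legend()
    return fig

# Usage after training: plot_history(history.history)
```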

Accuracy and loss values for training and testing

In the figure above, we can see that, compared with the smaller-batch model, the model with the larger batch size needs more epochs of training before its accuracy reaches 98%. In this section's model, accuracy is relatively low in the early stages of training, and only after running for a considerable number of epochs does it reach a high level. The reason is that with a large batch size, far fewer weight updates are performed in each epoch.

The dataset contains 60000 samples in total. When we run the model for 500 epochs with a batch size of 4096, about 500 × (60000 ÷ 4096) ≈ 7324 weight updates are performed. With a batch size of 64, about 500 × (60000 ÷ 64) ≈ 468750 weight updates are performed. Therefore, the smaller the batch size, the more weight updates occur, and for the same number of epochs accuracy is usually better. At the same time, note that the batch size should not be too small, as this can lead to excessively long training times and overfitting.

Related links

Keras Deep Learning: Training the Original Neural Network

Keras Deep Learning: Scaling the Input Dataset to Improve Neural Network Performance


Reprinted from: juejin.im/post/7085492398142259236