[Reading 1] [2017] MATLAB and Deep Learning: The Pooling Layer (1)

Because pooling is a two-dimensional operation, and a purely textual explanation may cause more confusion than it resolves, let's go through an example.

Consider a 4×4 pixel input image, expressed by the matrix shown in Figure 6-15.

Figure 6-15. The 4×4 pixel input image

We group the pixels of the input image into 2×2 blocks, with no pixel belonging to more than one block.

Once the input image passes through the pooling layer, it shrinks into a 2×2 pixel image.

Figure 6-16 shows the results of pooling with mean pooling and with max pooling.

Figure 6-16. The results of pooling using two different methods
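
The two pooling methods can be sketched in a few lines. This is a minimal Python illustration rather than the book's MATLAB code, and the matrix values are hypothetical, since the actual values of Figure 6-15 are not reproduced in the text. The helper name pool2x2 is likewise an assumption made for this sketch.

```python
# Hypothetical 4x4 input; illustrative values, not those of Figure 6-15.
image = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [4, 6, 3, 5],
]


def pool2x2(img, reduce_fn):
    """Apply reduce_fn to every non-overlapping 2x2 block of img."""
    out = []
    for r in range(0, len(img), 2):          # step by 2: blocks never overlap
        row = []
        for c in range(0, len(img[0]), 2):
            block = [img[r][c], img[r][c + 1],
                     img[r + 1][c], img[r + 1][c + 1]]
            row.append(reduce_fn(block))
        out.append(row)
    return out


def mean(values):
    return sum(values) / len(values)


mean_pooled = pool2x2(image, mean)   # [[4.0, 5.0], [5.25, 2.25]]
max_pooled = pool2x2(image, max)     # [[7, 8], [9, 5]]
print(mean_pooled)
print(max_pooled)
```

Each output pixel summarizes one 2×2 block, so the 4×4 input becomes a 2×2 output in both cases.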

In a mathematical sense, the pooling process is actually a type of convolution operation.

It differs from the convolution layer in that the convolution filter is fixed rather than trained, and the convolution areas do not overlap.
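
To make the connection concrete, the sketch below computes mean pooling as a convolution with a fixed 2×2 filter whose weights are all 0.25, moved two pixels at a time so that the areas do not overlap. It is a Python illustration under stated assumptions: the function name conv2d_strided and the input values are hypothetical, and the loop computes a correlation, which equals a convolution here because the filter is symmetric. The sketch covers mean pooling only; max pooling follows the same sliding pattern but takes a maximum instead of a weighted sum.

```python
def conv2d_strided(img, kernel, stride):
    """Slide kernel over img in steps of `stride` and sum the products."""
    k = len(kernel)
    out = []
    for r in range(0, len(img) - k + 1, stride):
        row = []
        for c in range(0, len(img[0]) - k + 1, stride):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += img[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 4x4 input (illustrative values only).
image = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [4, 6, 3, 5],
]

# Fixed filter: every weight is 1/4, and nothing here is learned.
mean_filter = [[0.25, 0.25],
               [0.25, 0.25]]

# A stride equal to the filter width keeps the areas from overlapping.
pooled = conv2d_strided(image, mean_filter, stride=2)
print(pooled)  # [[4.0, 5.0], [5.25, 2.25]]
```

With a stride of 1 the same function would behave like an ordinary convolution over overlapping areas and would return a 3×3 result instead.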

The example provided in the next section will elaborate on this.

The pooling layer compensates, to some extent, for objects that are off-center or tilted.

For example, the pooling layer can improve the recognition of a cat that is off-center in the input image.
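
A toy case shows where this tolerance comes from and why it is only partial. In the Python sketch below, a single bright pixel stands in for a feature; the arrays and the helper max_pool2x2 are hypothetical and chosen only for illustration. Moving the pixel inside its 2×2 block leaves the max-pooled output unchanged, while moving it across a block boundary does change the output.

```python
def max_pool2x2(img):
    """Max pooling over non-overlapping 2x2 blocks."""
    return [
        [max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
         for c in range(0, len(img[0]), 2)]
        for r in range(0, len(img), 2)
    ]


# Bright pixel at row 1, column 1.
original = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Same pixel moved to row 0, column 0: still inside the top-left block.
shifted_within_block = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Same pixel moved to row 1, column 2: now in the top-right block.
shifted_across_blocks = [
    [0, 0, 0, 0],
    [0, 0, 9, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

print(max_pool2x2(original))               # [[9, 0], [0, 0]]
print(max_pool2x2(shifted_within_block))   # [[9, 0], [0, 0]]
print(max_pool2x2(shifted_across_blocks))  # [[0, 9], [0, 0]]
```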

In addition, because the pooling process reduces the image size, it is highly beneficial for reducing the computational load and preventing overfitting.

Example: MNIST

We implement a neural network that takes an input image and recognizes the digit that it represents.

The training data is the MNIST (Modified National Institute of Standards and Technology) database, which contains 70,000 images of handwritten digits.

In general, 60,000 images are used for training, and the remaining 10,000 images are used for the validation test.

Each digit image is a 28-by-28 pixel black-and-white image, as shown in Figure 6-17.

Figure 6-17. A 28-by-28 pixel black-and-white image from the MNIST database

To keep the training time manageable, this example uses only 10,000 images, split into training data and verification data in an 8:2 ratio.
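
An 8:2 split of 10,000 images gives 8,000 for training and 2,000 for verification. The Python sketch below shows one way to produce such a split over image indices. The text does not say how the book selects the images, so the shuffle, the fixed seed, and all variable names here are assumptions made for illustration.

```python
import random

NUM_IMAGES = 10_000  # subset size stated in the text

# Assumption: shuffle indices before splitting; the text does not specify this.
indices = list(range(NUM_IMAGES))
random.Random(0).shuffle(indices)  # fixed seed so the split is repeatable

# 8:2 ratio, computed with integer arithmetic to avoid rounding surprises.
num_train = NUM_IMAGES * 8 // 10

train_indices = indices[:num_train]
verify_indices = indices[num_train:]

print(len(train_indices), len(verify_indices))  # 8000 2000
```

The two index lists are disjoint, so no image is used for both training and verification.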

This article is translated from "MATLAB Deep Learning" by Phil Kim.
