CNN，卷积神经网络学习记录

通过对斯坦福大学2017秋季，卷积神经网络的公开课CS231n，做以总结，辅助学习Tensorflow
地址：https://cs231n.github.io/convolutional-networks/

Convolutional Neural Networks（CNN）：

首先卷积神经网络和传统的神经网络类似都需要接受一些inputs，也拥有可以进行学习的weight和biases，他也有激励函数和loss函数等等

卷积神经网络的架构：

　　首先他也有input和output 的layers， input为数据输入层，output为结果输出层。

　　在图像识别领域，input代表了一系列的输入的图片，比如[32,32,3]就是32x32像素的3原色组成的彩色图片。

　　CONV layer，也就是卷积层，卷积层会计算连接到input区域的weight，会用到激励函数，举例结果可能是[32,32,12]如果我们设置了12个filter的话。

　　POOL layer，池层的主要作用是压缩input的大小，比如[16,16,12]

　　FC layer(fully-connected layer) 把像素等值转化为一个[1,1,10]的值（因为数字从0-9）得出结果

In this way, ConvNets transform the original image layer by layer from the original pixel values to the final class scores. Note that some layers contain parameters and other don’t. In particular, the CONV/FC layers perform transformations that are a function of not only the activations in the input volume, but also of the parameters (the weights and biases of the neurons). On the other hand, the RELU/POOL layers will implement a fixed function. The parameters in the CONV/FC layers will be trained with gradient descent so that the class scores that the ConvNet computes are consistent with the labels in the training set for each image.

流程图：

接下来做一下各个层的详细解释以及应用：

首先是卷积层(CONV) ：在卷积层中，当一个高维度的input传入时，比如3维的图片，仅仅会有一部分的神经元去进行连接，连接的区域就叫做，感知区域(receptive field)，拿图片举例，在连接过程中，width 和height是可以仅仅连接到感知区域的神经元，但是depth必须是全连接，举例说明，一个input是[32x32x3]，如果感知区域是[5x5]则卷积层的weight就为[5x5x3]也就是75个weights，加上一个bias。

如图：

Spatial arrangement：我们解释了卷积层如何处理input，然后处理完input后，如果处理卷积层的output，这里有三个参数控制了输出的大小，depth（深度），stride（步长），zero-padding（补余）。

depth：代表了有多少个filter，也就是在一个相同的region中，有多少个神经元。

stride：步长，代表整个filter每次移动多少个像素。

zero-padding：用来保证输入和输出的width和height相同，用0来补充矩阵。

CNN，卷积神经网络学习记录

Convolutional Neural Networks（CNN）：

猜你喜欢