LeNet Study Notes

The structure of LeNet:

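data (28*28) ---conv layer 1 (20 5*5 kernels, stride 1)---> 20*24*24 ---pooling layer 1 (size 2, stride 2)---> 20*12*12 ---conv layer 2 (50 5*5 kernels, stride 1)

---> 50*8*8 ---pooling layer 2 (size 2, stride 2)---> 50*4*4 ---IP1 (fully connected layer, 500 outputs)---> 500*1 ---IP2 (fully connected layer, 10 outputs)---> 10*1 ---label---> loss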
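The shapes in the chain above can be checked with a few lines of arithmetic. This is only a sketch: the helper names (conv_out, pool_out, lenet_shapes) are made up for illustration, a single-channel input with no padding is assumed, and the rounding rules (round down for convolution, round up for pooling) are what I understand Caffe to use. For these sizes everything divides evenly, so the rounding does not matter.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a pooling layer (rounds up, no padding)."""
    return -(-(size - kernel) // stride) + 1


def lenet_shapes(input_size=28):
    """Return (layer name, output shape) pairs for the network described above."""
    shapes = [("data", (1, input_size, input_size))]  # 1 channel is an assumption
    s = conv_out(input_size, kernel=5, stride=1)
    shapes.append(("conv1", (20, s, s)))
    s = pool_out(s, kernel=2, stride=2)
    shapes.append(("pool1", (20, s, s)))
    s = conv_out(s, kernel=5, stride=1)
    shapes.append(("conv2", (50, s, s)))
    s = pool_out(s, kernel=2, stride=2)
    shapes.append(("pool2", (50, s, s)))
    shapes.append(("ip1", (500,)))
    shapes.append(("ip2", (10,)))
    return shapes


for name, shape in lenet_shapes():
    print(name, shape)
```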

Points I had trouble understanding:

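1. data (28*28) ---conv layer 1 (20 5*5 kernels, stride 1)---> 20*24*24

With stride 1 and no padding, each kernel produces one 24*24 feature map, since 24 = (28-5)/1 + 1. With 20 kernels the output is 20*24*24.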

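2. 20*12*12 ---conv layer 2 (50 5*5 kernels, stride 1)---> 50*8*8

My understanding: each of the 50 kernels is really 20*5*5, with one 5*5 slice per input channel. Each of the 20 12*12 inputs is convolved with its own slice, and the 20 results are summed (plus a bias) into a single 8*8 map, where 8 = 12-5+1. With 50 kernels the output is 50*8*8.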
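A small NumPy sketch of that summation over input channels, using random data in place of real feature maps. The function name conv2d_multichannel is made up for illustration, and stride 1 with no padding is assumed. As in most deep learning frameworks, the "convolution" here does not flip the kernel (it is technically cross-correlation).

```python
import numpy as np


def conv2d_multichannel(x, w, b):
    """x: (C_in, H, W), w: (C_out, C_in, K, K), b: (C_out,). Stride 1, no padding."""
    c_out, c_in, k, _ = w.shape
    h_out = x.shape[1] - k + 1
    w_out = x.shape[2] - k + 1
    y = np.zeros((c_out, h_out, w_out))
    for o in range(c_out):
        for i in range(h_out):
            for j in range(w_out):
                # Multiply the (C_in, K, K) window by kernel o and sum over
                # all input channels and all kernel positions at once.
                y[o, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[o]) + b[o]
    return y


rng = np.random.default_rng(0)
x = rng.standard_normal((20, 12, 12))    # output of pooling layer 1
w = rng.standard_normal((50, 20, 5, 5))  # 50 kernels, each 20*5*5
b = np.zeros(50)
y = conv2d_multichannel(x, w, b)
print(y.shape)  # (50, 8, 8)
```

Convolving each input channel separately with its own 5*5 slice and adding up the 20 results gives the same map, which is the "accumulate" step described above.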

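3. 50*4*4 ---IP1 (fully connected layer, 500 outputs)---> 500*1

50*4*4 flattens to 800 values and the output has 500 values, so the layer actually holds 800*500 = 400,000 weights, plus 500 biases.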
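The same counting can be applied to every layer. A minimal sketch, assuming a single-channel input and one bias per output (the helper names conv_params and fc_params are made up for illustration):

```python
def conv_params(c_in, c_out, k):
    """Weights (c_out kernels of size c_in*k*k) plus one bias per kernel."""
    return c_out * c_in * k * k + c_out


def fc_params(n_in, n_out):
    """Weights (n_in*n_out) plus one bias per output."""
    return n_in * n_out + n_out


params = {
    "conv1": conv_params(1, 20, 5),       # 20*1*5*5 + 20 = 520
    "conv2": conv_params(20, 50, 5),      # 50*20*5*5 + 50 = 25050
    "ip1": fc_params(50 * 4 * 4, 500),    # 800*500 + 500 = 400500
    "ip2": fc_params(500, 10),            # 500*10 + 10 = 5010
}
total = sum(params.values())
print(params, total)
```

Under these assumptions IP1 alone accounts for most of the parameters in the network.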

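4. Why use two fully connected layers?

5. What is the role of a fully connected layer?

6. Convolution layers summarize local structure, and pooling layers absorb small translations. I still need to understand how scaling and translation are represented....

7. Judging from the network structure, there is no rotation invariance. Which kind of layers could provide it?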

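Optimization:

After training the model, testing shows that the recognition rate on digits we write with a paint program is not high. I think a large part of the reason is that the training set and our own handwriting actually follow different distributions. To improve generalization, I added a dropout layer: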

...
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "ip1"
  top: "ip1"
  dropout_param {
    dropout_ratio: 0.4
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
...
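For reference, here is a minimal NumPy sketch of what a dropout layer with dropout_ratio 0.4 computes. It assumes the "inverted dropout" convention, which as far as I know is what Caffe's Dropout layer uses: during training each activation is zeroed with probability 0.4 and the survivors are scaled by 1/(1-0.4), and at test time the input passes through unchanged. The function name and signature are made up for illustration.

```python
import numpy as np


def dropout(x, ratio=0.4, train=True, rng=None):
    """Inverted dropout: zero each element with probability `ratio` and
    rescale the rest so the expected activation stays the same."""
    if not train:
        return x  # test phase: identity
    if rng is None:
        rng = np.random.default_rng()
    keep = 1.0 - ratio
    mask = rng.random(x.shape) < keep  # True where the unit is kept
    return x * mask / keep


rng = np.random.default_rng(0)
ip1 = np.ones(100000)  # stand-in for the ip1 activations
dropped = dropout(ip1, ratio=0.4, train=True, rng=rng)
print((dropped == 0).mean())  # fraction of zeroed units, roughly 0.4
print(dropped.mean())         # roughly 1.0, thanks to the rescaling
```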

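I tested ratios of 0.3, 0.4, 0.5, 0.6, and 0.7. 0.4 worked best, and the recognition rate on digits written directly in the paint program went up somewhat.

[Figure: screenshot]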
