Table of Contents
Factorizing large filters into asymmetric convolutions
Redesigned pooling layer
Auxiliary classifier
Label smoothing
References
The previous two posts, "Deep Learning Interview Question 20: GoogLeNet (Inception V1)" and "Deep Learning Interview Question 26: GoogLeNet (Inception V2)", introduced the first two versions of Inception; the following explains the innovations of the V3 version.
Factorizing large filters into asymmetric convolutions |
Inception V3 applies asymmetric convolutions in the deeper layers of the network. Their advantage is that they reduce the number of model parameters without lowering accuracy; they were introduced in "Deep Learning Interview Question 27: Asymmetric Convolutions".
end_point = 'Mixed_6d'
with tf.variable_scope(end_point):
  with tf.variable_scope('Branch_0'):
    branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
  with tf.variable_scope('Branch_1'):
    branch_1 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')
    branch_1 = slim.conv2d(branch_1, depth(160), [1, 7],
                           scope='Conv2d_0b_1x7')
    branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],
                           scope='Conv2d_0c_7x1')
  with tf.variable_scope('Branch_2'):
    branch_2 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')
    branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],
                           scope='Conv2d_0b_7x1')
    branch_2 = slim.conv2d(branch_2, depth(160), [1, 7],
                           scope='Conv2d_0c_1x7')
    branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],
                           scope='Conv2d_0d_7x1')
    branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],
                           scope='Conv2d_0e_1x7')
  with tf.variable_scope('Branch_3'):
    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
    branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],
                           scope='Conv2d_0b_1x1')
  net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
end_points[end_point] = net
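The parameter savings of replacing a 7x7 filter with a 1x7 followed by a 7x1 can be checked with simple arithmetic. The sketch below uses 160 channels to mirror the branches above, but the 2/7 ratio holds for any channel counts:

```python
def conv_params(kh, kw, c_in, c_out):
    """Weight count of a kh x kw convolution layer (biases ignored)."""
    return kh * kw * c_in * c_out

c = 160  # channel count, matching the 1x7/7x1 branches above
full = conv_params(7, 7, c, c)                                # one 7x7 conv
factored = conv_params(1, 7, c, c) + conv_params(7, 1, c, c)  # 1x7 then 7x1

print(full, factored, factored / full)  # the factored pair uses 2/7 of the weights
```

The same reasoning is why V2 replaced 5x5 filters with two stacked 3x3 filters; V3 pushes it further along the spatial axes.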
Redesigned pooling layer |
There are two straightforward ways to reduce the feature-map size:
The left approach is to pool first; this hurts network performance, because pooling compresses the feature map before the channels are expanded, which can create a representational bottleneck in the network.
The right approach is to increase the number of channels first and then pool; this avoids the bottleneck but increases the amount of computation.
Therefore, Inception V3 reduces the grid size in the following manner:
The left and right diagrams are equivalent; the right one is simply a more compact representation.
The idea is to convolve and pool in parallel branches, each with stride 2, and then concatenate the results.
This not only reduces the computation but also avoids the representational bottleneck.
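The trade-off can be illustrated by counting multiply-accumulates for the channel-expanding convolution in each scheme. This is only a sketch with hypothetical numbers (a 35x35 grid, 320 channels, 1x1 convolutions); the real blocks use larger kernels, but the relative ordering is the point:

```python
def conv_macs(d_out, c_in, c_out, k=1):
    """Multiply-accumulates of a k x k conv producing a d_out x d_out output map."""
    return d_out * d_out * k * k * c_in * c_out

d, k_ch = 35, 320  # hypothetical grid size and channel count

# Option 1: pool first (d -> d/2), then expand channels 320 -> 640.
# Cheap, but the expansion reads an already-compressed map (the bottleneck).
pool_first = conv_macs(d // 2, k_ch, 2 * k_ch)

# Option 2: expand channels at full resolution, then pool.
# No bottleneck, but roughly 4x the computation of option 1.
conv_first = conv_macs(d, k_ch, 2 * k_ch)

# Inception V3's parallel scheme: a stride-2 conv branch producing 320 channels,
# concatenated with a stride-2 pool branch keeping 320 channels (640 total).
parallel = conv_macs(d // 2, k_ch, k_ch)

print(pool_first, conv_first, parallel)
```

The parallel branch is the cheapest of the three while still producing the expanded channel count at the reduced grid size.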
Auxiliary classifier |
Inception V3 removes the first (lower) auxiliary classifier used in earlier versions.
Label smoothing |
As already discussed in "Deep Learning Interview Question 27: Asymmetric Convolutions", label smoothing has the effect of preventing overfitting.
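The mechanics are simple: the one-hot target is mixed with a uniform distribution over the K classes, so the network is never pushed toward fully saturated logits. A minimal sketch, using the epsilon = 0.1 value from the Inception V3 paper:

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Mix a one-hot target with a uniform distribution over K classes:
    y_smooth = (1 - epsilon) * y + epsilon / K."""
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]

# Each wrong class receives epsilon/K; the true class keeps 1 - epsilon + epsilon/K.
print(smooth_labels([0, 0, 1, 0]))  # approximately [0.025, 0.025, 0.925, 0.025]
```

The smoothed vector still sums to 1, so it remains a valid target for a cross-entropy loss.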
References |
Rethinking the Inception Architecture for Computer Vision
GoogLeNet的心路历程(四)
https://www.jianshu.com/p/0cc42b8e6d25