Deep Learning Series - The Development History of Classic Deep Learning Network Models (VIII): DenseNet Structure and Features in Detail


Return to the catalog of the development history of classic deep learning network models

Previous: Deep Learning Series - The Development History of Classic Deep Learning Network Models (VII): Inception-ResNet Structure and Features in Detail

Next: Deep Learning Series - The Development History of Classic Deep Learning Network Models (IX): DarkNet Structure, Features, and the Accuracy of Each Model in Detail

 

This section describes the structure and features of DenseNet in detail. The next section covers the structure, features, and accuracy of the DarkNet models.

 

Paper: Densely Connected Convolutional Networks

 

II. Classic Networks (Classic Network)

8. DenseNet 

DenseNet (Dense Convolutional Network) won the CVPR 2017 Best Paper award. The paper mainly compares DenseNet with ResNet and the Inception networks; it borrows ideas from both, but proposes a brand-new structure. The network structure is not complicated, yet it is very effective, surpassing ResNet on the CIFAR benchmarks. It can be said that DenseNet absorbs the most essential part of ResNet and does further innovative work on top of it, improving network performance even further.

(1) Compared with ResNet, DenseNet proposes a more aggressive dense connection mechanism: all layers are connected to each other. Specifically, each layer takes all of the layers in front of it as additional input. In DenseNet, each layer concatenates the feature maps of all preceding layers along the channel dimension (which requires the feature maps of those layers to have the same spatial size) and uses the result as its input. For a network with $L$ layers, DenseNet therefore contains $\frac{L(L+1)}{2}$ connections. Moreover, because DenseNet directly concatenates feature maps from different layers, it achieves feature reuse and improves efficiency; this property is the main difference between DenseNet and ResNet. A minimal sketch of this connection pattern is given below.
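Below is a minimal PyTorch-style sketch of this dense connectivity pattern: every layer receives the concatenation of all preceding feature maps along the channel axis. The class name TinyDenseBlock and the channel counts are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate, num_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            # layer i receives in_channels + i * growth_rate input channels
            self.layers.append(
                nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                          kernel_size=3, padding=1)
            )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # concatenate everything produced so far along the channel axis
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)

block = TinyDenseBlock(in_channels=16, growth_rate=12, num_layers=4)
y = block(torch.randn(1, 16, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32]) -> 16 + 4 * 12 = 64 channels
```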

 

(2) Diagram of the DenseNet dense connection structure (Dense Block):

 

(3) Within a Dense Block, the feature maps of all layers have the same spatial size, so they can be concatenated along the channel dimension. The nonlinear composite function $H(x)$ in a Dense Block uses the structure $BN + ReLU + 3 \times 3 \; conv$. It is also worth noting that, unlike ResNet, every layer in a Dense Block outputs $k$ feature maps after convolution, i.e., the resulting feature map has $k$ channels (equivalently, $k$ convolution kernels are used). In DenseNet, $k$ is called the growth rate and is a hyperparameter. In general, a small $k$ (e.g., 12) is enough for good performance. The number of input channels of layer $l$ is $k_{0} + k(l - 1)$, so as the number of layers increases, the input inside a Dense Block becomes very large even though $k$ is set small. This is a consequence of feature reuse: each layer contributes only $k$ feature maps of its own. A minimal sketch of one such layer follows.
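A minimal sketch of one composite function $H(x)$ as described above (BN, then ReLU, then a 3×3 convolution producing exactly $k$ feature maps), assuming a PyTorch implementation; the class name DenseLayer and the numbers used below are illustrative.

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """One composite function H(x): BN -> ReLU -> 3x3 conv with k output maps."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.h = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, growth_rate, kernel_size=3,
                      padding=1, bias=False),
        )

    def forward(self, x):
        return self.h(x)

k0, k, l = 16, 12, 5                        # initial channels, growth rate, layer index
layer = DenseLayer(k0 + k * (l - 1), k)     # layer l sees k0 + k*(l-1) input channels
out = layer(torch.randn(1, k0 + k * (l - 1), 32, 32))
print(out.shape)                            # torch.Size([1, 12, 32, 32])
```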

 

(4) Since the input of later layers becomes very large, the layers inside a Dense Block can adopt a bottleneck structure to reduce computation, i.e., a $1 \times 1$ convolution is added to the original structure: $BN + ReLU + 1 \times 1 \; conv \rightarrow BN + ReLU + 3 \times 3 \; conv$, which is referred to as DenseNet-B.

       The $1 \times 1 \; conv$ produces $4k$ feature maps; its role is to reduce the number of input features and thereby improve computational efficiency. A sketch of this bottleneck layer is given below.
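A hedged sketch of the bottleneck variant (DenseNet-B), assuming a PyTorch implementation: the 1×1 convolution first reduces the growing input to $4k$ channels before the 3×3 convolution. The class name and the example channel count are illustrative, not the reference implementation.

```python
import torch
import torch.nn as nn

class BottleneckDenseLayer(nn.Module):
    """DenseNet-B layer: BN -> ReLU -> 1x1 conv (4k maps) -> BN -> ReLU -> 3x3 conv (k maps)."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        inter_channels = 4 * growth_rate      # the "4k" intermediate width
        self.h = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, inter_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(inter_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(inter_channels, growth_rate, kernel_size=3,
                      padding=1, bias=False),
        )

    def forward(self, x):
        return self.h(x)

layer = BottleneckDenseLayer(in_channels=256, growth_rate=12)
print(layer(torch.randn(1, 256, 16, 16)).shape)   # torch.Size([1, 12, 16, 16])
```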

 

(5) In CNNs, pooling or convolutions with stride > 1 are typically used to reduce the feature map size, while the dense connections in DenseNet require feature maps of consistent size. To resolve this, DenseNet adopts a Dense Block + Transition structure: a Dense Block is a module containing many layers whose feature maps all have the same size and which are densely connected to each other, while a Transition module connects two adjacent Dense Blocks and shrinks the feature map size via pooling.

 

(6) The Transition layer mainly connects two adjacent Dense Blocks and reduces the feature map size. It consists of a $1 \times 1 \; conv$ and a $2 \times 2 \; avg\_pooling$: $BN + ReLU + 1 \times 1 \; conv \rightarrow 2 \times 2 \; avg\_pooling$. In addition, the Transition layer can act as a compression module. Suppose the Dense Block in front of the Transition layer outputs feature maps with $m$ channels; the Transition layer can generate $\theta m$ feature maps (via its convolutional layer), where $\theta \in (0, 1]$ is the compression factor (compression rate). When $\theta = 1$, the number of feature maps does not change after the Transition layer, i.e., there is no compression; when the compression factor is less than 1, the structure is called DenseNet-C. The combination of Dense Blocks with bottleneck layers and Transition layers with a compression factor less than 1 is referred to as DenseNet-BC. A sketch of such a Transition layer is given below.
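A sketch of such a Transition layer with compression, assuming a PyTorch implementation: BN, ReLU, a 1×1 convolution that keeps $\lfloor \theta m \rfloor$ channels, and a 2×2 average pooling that halves the spatial size. The default $\theta = 0.5$ below is only an example value.

```python
import math
import torch
import torch.nn as nn

class Transition(nn.Module):
    """Transition layer: BN -> ReLU -> 1x1 conv (theta * m maps) -> 2x2 average pooling."""
    def __init__(self, in_channels, theta=0.5):
        super().__init__()
        out_channels = int(math.floor(theta * in_channels))   # compression
        self.trans = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.AvgPool2d(kernel_size=2, stride=2),             # halve the spatial size
        )

    def forward(self, x):
        return self.trans(x)

t = Transition(in_channels=256, theta=0.5)
print(t(torch.randn(1, 256, 32, 32)).shape)   # torch.Size([1, 128, 16, 16])
```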

 

(7) The overall DenseNet network structure:
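The following compact sketch shows how the components above are typically assembled into a full network: an initial convolution, several Dense Blocks alternating with Transition layers, and a global-pooling classifier head. The depths, growth rate, and class count (TinyDenseNet, three blocks of 6 layers, 10 classes) are illustrative choices, not the paper's exact configurations.

```python
import torch
import torch.nn as nn

def dense_layer(in_ch, k):
    # one composite function H(x): BN -> ReLU -> 3x3 conv producing k maps
    return nn.Sequential(
        nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, k, kernel_size=3, padding=1, bias=False))

class DenseBlock(nn.Module):
    def __init__(self, in_ch, k, num_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            dense_layer(in_ch + i * k, k) for i in range(num_layers))
        self.out_channels = in_ch + num_layers * k

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

def transition(in_ch, theta=0.5):
    # BN -> ReLU -> 1x1 conv (compression) -> 2x2 average pooling
    out_ch = int(theta * in_ch)
    return nn.Sequential(
        nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        nn.AvgPool2d(2)), out_ch

class TinyDenseNet(nn.Module):
    def __init__(self, k=12, num_classes=10):
        super().__init__()
        ch = 2 * k
        stem = [nn.Conv2d(3, ch, kernel_size=3, padding=1, bias=False)]
        blocks = []
        for i, n in enumerate([6, 6, 6]):          # three Dense Blocks of 6 layers
            db = DenseBlock(ch, k, n)
            blocks.append(db)
            ch = db.out_channels
            if i < 2:                              # Transition between adjacent blocks
                tr, ch = transition(ch)
                blocks.append(tr)
        self.features = nn.Sequential(*stem, *blocks)
        self.head = nn.Sequential(                 # global pooling + linear classifier
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(ch, num_classes))

    def forward(self, x):
        return self.head(self.features(x))

net = TinyDenseNet()
print(net(torch.randn(2, 3, 32, 32)).shape)        # torch.Size([2, 10])
```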

 

(8) Advantages of DenseNet:

   ①. Thanks to the dense connections, DenseNet strengthens the backpropagation of gradients, making the network easier to train. Since every layer has a direct path to the final error signal, an implicit "deep supervision" is realized, which alleviates the vanishing-gradient problem and allows the network to go deeper.

   ②. The network is narrower, has fewer parameters, and is more computationally efficient.

   ③. Short-circuit connections are realized by concatenating features, which achieves feature reuse; and because a small growth rate is used, the feature maps unique to each layer are relatively few.

   ④. Thanks to feature reuse, features are used more efficiently, and even the final classifier makes use of low-level features.

 

(9) Disadvantages of DenseNet:

   ①. DenseNet consumes a great deal of memory during training; however, this is not inherent to the algorithm itself.

   ②. Current deep learning frameworks do not support DenseNet's dense connections well, so they can only be implemented with repeated concatenation operations: the output of the current layer is concatenated with the outputs of the earlier layers and then passed to the next layer. In most frameworks (such as TensorFlow), each concatenation operation allocates new memory to hold the concatenated features. As a result, an $L$-layer network consumes memory equivalent to roughly $\frac{L(L+1)}{2}$ layers (the output of layer $i$ is stored in memory $L - i + 1$ times).

 

                  

 

