[Deep Learning] Classification network structures: ResNet, ResNeXt, DenseNet, DPN, MobileNet, ShuffleNet

RESNET

Skip connection: the input of a block is added directly to its output (identity shortcut), which eases gradient flow and makes very deep networks trainable.
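A minimal sketch of a residual block with a skip connection, written in PyTorch (the framework and layer sizes are my assumption, not from the original post):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions plus an identity shortcut: output = F(x) + x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                        # the skip connection keeps the input
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)    # add the shortcut back before the final ReLU
```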

 

RESNEXT

Split-transform-merge

Increasing the cardinality (the number of parallel paths) improves accuracy more effectively than making the network deeper or wider. (why?)
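A sketch of a ResNeXt-style block (PyTorch assumed; the cardinality and width values are illustrative). The split-transform-merge paths are implemented with a grouped 3x3 convolution, where `groups` equals the cardinality:

```python
import torch.nn as nn

class ResNeXtBlock(nn.Module):
    """Bottleneck block whose 3x3 conv is split into `cardinality` parallel paths."""
    def __init__(self, channels, cardinality=32, bottleneck_width=4):
        super().__init__()
        inner = cardinality * bottleneck_width
        self.transform = nn.Sequential(
            nn.Conv2d(channels, inner, 1, bias=False),    # split / reduce
            nn.BatchNorm2d(inner), nn.ReLU(inplace=True),
            nn.Conv2d(inner, inner, 3, padding=1,
                      groups=cardinality, bias=False),    # transform each path separately
            nn.BatchNorm2d(inner), nn.ReLU(inplace=True),
            nn.Conv2d(inner, channels, 1, bias=False),    # merge back
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.transform(x))           # same shortcut as ResNet
```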

 

DENSENET

Addition is replaced by concatenation: feature maps are joined in parallel along the channel dimension rather than summed.

Strengthens the shortcut idea: every layer is directly connected to all the layers that follow it.
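A sketch of a dense block (PyTorch assumed; the growth rate and depth are illustrative). Features are concatenated rather than added, so every layer sees the outputs of all earlier layers:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer receives the concatenation of the block input and all previous outputs."""
    def __init__(self, in_channels, growth_rate=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            ch = in_channels + i * growth_rate
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth_rate, 3, padding=1, bias=False),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # concatenate, don't add
            features.append(out)
        return torch.cat(features, dim=1)
```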

 

DPN

Integrates the residual path and the densely connected path in one block, so that the strengths of each compensate for the weaknesses of the other.
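A much-simplified sketch of the idea (PyTorch assumed; channel sizes are illustrative and the real DPN block uses a grouped bottleneck): one shared transformation feeds both a residual path, updated by addition, and a dense path, which grows by concatenation:

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    """Toy dual-path block: part of the output is added (residual path),
    the rest is concatenated (dense path)."""
    def __init__(self, res_channels, dense_channels, dense_growth):
        super().__init__()
        in_ch = res_channels + dense_channels
        self.transform = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, res_channels + dense_growth, 1, bias=False),
        )
        self.res_channels = res_channels

    def forward(self, res_path, dense_path):
        out = self.transform(torch.cat([res_path, dense_path], dim=1))
        res_out = res_path + out[:, :self.res_channels]                         # residual: add
        dense_out = torch.cat([dense_path, out[:, self.res_channels:]], dim=1)  # dense: concat
        return res_out, dense_out
```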

------------------------------------------- The following are lightweight networks -------------------------------------------

MOBILENET

v1

1. Drops pooling and uses stride=2 convolutions for downsampling instead. (Other networks: depends on the specific application.)

2. Replaces a standard convolution with a two-step depthwise separable convolution; advantage: much less computation. (See the sketch below.)

3. Uses two hyperparameters to balance computation speed against accuracy: a width multiplier (adjusts the number of channels in each layer) and a resolution multiplier (adjusts the resolution of the input image).

(What does lightweight mean, and why is this network lightweight?) Lightweight: fewer parameters and less computation.
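A sketch of the depthwise separable convolution from point 2 (PyTorch assumed): a depthwise 3x3 convolution that filters each channel on its own, followed by a pointwise 1x1 convolution that mixes channels. A width multiplier would simply scale `in_ch`/`out_ch`, and stride=2 here replaces pooling:

```python
import torch.nn as nn

def depthwise_separable(in_ch, out_ch, stride=1):
    """Depthwise 3x3 (groups=in_ch, one filter per channel) + pointwise 1x1."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                  groups=in_ch, bias=False),          # depthwise: spatial filtering only
        nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1, bias=False),      # pointwise: combine channels
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )
```

Roughly, a standard 3x3 convolution costs about in_ch * out_ch * 9 multiply-accumulates per output pixel, while the separable version costs about in_ch * 9 + in_ch * out_ch, which is where the savings come from.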

 

v2

1. Introduces a shortcut (residual) structure, as in ResNet.

2. Before the depthwise convolution, a 1x1 convolution increases the number of channels, expanding the feature map. This is the inverted residual block: an ordinary residual bottleneck has many channels at both ends and few in the middle (hourglass shape), while the inverted residual block has few channels at both ends and more in the middle (spindle shape). (Why the spindle shape, and why not just use the ResNet bottleneck? Because the parameter count is already very small, spending a few extra parameters in the middle buys higher accuracy.)

3. After the pointwise (projection) convolution, ReLU is dropped in favour of a linear activation, to keep ReLU from destroying the features. (Why does ReLU damage them while a linear layer does not? ReLU zeros all negative values, which loses information when the feature map has few channels.) See the sketch below.
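A sketch of the inverted residual block described in points 1-3 (PyTorch assumed; the expansion factor of 6 is illustrative): expand with 1x1, filter with depthwise 3x3, project back with a 1x1 whose output stays linear, and use the shortcut only when input and output shapes match:

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Narrow -> wide -> narrow ('spindle') block with a linear projection layer."""
    def __init__(self, in_ch, out_ch, stride=1, expand=6):
        super().__init__()
        mid = in_ch * expand
        self.use_shortcut = (stride == 1 and in_ch == out_ch)
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False),                 # 1x1 expansion
            nn.BatchNorm2d(mid), nn.ReLU6(inplace=True),
            nn.Conv2d(mid, mid, 3, stride=stride, padding=1,
                      groups=mid, bias=False),                    # depthwise 3x3
            nn.BatchNorm2d(mid), nn.ReLU6(inplace=True),
            nn.Conv2d(mid, out_ch, 1, bias=False),                # 1x1 projection
            nn.BatchNorm2d(out_ch),                               # linear: no ReLU here
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_shortcut else out
```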

 

Reference blog post: https://www.jianshu.com/p/854cb5857070

 

SHUFFLENET

Group convolutions plus a channel shuffle that exchanges channels between the different groups, so information can flow across groups. (Pros and cons of ShuffleNet compared with ResNet?)
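A sketch of the channel shuffle operation (PyTorch assumed). Grouped convolutions alone keep each group isolated, so the channels are reshaped and transposed so that the next grouped convolution sees channels from every group:

```python
import torch

def channel_shuffle(x, groups):
    """Reorder channels: (N, C, H, W) -> (N, g, C/g, H, W) -> transpose -> flatten back."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()   # mix channels across groups
    return x.view(n, c, h, w)
```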

 

 

 
