ResNeXt: Aggregated Residual Transformations for Deep Neural Networks

The main motivation behind ResNeXt is this: the traditional way to improve a model's accuracy is to make the network deeper or wider, but as the number of hyper-parameters grows (number of channels, filter sizes, etc.), the difficulty and cost of network design and computation also grow. The ResNeXt structure proposed here can improve accuracy without increasing parameter complexity, and it also reduces the number of hyper-parameters (thanks to the identical topology of its sub-modules, as discussed later).

The paper first mentions VGG, which is built mainly by stacking layers of the same shape, an idea that ResNet also borrowed. It then mentions the Inception family of networks, which, simply put, follow a split-transform-merge strategy. The Inception networks have a problem, however: their hyper-parameters are heavily tailored to specific settings, so many of them need to be re-tuned when the network is applied to other datasets, which limits their scalability.

Now the focus: the ResNeXt network proposed in this paper combines VGG's idea of stacking identical blocks with Inception's split-transform-merge strategy, yet remains easy to extend. It can be viewed as increasing accuracy without substantially changing (or even while reducing) the complexity of the model. The paper introduces the term cardinality, defined in the original as "the size of the set of transformations". In Fig. 1 below, the block on the right has cardinality = 32, and every aggregated path shares the same topology (this is also what distinguishes it from Inception and reduces the design burden). A minimal sketch of such a block is given after this paragraph.
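As a rough illustration (not the authors' code), here is a minimal PyTorch sketch of a ResNeXt bottleneck block that implements cardinality via a grouped 3x3 convolution. The channel sizes follow the C = 32, d = 4 setting from the paper; the module name and layer choices are my own assumptions for illustration.

```python
import torch
import torch.nn as nn

class ResNeXtBottleneck(nn.Module):
    """Minimal ResNeXt block: 1x1 reduce -> grouped 3x3 (cardinality paths) -> 1x1 expand."""
    def __init__(self, in_channels=256, bottleneck_width=4, cardinality=32):
        super().__init__()
        mid = cardinality * bottleneck_width  # 32 * 4 = 128 channels
        self.conv_reduce = nn.Conv2d(in_channels, mid, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid)
        # groups=cardinality splits the transformation into 32 parallel paths of width 4
        self.conv_group = nn.Conv2d(mid, mid, kernel_size=3, padding=1,
                                    groups=cardinality, bias=False)
        self.bn2 = nn.BatchNorm2d(mid)
        self.conv_expand = nn.Conv2d(mid, in_channels, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = x
        out = self.relu(self.bn1(self.conv_reduce(x)))
        out = self.relu(self.bn2(self.conv_group(out)))
        out = self.bn3(self.conv_expand(out))
        return self.relu(out + residual)  # aggregated transformations plus identity shortcut

# quick shape check
block = ResNeXtBottleneck()
print(block(torch.randn(1, 256, 56, 56)).shape)  # torch.Size([1, 256, 56, 56])
```

The grouped-convolution form is equivalent to summing 32 identical paths, which is why cardinality can be raised without changing the interface of the block.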

The point to take away is that increasing cardinality is more effective than increasing depth or width.

Of course, the paper gives data to demonstrate the superiority of ResNeXt; to quote the original: "In particular, a 101-layer ResNeXt is able to achieve better accuracy than ResNet-200 but has only 50% complexity."

Table 1 lists the internal structures of ResNet-50 and ResNeXt-50; the last two rows also show that there is little difference in parameter complexity between the two. The rough calculation below illustrates why.
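As a quick sanity check, here is a small Python calculation comparing the parameters of one ResNet bottleneck block and one ResNeXt block with C = 32, d = 4 at the 256-channel stage (the same comparison the paper makes; the script itself is just an illustration and ignores BatchNorm and biases).

```python
# One ResNet-50 bottleneck block: 256 -> 64 -> 64 -> 256
resnet_block = 256 * 64 + 3 * 3 * 64 * 64 + 64 * 256          # ~70k parameters

# One ResNeXt block with cardinality C = 32 and bottleneck width d = 4:
# 32 parallel paths of 256 -> 4 -> 4 -> 256, i.e. C * (256*d + 3*3*d*d + d*256)
C, d = 32, 4
resnext_block = C * (256 * d + 3 * 3 * d * d + d * 256)       # ~70k parameters

print(resnet_block, resnext_block)  # 69632 70144 -- nearly identical complexity
```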
 
Next, the author describes the proposed block in more detail, starting from the simplest case: a fully connected layer (inner product). We know that a fully connected neuron (ignoring the bias) computes nothing more than this formula:

$$\sum_{i=1}^{D} w_i x_i$$

where $x = [x_1, x_2, \ldots, x_D]$ is a D-channel input vector and $w_i$ is the weight for the i-th channel.
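The paper then recasts this inner product as a split-transform-merge operation: split x into individual channels, transform each one by scaling it with its weight, and aggregate the results by summing. A tiny NumPy sketch of this view (purely illustrative) follows:

```python
import numpy as np

D = 8
x = np.random.randn(D)   # D-channel input vector
w = np.random.randn(D)   # weights of a single neuron

# Split-transform-merge view of the inner product:
# split x into D single-channel pieces, transform each by scaling with w[i], then merge by summing.
aggregated = sum(w[i] * x[i] for i in range(D))

print(np.allclose(aggregated, np.dot(w, x)))  # True: the two views are identical
```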
Origin www.cnblogs.com/ziwh666/p/12483765.html