Deep Learning Object Detection Series: YOLOv3

1. Gossip

          The author says that the yolov3 paper is closer to a technical report, because he had a lot of other things to deal with at the time. So yolov3 is essentially the earlier yolo with existing advanced techniques integrated as improvements. Judging by industry feedback, though, it is still very good. Some people casually do what others cannot manage in a lifetime. But the greater significance of this paper is that it makes us stop and think about "What This All Means". In other words, what will this work bring to human society? Throughout history, the masters invented the wheels for the cart, and the rest of us are responsible for pushing it forward. But what if there is a big fire pit ahead? Do you keep pushing? The master has made it clear: I quit! So the paper closes with "We owe the world that much." Perhaps none of us should selfishly treat technology as just a tool for making a living. Just as, during Japan's war of aggression against China, a person of conscience could not say, "I betrayed my country and my people in order to support my family." I don't know what plans the hands (perhaps more than one pair) that exist beyond humanity have for us. I have also given up on what is written in those books about so-called civilization and development (just so that people can spend more time being foolish?). Many factors may be beyond our control, but for my part, the one thing I will do is refuse to become a tool of the unrighteous. This is the last thing a helpless nobody can hold on to. A tribute to everyone who sticks to their conscience!

2. yolov3

              For yolov3, the two most important points are darknet53 and FPN.

        1. darknet53

                    Darknet53 is a network structure designed on the basis of darknet19. Looking at the network structure, darknet53 is essentially a combination of darknet19 and ResNet: it keeps darknet19's alternating 1×1 and 3×3 convolutions and adds ResNet-style shortcut connections.
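
To make the "darknet19 plus ResNet" idea concrete, here is a minimal sketch in plain Python. The stage layout (1, 2, 8, 8, 4 residual blocks) follows the Darknet-53 table in the yolov3 paper; the 416×416 input size and the helper names are assumptions for illustration, and no real convolution is performed.

```python
# Residual blocks per stage, as listed in the Darknet-53 table of the yolov3 paper.
STAGE_BLOCKS = [1, 2, 8, 8, 4]


def count_conv_layers(stage_blocks):
    """Count convolutional layers: one stem conv, then per stage one
    stride-2 downsampling conv plus two convs (1x1 and 3x3) per residual block."""
    total = 1  # stem 3x3 convolution
    for blocks in stage_blocks:
        total += 1           # stride-2 convolution that halves the resolution
        total += 2 * blocks  # each residual block: 1x1 conv followed by 3x3 conv
    return total


def residual(x, f):
    """ResNet-style shortcut: output = input + f(input), element-wise."""
    return [xi + fi for xi, fi in zip(x, f(x))]


def stage_resolutions(input_size, stage_blocks):
    """Spatial size after each stage; every stage halves the resolution once."""
    sizes = []
    size = input_size
    for _ in stage_blocks:
        size //= 2
        sizes.append(size)
    return sizes


# Convolutions plus the final fully connected layer.
print(count_conv_layers(STAGE_BLOCKS) + 1)
# Assumed 416x416 detection input (hypothetical choice for this sketch).
print(stage_resolutions(416, STAGE_BLOCKS))
# Toy "layer" that halves its input, wrapped in a shortcut connection.
print(residual([1.0, 2.0], lambda v: [0.5 * t for t in v]))
```

The layer count is where the "53" in the name comes from, and the last three resolutions are the scales at which yolov3 makes predictions.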

         

            The paper's comparison shows the effect of darknet53. Compared with darknet19, it is considerably slower, but top-1 accuracy rises by about 3 percentage points. This is a great achievement, and its speed is still well above what real-time detection needs. Compared with ResNet, the accuracy of the two is about the same, but darknet53 is roughly twice as fast (the paper makes this comparison against ResNet-152). So from the perspective of a usable model, this is undoubtedly an excellent network design.

 

                           

       2. FPN

                    FPN is a separate paper. The network design here simply borrows the idea of FPN, so let's go over the relevant background. Recall that YOLOV2 has a so-called passthrough layer. This layer concatenates the 26*26 feature map from before the last pooling with the final 13*13 output feature map, and then outputs the result. But this kind of concatenation is limited and crude, and the resulting feature map is still far removed from the input image. FPN takes this idea of fusing high-level and low-level information to its extreme. FPN up-samples the high-level information and then merges it with the low-level features. The merged feature keeps the fine spatial detail of the low level while gaining the semantic information of the high level. Each of the last three scales then produces its own output.
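
The up-sample-then-merge step can be sketched in plain Python with nested lists standing in for tensors. This is only an illustration under stated assumptions: the channel counts (4 and 8) are toy numbers, the helper names are hypothetical, and the merge is done by channel concatenation as the yolov3 paper describes (the original FPN paper adds the maps element-wise instead).

```python
def make_map(channels, size, value=0.0):
    """Build a feature map as channels x size x size nested lists."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


def upsample2x(feature):
    """Nearest-neighbour 2x upsampling: repeat every pixel along both axes."""
    out = []
    for channel in feature:
        new_channel = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            new_channel.append(wide)
            new_channel.append(list(wide))             # repeat the row
        out.append(new_channel)
    return out


def merge(high, low):
    """Upsample the high-level map and concatenate it with the low-level
    map along the channel axis."""
    up = upsample2x(high)
    assert len(up[0]) == len(low[0]), "spatial sizes must match"
    return up + low


def shape(feature):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(feature), len(feature[0]), len(feature[0][0]))


high = make_map(4, 13, 1.0)  # coarse, semantically strong map (toy channel count)
low = make_map(8, 26, 2.0)   # finer map from earlier in the network
merged = merge(high, low)
print(shape(merged))
```

The merged map has the 26*26 resolution of the low-level features and the channels of both inputs, which is the fusion described above.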

                    yolov3 first obtains 9 anchor sizes by clustering the bounding-box sizes: (10×13), (16×30), (33×23), (30×61), (62×45), (59×119), (116×90), (156×198), (373×326). The anchors are sorted by size and split across the three output scales, 3 per scale: the higher-level (coarser) feature map gets the larger anchors.
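
The assignment of anchors to scales can be sketched as follows. The anchor list is the one quoted above; the strides (8, 16, 32), the 416×416 input, the 80 classes and the function names are assumptions made for this sketch.

```python
# The 9 anchor sizes (width, height) quoted in the text.
ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
           (59, 119), (116, 90), (156, 198), (373, 326)]

# Assumed strides of the three output scales, finest to coarsest.
STRIDES = [8, 16, 32]


def assign_anchors(anchors, strides, per_scale=3):
    """Sort anchors by area and give each scale its own group:
    small anchors to the fine map, large anchors to the coarse map."""
    ordered = sorted(anchors, key=lambda wh: wh[0] * wh[1])
    return {stride: ordered[i * per_scale:(i + 1) * per_scale]
            for i, stride in enumerate(strides)}


def output_shape(input_size, stride, per_scale=3, num_classes=80):
    """Grid size and channel count of one output scale: each anchor predicts
    4 box offsets, 1 objectness score and num_classes class scores."""
    grid = input_size // stride
    return (grid, grid, per_scale * (4 + 1 + num_classes))


groups = assign_anchors(ANCHORS, STRIDES)
for stride in STRIDES:
    print(stride, output_shape(416, stride), groups[stride])
```

With these assumptions the coarsest 13*13 map carries the three largest anchors and the finest 52*52 map carries the three smallest.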

             

 

     3. Performance

                yolov3 makes few modifications overall, but its current mainstream position in the market is enough to show how good the model is. The measured results show that, while maintaining high accuracy, it leaves other networks far behind in speed. Of course, yolov3 is not perfect. On the metric that averages over IOU thresholds from 0.5 to 0.95, its accuracy drops. But others have pointed out that the human eye can hardly tell an IOU of 0.3 from an IOU of 0.5. If nobody can see the difference, then blindly pursuing a high IOU is not very meaningful.
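
For readers unfamiliar with the metric, here is a small sketch of IOU and of what a stricter threshold means. The (x1, y1, x2, y2) box format, the example boxes and the function names are assumptions for illustration; the threshold list mirrors the 0.5 to 0.95 range mentioned above in steps of 0.05.

```python
def iou(box_a, box_b):
    """IOU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Thresholds 0.50, 0.55, ..., 0.95.
THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(box_pred, box_true):
    """Thresholds at which the prediction still counts as a correct match."""
    score = iou(box_pred, box_true)
    return [t for t in THRESHOLDS if score >= t]


truth = (0, 0, 10, 10)
pred = (0, 0, 10, 5)  # covers exactly half of the ground-truth box
print(iou(pred, truth))
print(thresholds_passed(pred, truth))
```

A box like `pred` counts as correct at the 0.5 threshold but at none of the stricter ones, which is why a detector that localizes loosely scores well at 0.5 and worse on the averaged metric.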

 

3. Summary

      This article introduces the changes in yoloV3 relative to v2: the redesigned darknet53 network and the FPN-inspired feature fusion. It is relatively simple to understand. Let's encourage each other. Like the author of V3, I also have a lot of annoying things to deal with, but I will not withdraw from the graphics world, nor from the AI circle. Of course, that is not because of how much I love AI, but because a nobody like me doesn't seem to be worthy of the word "quit".

      Here is Eminem's song Lose Yourself as mutual encouragement: cherish every opportunity that appears in front of you, look up at the stars and keep your feet on the ground...

                         Look, if you had one shot, or one opportunity

                               To seize everything you ever wanted, one moment

                                         Would you capture it or just let it slip?             

Eminem's Lose Yourself: the first rap song to win an Oscar, the theme song of the movie "8 Mile", and a Grammy winner for Best Rap Song

 

 

 

 

 

 

 


Origin blog.csdn.net/gaobing1993/article/details/108417974