YOLOv5/YOLOv8 improvement (CVPR 2023): add the ConvNeXt V2 backbone to effectively boost accuracy

Paper address: https://arxiv.org/pdf/2301.00808v1.pdf

The field of visual recognition underwent rapid modernization and performance improvements in the early 2020s, driven by improved architectures and better representation-learning frameworks. For example, modern ConvNets, represented by ConvNeXt [52], have shown strong performance in various scenarios. Although these models were originally designed for supervised learning with ImageNet labels, they can also benefit from self-supervised learning techniques such as masked autoencoders (MAE). However, the authors found that simply combining the two approaches yields subpar performance. The paper therefore proposes a fully convolutional masked autoencoder framework and a new Global Response Normalization (GRN) layer that can be added to the ConvNeXt architecture to enhance inter-channel feature competition. This co-design of self-supervised learning techniques and architectural improvements results in a new model family named ConvNeXt V2, which significantly improves the performance of pure ConvNets on various recognition benchmarks.
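The GRN layer summarized above performs three steps: global feature aggregation, divisive normalization across channels, and feature calibration with a residual connection. A minimal NumPy sketch of that math, assuming channels-last (N, H, W, C) tensors and the paper's zero initialization for the learnable gamma/beta (the function name and argument layout here are illustrative, not code from the post):

```python
import numpy as np

def grn(x, gamma, beta, eps=1e-6):
    """Global Response Normalization on an (N, H, W, C) array.

    1) Aggregate: per-channel global L2 norm over the spatial dims.
    2) Normalize: divide each channel's norm by the mean norm across channels.
    3) Calibrate: rescale the input, with learnable affine terms and a residual.
    """
    gx = np.sqrt((x ** 2).sum(axis=(1, 2), keepdims=True))   # (N, 1, 1, C)
    nx = gx / (gx.mean(axis=-1, keepdims=True) + eps)        # (N, 1, 1, C)
    return gamma * (x * nx) + beta + x

# With gamma = beta = 0 (the paper's initialization), GRN starts as the identity,
# so inserting it does not perturb a pretrained network at step zero.
x = np.random.randn(2, 4, 4, 8)
assert np.allclose(grn(x, gamma=0.0, beta=0.0), x)
```

Because the layer is initialized to the identity, the competition between channels is learned gradually during training rather than imposed from the start.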


YOLOv5 improvements:

The modifications below are based on the YOLOv5 v7.0 release.

1. Create a new convnextv2.py file and add the following code:

# Copyright (c) Meta Platforms, Inc. and affiliates.
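The original post's code listing is cut off after the copyright header. As a stand-in, here is a hedged sketch of the two pieces convnextv2.py needs at minimum, the GRN layer and the ConvNeXt V2 block, following the structure described in the paper (depthwise 7x7 conv, LayerNorm, pointwise MLP with GRN between expansion and projection). Class names and defaults are assumptions, not the post's exact listing, and extras such as drop path are omitted:

```python
import torch
import torch.nn as nn

class GRN(nn.Module):
    """Global Response Normalization (channels-last), as in ConvNeXt V2."""
    def __init__(self, dim):
        super().__init__()
        # Zero-initialized affine parameters: GRN starts as the identity.
        self.gamma = nn.Parameter(torch.zeros(1, 1, 1, dim))
        self.beta = nn.Parameter(torch.zeros(1, 1, 1, dim))

    def forward(self, x):                      # x: (N, H, W, C)
        gx = torch.norm(x, p=2, dim=(1, 2), keepdim=True)   # per-channel L2 norm
        nx = gx / (gx.mean(dim=-1, keepdim=True) + 1e-6)    # divisive normalization
        return self.gamma * (x * nx) + self.beta + x

class ConvNeXtV2Block(nn.Module):
    """One ConvNeXt V2 block: dwconv -> LayerNorm -> MLP with GRN -> residual."""
    def __init__(self, dim):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim, eps=1e-6)
        self.pwconv1 = nn.Linear(dim, 4 * dim)  # pointwise expansion
        self.act = nn.GELU()
        self.grn = GRN(4 * dim)                 # GRN in the expanded MLP
        self.pwconv2 = nn.Linear(4 * dim, dim)  # pointwise projection

    def forward(self, x):                      # x: (N, C, H, W)
        shortcut = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)              # to channels-last for LN/MLP
        x = self.pwconv2(self.grn(self.act(self.pwconv1(self.norm(x)))))
        x = x.permute(0, 3, 1, 2)              # back to channels-first
        return shortcut + x
```

To actually use this as a YOLOv5 backbone module, the class would also have to be registered in models/yolo.py's parse_model and referenced from a model YAML; those steps are not reproduced here.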


Original post: blog.csdn.net/m0_51530640/article/details/131126340