YOLOv5 algorithm for identifying and detecting garbage on the river surface

The river surface garbage identification and detection system is built on the YOLOv5 model and computer vision technology. Cameras installed along the river monitor the water surface in real time, automatically identify and record floating garbage, and promptly notify the environmental protection department for processing. In recent years, object detection algorithms have made great breakthroughs. The popular algorithms can be divided into two categories. One is the R-CNN family based on region proposals (R-CNN, Fast R-CNN, Faster R-CNN). These are two-stage detectors: they first generate region proposals, either with a heuristic method (selective search) or with a CNN (the RPN), and then perform classification and bounding-box regression on each proposal. The other category is YOLO.

Before introducing the YOLO algorithm, it is helpful to first introduce the sliding-window technique. The idea of a sliding-window object detector is very simple: it transforms the detection problem into an image classification problem. The basic principle is to slide windows of different sizes and aspect ratios over the whole picture with a certain step size, and then run an image classifier on the region under each window, so that detection over the whole picture is achieved. In contrast, the YOLO algorithm uses a single CNN model to achieve end-to-end object detection. The entire system is shown in Figure 5: first the input image is resized to 448x448, then it is fed into the CNN network, and finally the network's predictions are post-processed to obtain the detected targets. Compared with the R-CNN algorithms, YOLO is a unified framework, it is faster, and its training process is also end-to-end.
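The sliding-window idea above can be sketched in a few lines. This is a minimal illustration, not YOLO or any real detector: the window simply enumerates candidate regions, and in a real system each region would be passed to an image classifier.

```java
import java.util.ArrayList;
import java.util.List;

class SlidingWindowDemo {
    // A candidate region: top-left corner plus window size.
    static class Window {
        final int x, y, w, h;
        Window(int x, int y, int w, int h) { this.x = x; this.y = y; this.w = w; this.h = h; }
    }

    // Slide a w x h window over an imageWidth x imageHeight image with the
    // given stride, collecting every fully contained region.
    static List<Window> slide(int imageWidth, int imageHeight, int w, int h, int stride) {
        List<Window> windows = new ArrayList<>();
        for (int y = 0; y + h <= imageHeight; y += stride) {
            for (int x = 0; x + w <= imageWidth; x += stride) {
                windows.add(new Window(x, y, w, h));
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        // A 448x448 input scanned by a 64x64 window with stride 64 yields a
        // 7x7 grid of 49 candidate regions, each of which a sliding-window
        // detector would classify separately.
        System.out.println(slide(448, 448, 64, 64, 64).size()); // 49
    }
}
```

Because the classifier runs once per window (and per window size and aspect ratio), this approach is expensive, which is exactly the cost YOLO's single end-to-end CNN avoids.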

An object detection algorithm can usually be divided into four general modules: the input, the benchmark (backbone) network, the Neck network, and the Head output. The YOLOv5 algorithm has four versions: YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. This article focuses on YOLOv5s; the other versions deepen and widen the network on the basis of this version.

  • Input - The input represents the input image. The network's input image size is 608*608, and this stage usually includes an image preprocessing step that scales the input image to the network's input size and performs operations such as normalization. In the training phase, YOLOv5 uses Mosaic data augmentation to improve the training speed of the model and the accuracy of the network, and it introduces adaptive anchor box calculation and adaptive image scaling.
  • Benchmark network - The benchmark (backbone) network is usually a classifier network with excellent performance, and this module is used to extract general feature representations. YOLOv5 uses not only the CSPDarknet53 structure but also the Focus structure in its backbone.
  • Neck network - The Neck network usually sits between the backbone and the head, and it further improves the diversity and robustness of the features. YOLOv5 also uses the SPP module and the FPN+PAN module, although the implementation details differ somewhat.
  • Head output terminal - The Head completes the output of the detection results. The number of output branches differs between detection algorithms, but it usually includes a classification branch and a regression branch. YOLOv5 uses GIOU_Loss in place of the Smooth L1 loss function, further improving the detection accuracy of the algorithm.
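The GIoU mentioned for the Head can be computed for two axis-aligned boxes as below. This is a generic sketch of the formula GIoU = IoU - |C \ (A ∪ B)| / |C| (where C is the smallest box enclosing both A and B), not code taken from YOLOv5; the regression loss is then 1 - GIoU.

```java
class GIoU {
    // Boxes are [x1, y1, x2, y2] with x1 < x2 and y1 < y2.
    static double giou(double[] a, double[] b) {
        // Intersection width/height, clamped at zero for disjoint boxes.
        double ix = Math.max(0, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
        double iy = Math.max(0, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
        double inter = ix * iy;
        double areaA = (a[2] - a[0]) * (a[3] - a[1]);
        double areaB = (b[2] - b[0]) * (b[3] - b[1]);
        double union = areaA + areaB - inter;
        // Smallest enclosing box C of A and B.
        double cw = Math.max(a[2], b[2]) - Math.min(a[0], b[0]);
        double ch = Math.max(a[3], b[3]) - Math.min(a[1], b[1]);
        double c = cw * ch;
        // GIoU = IoU - (area of C not covered by A ∪ B) / area of C.
        return inter / union - (c - union) / c;
    }

    public static void main(String[] args) {
        double[] pred = {0, 0, 2, 2};
        double[] gt = {1, 1, 3, 3};
        // Partially overlapping boxes: IoU = 1/7, enclosing box penalty = 2/9.
        System.out.println(giou(pred, gt));
    }
}
```

Unlike plain IoU, GIoU stays informative (and negative) even when the boxes do not overlap, which gives the regression branch a gradient for badly misplaced predictions.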

The Adapter interface defines the following methods:

public abstract void registerDataSetObserver (DataSetObserver observer)

Adapter represents a data source. This data source may change, for example by adding, deleting, or modifying data. When the data changes, the Adapter must notify the corresponding AdapterView so that it can update accordingly. To realize this, the Adapter uses the observer pattern: the Adapter itself is the observable (the subject), and the AdapterView is the observer. The observer is registered with the Adapter by calling the registerDataSetObserver method.

public abstract void unregisterDataSetObserver (DataSetObserver observer)

Unregister the observer by calling the unregisterDataSetObserver method.
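The register/unregister mechanism described above can be sketched in plain Java. DataSetObserver and SimpleAdapter here are simplified stand-ins for the Android classes (the real DataSetObserver is an abstract class with more callbacks), kept minimal so the example is self-contained:

```java
import java.util.ArrayList;
import java.util.List;

class ObserverDemo {
    // Simplified stand-in for android.database.DataSetObserver.
    interface DataSetObserver {
        void onChanged();
    }

    // The observable side: an Adapter-like data source that notifies
    // registered observers whenever its data changes.
    static class SimpleAdapter {
        private final List<DataSetObserver> observers = new ArrayList<>();
        private final List<String> data = new ArrayList<>();

        void registerDataSetObserver(DataSetObserver observer) {
            observers.add(observer);
        }

        void unregisterDataSetObserver(DataSetObserver observer) {
            observers.remove(observer);
        }

        void add(String item) {
            data.add(item);
            // Data changed: notify every registered observer, just as an
            // AdapterView is notified so it can refresh its UI.
            for (DataSetObserver o : observers) o.onChanged();
        }
    }

    public static void main(String[] args) {
        SimpleAdapter adapter = new SimpleAdapter();
        final int[] refreshCount = {0};
        DataSetObserver view = () -> refreshCount[0]++;

        adapter.registerDataSetObserver(view);
        adapter.add("bottle");       // observer is notified
        adapter.unregisterDataSetObserver(view);
        adapter.add("plastic bag");  // no longer notified
        System.out.println(refreshCount[0]); // 1
    }
}
```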

public abstract int getCount ()

Returns the number of data items in the Adapter.

public abstract Object getItem (int position)

The data in the Adapter is similar to an array: each item corresponds to one piece of data, and each piece of data has an index, its position. The corresponding data item in the Adapter can be obtained from the position.

public abstract long getItemId (int position)

Gets the id of the data item at the specified position; usually the position itself is used as the id. In the Adapter, position is used more frequently than id.

public abstract boolean hasStableIds ()

hasStableIds indicates whether the ids of the data items remain unchanged when the data source changes. Returning true means the ids remain unchanged; returning false means they may change. The hasStableIds implementations in the Adapter subclasses (both direct and indirect) provided by Android all return false.

public abstract View getView (int position, View convertView, ViewGroup parent)

getView is a very important method in Adapter: it creates the corresponding UI item for the AdapterView according to the index (position) of the data item.
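Putting the data-access methods together, a minimal array-backed implementation might look like the sketch below. This is plain Java, not the real Android API: the real getView returns a View built from convertView and parent, which is replaced here by a String so the example stays self-contained.

```java
class ArrayAdapterSketch {
    private final java.util.List<String> items;

    ArrayAdapterSketch(java.util.List<String> items) { this.items = items; }

    // Number of data items backing the AdapterView.
    public int getCount() { return items.size(); }

    // The data item at the given index.
    public Object getItem(int position) { return items.get(position); }

    // As is common, the position itself serves as the id.
    public long getItemId(int position) { return position; }

    // Ids here are just positions, so they change whenever the data changes.
    public boolean hasStableIds() { return false; }

    // Stand-in for getView: build the "UI item" for one data index.
    public String getView(int position) {
        return "item[" + position + "] = " + getItem(position);
    }

    public static void main(String[] args) {
        ArrayAdapterSketch adapter =
            new ArrayAdapterSketch(java.util.Arrays.asList("bottle", "bag", "can"));
        // An AdapterView iterates positions 0..getCount()-1 and asks the
        // adapter to build one UI item per position.
        for (int i = 0; i < adapter.getCount(); i++) {
            System.out.println(adapter.getView(i));
        }
    }
}
```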

Origin blog.csdn.net/KO_159/article/details/131278491