1. Introduction to PyTorch Examples (1): Image Classification

1) Use PyTorch to build LeNet, the most basic network, for image classification. As the saying goes, the sparrow may be small, but it has all the vital organs: the example is tiny yet complete, which makes it well suited for getting started. LeNet was proposed by Yann LeCun et al. in "Handwritten Digit Recognition with a Back-Propagation Network" in 1990. It is the "Hello World" of convolutional neural networks.

2) For an explanation, refer to the article "Introduction to PyTorch Examples (1): Image Classification"; for executable code, see "Complete Code Link".
3) We use CIFAR-10 as the dataset. It contains 60,000 images in 10 classes, each 32x32 pixels, split into 50,000 training images and 10,000 test images.
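
To make the network shape concrete, here is a minimal sketch of a LeNet-style model sized for 3x32x32 CIFAR-10 images. It is not the code from the linked article; the layer widths (6 and 16 channels, 120 and 84 hidden units) follow the classic LeNet layout, and the class name `LeNet` and the random input batch are my own assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LeNet(nn.Module):
    """LeNet-style CNN for 3x32x32 inputs (illustrative sketch only)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)    # 3x32x32 -> 6x28x28
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)   # 6x14x14 -> 16x10x10
        self.pool = nn.MaxPool2d(2, 2)                 # halves height and width
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # -> 6x14x14
        x = self.pool(F.relu(self.conv2(x)))  # -> 16x5x5
        x = torch.flatten(x, 1)               # keep the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                    # raw class scores (logits)


model = LeNet()
batch = torch.randn(4, 3, 32, 32)  # stand-in for a batch of CIFAR-10 images
logits = model(batch)
print(logits.shape)
```

For training, the logits would typically be passed to `nn.CrossEntropyLoss`, which applies the softmax internally.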

2. PyTorch Hands-on Practice: PyTorch Model Classification Network

1) For the explanation and code, mainly refer to the Zhihu article "Pytorch Model Subdivision Network". Its code is well organized and easy to read, but it does not run as published. I debugged and modified it until it ran end to end; beginners can refer to "Source Code of Vehicle Subdivision Classification".
2) The project classifies vehicles with a ResNet50 network and is suitable for learning the basics.
3) Download the data from the link "10 Types of Vehicle Type Recognition Dataset". This is a public vehicle dataset for training vehicle recognition and vehicle classification models. It provides 2,000 high-resolution images annotated for vehicle scene classification across 10 types of vehicle. Labels: bus, taxi, truck, family sedan, minibus, jeep, SUV, heavy truck, racing car, fire engine.

3. Common Convolutional Neural Networks

1) LeNet: "Lenet of Convolutional Neural Network" (Zhihu, zhihu.com)

2) AlexNet: "Alexnet of Convolutional Neural Network" (Zhihu, zhihu.com)

3) VGG: "VGG of Convolutional Neural Network" (Zhihu, zhihu.com)

4. Object Detection

1) "Understanding Faster RCNN in One Article"

Building on R-CNN and Fast RCNN, Ross B. Girshick and his co-authors (Shaoqing Ren, Kaiming He, and Jian Sun) proposed Faster RCNN, first published in 2015. Structurally, Faster RCNN integrates feature extraction, proposal extraction, bounding box regression (rect refine), and classification into a single network, which greatly improves overall performance, especially detection speed.

Figure 1 Basic structure of Faster RCNN (from the original paper)

According to the author, as shown in Figure 1, Faster RCNN can be divided into 4 main parts:

Conv layers. As a CNN-based object detection method, Faster RCNN first uses a set of basic conv+relu+pooling layers to extract feature maps from the image. These feature maps are shared by the subsequent RPN layer and fully connected layers.
Region Proposal Networks. The RPN generates region proposals. It uses softmax to judge whether each anchor is positive or negative, then applies bounding box regression to refine the anchors into accurate proposals.
Roi Pooling. This layer takes the feature maps and the proposals as input, combines them to extract a feature map for each proposal, and sends the result to the subsequent fully connected layers to determine the object class.
Classification. The proposal feature maps are used to compute the class of each proposal, and bounding box regression is applied once more to obtain the final, precise position of the detection box.
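
Bounding box regression appears twice in the parts above (in the RPN and in the final classification stage). The sketch below shows how predicted offsets could be applied to an anchor, assuming the (tx, ty, tw, th) parameterization from the Faster R-CNN paper: the centre shifts in proportion to the anchor size and the width and height scale exponentially. The function name `decode_box` and the example numbers are hypothetical.

```python
import math


def decode_box(anchor, deltas):
    """Apply (tx, ty, tw, th) offsets to an anchor given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = anchor
    tx, ty, tw, th = deltas
    aw, ah = ax2 - ax1, ay2 - ay1                # anchor width and height
    acx, acy = ax1 + 0.5 * aw, ay1 + 0.5 * ah    # anchor centre
    cx = acx + tx * aw                           # shift centre, scaled by anchor size
    cy = acy + ty * ah
    w = aw * math.exp(tw)                        # rescale width and height
    h = ah * math.exp(th)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


anchor = (0.0, 0.0, 100.0, 100.0)
unchanged = decode_box(anchor, (0.0, 0.0, 0.0, 0.0))  # zero offsets keep the anchor
shifted = decode_box(anchor, (0.1, 0.0, 0.0, 0.0))    # moves right by 10% of the width
print(unchanged)
print(shifted)
```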

Figure 2 shows the network structure of faster_rcnn_test.pt for the VGG16 model in the Python version. For an input image of arbitrary size PxQ, the network works as follows:

The image is first scaled to a fixed size MxN, and the MxN image is then fed into the network;
The Conv layers contain 13 conv layers + 13 relu layers + 4 pooling layers;
The RPN first applies a 3x3 convolution, then generates positive anchors and the corresponding bounding box regression offsets in two separate branches, and finally computes the proposals;
The Roi Pooling layer uses the proposals to extract proposal features from the feature maps and sends them to the subsequent fully connected and softmax layers for classification (that is, deciding what object each proposal contains).
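
The arithmetic behind these steps can be sketched as follows. I assume that the 13 conv layers preserve spatial size (3x3 kernels with padding 1, as in VGG16), that each of the 4 pooling layers halves height and width, and that the RPN places 9 anchors (3 scales x 3 aspect ratios, as in the Faster R-CNN paper) at every feature map location. The image size 800x608 is a hypothetical value chosen to be divisible by 16 so that rounding does not matter.

```python
def feature_map_size(height, width, num_pools=4):
    """Spatial size after `num_pools` 2x2, stride-2 poolings (floor division)."""
    for _ in range(num_pools):
        height, width = height // 2, width // 2
    return height, width


M, N = 800, 608            # hypothetical resized image size (height, width)
ANCHORS_PER_LOCATION = 9   # assumed: 3 scales x 3 aspect ratios

fh, fw = feature_map_size(M, N)
total_anchors = fh * fw * ANCHORS_PER_LOCATION
print(fh, fw, total_anchors)  # 50 38 17100
```

So the shared feature map is 1/16 of the input in each dimension, and the RPN has to score several thousand anchors per image before proposals are selected.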

Figure 2 faster_rcnn_test.pt network structure (pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt)

2) "Walking Through the Official PyTorch FasterRCNN Code"

3) Non-Maximum Suppression (NMS):

NMS is a common algorithm for removing redundant detection boxes in object detection. Simply put, only the best result of a given class is kept in a given area, and the remaining results are suppressed. Within a single class, the box with the highest confidence is selected first; any nearby box that overlaps it heavily but has lower confidence is then suppressed (deleted). Reference articles: "Fundamentals of Deep Learning (2)---NMS", "NMS Algorithm Implementation", "NMS Summary Theory Derivation"
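
The procedure described above can be sketched in plain Python for a single class. This is a minimal greedy version, not the implementation from the referenced articles; boxes are assumed to be (x1, y1, x2, y2) tuples, overlap is measured by intersection over union (IoU), and the 0.5 threshold is an arbitrary example value.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest remaining confidence
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # [0, 2]: the second box overlaps the first (IoU about 0.68) and is dropped
```

In practice, `torchvision.ops.nms(boxes, scores, iou_threshold)` provides the same operation on tensors; for multi-class detection, NMS is run separately per class.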

4) "Evaluation Indicators for Target Detection Tasks and Multi-Classification Tasks"

Another very useful reference article is "Calculation Method of Target Detection mAP and ROC Index".

5) "[Object Detection (Faster RCNN)] Principles | PyTorch Official Source Code Explanation | VGG | ResNet | ResNet50 FPN | ReXNets"

6) Chen Yun's "Learning Faster R-CNN from the Perspective of Programming Implementation (with Minimalist Implementation)"

7) A very good explanation of the principles in English: "Faster R-CNN: Down the rabbit hole of modern object detection"

8) Wz's bilibili collection and video list: "Deep Learning-Target Detection"

9) bilibili video explanation: "1.1 Faster RCNN Theory Collection"

10) bilibili video explanation: "1.2 Faster RCNN source code analysis (pytorch)"

Installing pycocotools as described by the blogger did not succeed in my environment:

~~pycocotools (Linux: pip install pycocotools; Windows: pip install pycocotools-windows (no need to install VS))~~

In the end, it installed successfully in an Anaconda environment: conda install pycocotools

The corresponding GitHub source code: "faster_rcnn's github source code and explanation"

11）

Summary 3-Getting started with common deep learning networks: pytorch examples