[Edge detection] CASENet: a self-made PPT walkthrough

The following are the slides from my group-meeting report, posted here so we can learn together.

I will give a brief introduction in three parts. The first part describes the problem, both intuitively and at the mathematical level; the second part introduces the multi-label loss function proposed by the authors; and the last part covers the core network architecture. There, two basic frameworks are introduced first, and we then analyze how the authors arrived at the framework of this paper.

 

This slide shows semantic edge detection results on a street-view image. The upper left is the original image, the upper right is the ground truth, and the last panel is the result of this paper.

First, compare with traditional edge detection: the difference is essentially binary versus multi-label. Semantic edge detection not only detects edges but also assigns one or more semantic categories to each edge pixel.

The "one or more" here is exactly what makes this a multi-label problem. Look at the picture on the left: different colors represent different categories, and the most conspicuous red represents the road. If you look carefully at the legend in the upper left corner, you can see some blended colors, for example where buildings and pedestrians meet; such a blended color marks a multi-label pixel, that is, the same pixel is assigned multiple semantic category labels at once.

Next, we introduce the problem at the mathematical level, focusing briefly on input and output. The input is an image; the output is a set of edge maps, where the k-th map gives, for each pixel, the probability that it is an edge of the k-th semantic category. The three main contributions of this paper are a multi-label learning framework, a new nested architecture, and a multi-label loss function. The loss function is introduced next.
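The input/output formulation above can be made concrete with a tiny shape sketch (the sizes `H`, `W`, `K` below are illustrative, not from the paper):

```python
import numpy as np

# Illustrative sizes, not from the paper: K categories, H x W image.
H, W, K = 4, 4, 19

# Input: an RGB image of shape (H, W, 3).
image = np.zeros((H, W, 3), dtype=np.float32)

# Output: K edge maps. edge_probs[k, i, j] is the probability that
# pixel (i, j) lies on an edge of semantic category k; a pixel may
# score high for several k at once (multi-label).
edge_probs = np.random.default_rng(0).random((K, H, W))

assert edge_probs.shape == (K, H, W)
assert np.all((edge_probs >= 0) & (edge_probs <= 1))
```

Note that the K maps are independent per category, unlike softmax semantic segmentation where the class probabilities of a pixel must sum to one.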

 

It can be regarded as a binary cross-entropy applied per category (binary cross-entropy is the loss function commonly used for binary classification problems), with the K results then summed. Binary classification only judges whether a pixel is an edge; the multi-label setting must also judge which categories the pixel belongs to. The authors decompose it into K binary problems: for class 1, class 2, up to class K, predict the probability that the pixel is an edge of that class, and then sum the K losses.

where β is the fraction of non-edge pixels in the image, used to counteract the heavy skew between edge and non-edge samples.
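The reweighted multi-label loss described above can be sketched as follows. This is my own minimal numpy version, not the authors' code; in particular, I compute β over the whole label tensor, which is one reasonable reading of "percentage of non-edge pixels in the image":

```python
import numpy as np

def multilabel_edge_loss(probs, labels, eps=1e-7):
    """Reweighted multi-label binary cross-entropy (a sketch, not the
    authors' implementation). probs and labels have shape (K, H, W);
    labels hold 0/1 ground-truth edges per category."""
    beta = 1.0 - labels.mean()  # fraction of non-edge pixels (assumption: over all K maps)
    p = np.clip(probs, eps, 1.0 - eps)
    # Edge pixels (rare) are weighted by beta, non-edge pixels by 1 - beta,
    # so the dominant non-edge class does not swamp the gradient.
    per_pixel = -(beta * labels * np.log(p)
                  + (1.0 - beta) * (1.0 - labels) * np.log(1.0 - p))
    return per_pixel.sum()  # summed over all K binary problems and all pixels
```

Since edges are sparse, β is close to 1, so the few positive (edge) pixels get a much larger weight than the many negatives.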

 

Next we move on to the core network architecture.

The first is the base network, built on the ResNet-101 backbone; let's see what changes were made to it.

Let's take a closer look at the basic architecture, mainly the purple classification module here: a 1 × 1 convolution followed by bilinear upsampling, after which the edge probability of a pixel belonging to the k-th class is computed by a sigmoid unit.
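The classification module (1 × 1 conv → bilinear upsampling → sigmoid) can be sketched in plain numpy. This is my own illustration under the assumption of align-corners-style bilinear interpolation; the actual implementation details are not specified in the slides:

```python
import numpy as np

def upsample_bilinear(x, factor):
    """Bilinear upsampling of x: (K, H, W) -> (K, H*factor, W*factor).
    Align-corners style: corner values are preserved (an assumption)."""
    K, H, W = x.shape
    ys = np.linspace(0.0, H - 1, H * factor)
    xs = np.linspace(0.0, W - 1, W * factor)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[None, :, None]
    wx = (xs - x0)[None, None, :]
    top = x[:, y0][:, :, x0] * (1 - wx) + x[:, y0][:, :, x1] * wx
    bot = x[:, y1][:, :, x0] * (1 - wx) + x[:, y1][:, :, x1] * wx
    return top * (1 - wy) + bot * wy

def classification_module(features, weights, factor):
    """Sketch of the purple classification module. features: (C, H, W)
    backbone output; weights: (K, C), i.e. a bias-free 1x1 convolution."""
    # A 1x1 conv is just a per-pixel linear map over channels.
    logits = np.einsum('kc,chw->khw', weights, features)
    logits = upsample_bilinear(logits, factor)   # back to input resolution
    return 1.0 / (1.0 + np.exp(-logits))          # sigmoid per category
```

With zero weights every logit is 0, so every category probability is 0.5, which is a quick sanity check on the shapes and the sigmoid.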

In this part, we introduce the deeply supervised nested architecture, and briefly mention the HED network, since this part of the paper mainly follows HED. However, HED only performs binary edge detection; it solves a binary problem, which must be extended to the multi-label problem of this paper. The purple classification module just described is attached to the output of each residual block, producing 5 side classification activation maps, and the 5 activation maps are then fused by sliced concatenation.

[Paper reading] (edge detection related) HED: Holistically-Nested Edge Detection — Clark-dj's blog, CSDN

The sliced-concatenation fusion is expressed as formula 2, and deep supervision here means computing 6 losses: one for each of the 5 side outputs plus one for the fused output.
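The sliced-concatenation fusion can be sketched as a per-category weighted combination. This is my own reading of the mechanism described above, not the authors' code: the channels are regrouped so that category k of the fused output only sees the k-th slice of every side map, which is equivalent to a K-grouped 1 × 1 convolution:

```python
import numpy as np

def sliced_concat_fuse(side_maps, fuse_weights):
    """Sliced-concatenation fusion (a sketch of the idea behind
    formula 2, not the authors' implementation). side_maps: list of S
    side activation maps, each (K, H, W); fuse_weights: (K, S),
    playing the role of a K-grouped 1x1 conv, so category k is fused
    only from the k-th slice of every side output."""
    stacked = np.stack(side_maps, axis=1)            # (K, S, H, W)
    return np.einsum('ks,kshw->khw', fuse_weights, stacked)
```

With uniform weights 1/S this reduces to averaging the side maps per category; training instead learns how much each side output should contribute to each category. Deep supervision then attaches the loss to each of the S side maps and to this fused map.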

 

Having reviewed the basic architecture and the deeply supervised nested architecture, we now analyze whether these two architectures are actually suitable for our task.

Now let's turn to the model of this paper and take a brief look at its modules.

 

 

Each module was introduced in detail above; now let's look at the difference between the gray feature-extraction module and the purple classification module.

If you need the PPT, send me a private message!


Origin blog.csdn.net/dujuancao11/article/details/123969195