[Deep Learning] CVPR2019-PAPER IDEAS

1. driving stereo: a large-scale dataset for stereo matching in autonomous driving scenarios

This author's work is mainly about building a depth dataset, and the paper is basically a write-up of how the dataset was made. The main point is that depth labels are hard to obtain, and using LiDAR directly is not accurate enough, so they combined other information to produce more accurate labels~ (First time I learned that the way you make labels can itself be a paper. I always thought it had to be images plus labels plus a training procedure to count as a paper... emmmm)

 

2. pose2seg

I did something similar before. Here they use pose features + pose keypoints first, and then obtain the segmentation result.

What I did before was pose keypoints + image to get the segmentation result.

 

3. cyclic guidance for weakly supervised joint detection and segmentation

Weakly supervised: only classification labels are available, yet both detection and segmentation are trained.

The classification heat maps are used as pseudo-labels for detection, the detection predictions are used as pseudo-labels for segmentation, and the segmentation results are fed back to refine the detection, forming a cycle~
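
As a rough illustration of the first link in that cycle (my own sketch, not the paper's code): a classification heat map can be turned into a pseudo bounding box by thresholding it and taking the extent of the activated region; the 0.5 threshold is an assumption.

```python
import numpy as np

def cam_to_pseudo_box(cam: np.ndarray, thresh: float = 0.5):
    """cam: (H, W) classification activation map; returns (x1, y1, x2, y2) or None."""
    ys, xs = np.where(cam >= thresh * cam.max())   # pixels activated above the threshold
    if len(xs) == 0:
        return None                                # nothing activated, no pseudo-box
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# toy usage: a blob in the middle of the map yields a box around it
cam = np.zeros((64, 64))
cam[20:40, 25:45] = 1.0
print(cam_to_pseudo_box(cam))                      # (25, 20, 44, 39)
```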

 

4. visual attention consistency under image transforms for multi-label image classification

This is one of the most interesting works I saw this morning. Although it is done on a classification task, I feel its applicability is not limited to classification.

The main idea: the features extracted from the original image and from a flipped image should, in theory, just be flips of each other, but in practice they are not. So they feed both the original image and the flipped image into the network to extract features, flip one of the two feature maps, and put a consistency loss between them. That's it.

The idea is simple, clear at a glance, and very reasonable~ curious why this work isn't an oral..
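
A minimal PyTorch sketch of that consistency loss, assuming a `backbone` that returns a spatial feature (or attention) map; this is my own illustration, not the authors' code.

```python
import torch
import torch.nn.functional as F

def flip_consistency_loss(backbone, images):
    feat      = backbone(images)                        # (N, C, H, W) features of the originals
    feat_flip = backbone(torch.flip(images, dims=[3]))  # features of the horizontally flipped images
    # flip one of the two feature maps back, then penalize any disagreement
    return F.mse_loss(torch.flip(feat_flip, dims=[3]), feat)
```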

 

5. bag of tricks for image classification with convolutional neural networks

a. learning rate warm up

b. mix up training

c. label smoothing

d. cosine learning rate decay

e. knowledge distillation

It looks like a combination of previously known tricks.. too many people at the poster.. didn't manage to ask.. (a rough sketch of a couple of these tricks is below)
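
For reference, a rough PyTorch sketch of two of the listed tricks (label smoothing and cosine learning-rate decay); the model and hyper-parameters are placeholders, not the paper's settings.

```python
import torch
import torch.nn as nn

model     = nn.Linear(128, 10)                        # stand-in for a real CNN
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=90)  # d. cosine decay over 90 epochs
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)                         # c. label smoothing

x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
scheduler.step()                                      # step the schedule once per epoch
```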

 

6. libra r-cnn: towards balanced learning for object detection

Three innovation points. 1. I didn't understand it. 2. I didn't understand it either; it's probably about combining information from different levels. 3. I find the third innovation point the most interesting. Detection produces both a classification result and a box. For some boxes the classification is correct but the box is not accurate; for others the box is accurate but the classification is wrong. Classification and box regression are constrained by two separate losses, and when they are trained together this mismatch hurts both: for example, if a sample is already classified very well but its box is still not precise, that box becomes hard to keep training. They reshape the loss curve so that samples that are already well classified can still keep improving their boxes~ and that brings an improvement.
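
Below is my reconstruction, from memory, of a balanced-L1-style regression loss that captures this "reshape the loss curve" idea by boosting the gradient of small (already well-fit) errors; the exact form and the constants (alpha=0.5, gamma=1.5) are illustrative, so check the paper rather than trusting this sketch.

```python
import math
import torch

def balanced_l1_loss(diff, alpha=0.5, gamma=1.5):
    x = diff.abs()
    b = math.exp(gamma / alpha) - 1                   # chosen so the gradient is continuous at |x| = 1
    inlier  = alpha / b * (b * x + 1) * torch.log(b * x + 1) - alpha * x
    c       = alpha / b * (b + 1) * math.log(b + 1) - alpha - gamma  # keeps the loss itself continuous
    outlier = gamma * x + c
    return torch.where(x < 1, inlier, outlier).mean()
```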

 

7. adaptively connected neural networks

Nobody was left at the poster when I got to this one.. I just took a quick look: it balances three connection types, self, CNN, and MLP~ the idea is quite simple~ but it feels a bit like pruning...
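
My rough sketch of the idea as I understood it (not the authors' code): each block mixes a "self" 1x1 branch, a local CNN branch, and a global MLP-style branch with learned, softmax-normalized weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.self_branch = nn.Conv2d(channels, channels, kernel_size=1)              # the pixel itself
        self.cnn_branch  = nn.Conv2d(channels, channels, kernel_size=3, padding=1)   # local neighbourhood
        self.mlp_branch  = nn.Linear(channels, channels)                              # global, on pooled features
        self.mix         = nn.Parameter(torch.zeros(3))                               # learned mixing weights

    def forward(self, x):
        n, c, _, _ = x.shape
        g = self.mlp_branch(x.mean(dim=(2, 3))).view(n, c, 1, 1)   # global branch, broadcast back over H, W
        w = F.softmax(self.mix, dim=0)
        return w[0] * self.self_branch(x) + w[1] * self.cnn_branch(x) + w[2] * g
```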

 

 

——————————————————————————————————————————————————————————————

 

 

1. Dual attention network for scene segmentation

Two branches compute attention over the HW (spatial) and C (channel) dimensions respectively, then the results of the two branches are merged; this module is added at the end of the network.
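
A minimal sketch of the channel (C-dimension) branch as I understood it; the spatial (HW) branch is analogous with the roles of C and HW swapped. My own illustration, not the paper's code.

```python
import torch
import torch.nn.functional as F

def channel_attention(x):
    n, c, h, w = x.shape
    flat = x.view(n, c, h * w)                               # (N, C, HW)
    attn = F.softmax(flat @ flat.transpose(1, 2), dim=-1)    # (N, C, C) channel-to-channel affinity
    out  = (attn @ flat).view(n, c, h, w)                    # re-weight channels by the affinity
    return out + x                                           # residual connection back onto the input
```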

 

2. decoders matter for semantic segmentation: data-dependent decoding enables flexible feature aggregation

I like this one~

The number of channels is expanded, then the channels are split into four equal parts and rearranged spatially, which is equivalent to a 2x upsample. This upsample has parameters; they are stored in a weight matrix, and a constraint is imposed on that weight during training so that the upsampled result, once downsampled again, recovers the original as closely as possible, minimizing the reconstruction loss. This part is added at the end of the network to improve the results.
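
The "split the channels into four parts and place them spatially" step is the same rearrangement that pixel_shuffle performs; a minimal sketch follows (my own, with a 1x1 conv standing in for the learned projection; the reconstruction constraint on the weight is not shown).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

c    = 64
proj = nn.Conv2d(c, 4 * c, kernel_size=1)             # expand the channels 4x with a learned weight
x    = torch.randn(1, c, 32, 32)
up   = F.pixel_shuffle(proj(x), upscale_factor=2)      # rearrange the 4 channel groups into a 2x spatial upsample
print(up.shape)                                        # torch.Size([1, 64, 64, 64])
```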

 

3. compressing convolutional neural networks via factorized convolutional filters

I like this one~

A method for deciding the pruning strategy. The difference from dropout is that dropout is random, while here the filters to prune are selected through explicit constraints.

The constraint: each filter weight has a corresponding v (taking value 0 or 1) that decides whether the weight is kept or removed, but the discrete values cannot be optimized directly. Their trick: if v is 2-dimensional it has four possible values in total, which can be seen as the intersection of the square with those four points as vertices and its circumscribed circle. By optimizing over these two regions and then taking the intersection, the solution can be recovered, telling you whether each corresponding filter should be kept or removed.
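
A toy sketch of the "one v per filter" part only: each output filter is multiplied by its 0/1 gate, and the zeroed filters can then be removed. The square-vs-circumscribed-circle relaxation described above is not reproduced here; the gate values are made up for illustration.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
v    = torch.tensor([1.0] * 24 + [0.0] * 8)            # 0/1 gate per output filter (toy values)

with torch.no_grad():
    conv.weight.mul_(v.view(-1, 1, 1, 1))              # zero out the pruned filters' weights
    conv.bias.mul_(v)

print(int(v.sum().item()), "of", len(v), "filters kept")
```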

 

4. context-reinforced semantic segmentation

Reinforcement learning is used to find the reliable parts of the segmentation result, which are then used as pseudo-labels and fed back into training to improve the network. The reason confidence maps didn't help before is that the training outputs were selected directly from the start, and high confidence does not necessarily mean the classification is accurate; so they add reinforcement learning at that point to help the network tell which pseudo-labels are good and which are bad.

 

5. A simple pooling-based design for real-time salient object detection

For the salient object detection task, modules for extracting multi-scale information are added at different layers of the network, and the saliency task is combined with an edge detection task to improve saliency detection.
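
A minimal, generic sketch of a multi-scale pooling module of the kind described (my own illustration, not the paper's exact design): pool the feature map at a few scales, upsample back, and fuse.

```python
import torch
import torch.nn.functional as F

def multi_scale_pool(x, sizes=(1, 2, 4)):
    h, w = x.shape[2:]
    pooled = [F.interpolate(F.adaptive_avg_pool2d(x, s), size=(h, w),
                            mode='bilinear', align_corners=False) for s in sizes]
    return x + sum(pooled)                             # fuse the multi-scale context back in

print(multi_scale_pool(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```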

 

6. Progressive image deraining networks: a better and simpler baseline

It's very novel~

The task is removing rain from images. The network structures for this task keep getting more and more complex, but the authors found a relatively simple structure that works very well. No new structure is proposed; instead, many redundant components are removed. During the rebuttal the reviewers did complain that there is no new method~ but it still got accepted.
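
A minimal sketch of the progressive idea as I understood it: one small network applied repeatedly, each stage refining the previous estimate given the rainy input. My own illustration; the layer sizes and the number of stages are made up.

```python
import torch
import torch.nn as nn

class ProgressiveDerain(nn.Module):
    def __init__(self, stages=6):
        super().__init__()
        self.stages = stages
        self.net = nn.Sequential(                       # the same small net reused at every stage
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, rainy):
        estimate = rainy
        for _ in range(self.stages):                    # progressively refine the derained estimate
            estimate = self.net(torch.cat([rainy, estimate], dim=1))
        return estimate

print(ProgressiveDerain()(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 3, 64, 64])
```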

 

7. Deep flow-guided video inpainting

The task is filling in a missing region in a video. Optical flow is used to select pixels from other frames to complete the missing part of the current frame.
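
A minimal sketch of the flow-warping step (my own illustration, not the paper's code): given an optical-flow field, pull pixels from another frame into the hole of the current frame. The flow and the hole mask are assumed to be given.

```python
import torch
import torch.nn.functional as F

def warp_from_other_frame(other, flow):
    """other: (N, 3, H, W) source frame, flow: (N, 2, H, W) flow in pixels (x, y)."""
    n, _, h, w = other.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    grid = torch.stack((xs, ys), dim=-1).float().expand(n, h, w, 2) + flow.permute(0, 2, 3, 1)
    grid = grid / torch.tensor([w - 1, h - 1]) * 2 - 1          # normalize to [-1, 1] for grid_sample
    return F.grid_sample(other, grid, align_corners=True)

frame = torch.randn(1, 3, 64, 64)
flow  = torch.zeros(1, 2, 64, 64)                               # placeholder flow
hole  = torch.zeros(1, 1, 64, 64); hole[:, :, 20:40, 20:40] = 1 # the missing region
filled = frame * (1 - hole) + warp_from_other_frame(frame, flow) * hole
```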

 

8. Towards instance-level image-to-image translation

The task is translating an image from day to night. Different translation networks are used for different objects during training (mainly to adapt the early parameters of the network used at test time), which is more targeted. At test time a single network structure is used, so there are no seams in the middle of the image.


Origin blog.csdn.net/Sun7_She/article/details/93474139