Paper: Segmentation Is All You Need (reading notes)

I. Paper

Segmentation Is All You Need

https://arxiv.org/abs/1904.13300

II. Notes on the paper

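1. RPN-based methods detect objects poorly (low recall) in certain special situations in an image, such as these three:

There are two main reasons for the poor performance:

(a) RPN relies heavily on bounding boxes, but in extreme cases the hand-annotated ground truth contains a lot of noise

(b) It is hard to find a single NMS threshold that suits all cases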
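Point (b) can be illustrated with a minimal sketch of greedy NMS. The boxes, scores and thresholds below are a made-up toy scene, not data from the paper: boxes 0 and 1 stand for two real objects that occlude each other, and box 2 is a duplicate detection of the first object.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, iou_threshold):
    """Visit boxes by descending score; keep a box only if its IoU with
    every box kept so far does not exceed the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical scene: two occluding objects (0, 1) and one duplicate (2).
boxes = [(0, 0, 10, 10), (3, 0, 13, 10), (1, 0, 11, 10)]
scores = [0.9, 0.8, 0.7]

strict = greedy_nms(boxes, scores, iou_threshold=0.5)   # drops real object 1
loose = greedy_nms(boxes, scores, iou_threshold=0.85)   # keeps duplicate 2
```

With the strict threshold the occluded object is suppressed together with the duplicate; with the loose one the duplicate survives. A value in between happens to work for this toy scene, but the right value shifts with how strongly objects overlap, which is the difficulty the note describes.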


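2. Contributions

(1) Proposes a weakly supervised multimodal annotation segmentation (WSMA-Seg) model that is anchor-free and NMS-free and performs detection through semantic segmentation (advantages: it avoids choosing some hyperparameters, eases complex occlusion problems, and pixel-level semantic annotation is more accurate than bounding boxes)

(2) Replaces bounding box annotations with multimodal annotations (three modalities), which serve as the supervision for training the semantic segmentation model, and designs a boundary tracing algorithm.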

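The three modalities are the object's interior, its boundary, and the boundary where two connected objects overlap. As shown in the figure: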

How the three modality annotations are produced:

Given an image with bounding box annotations, we first obtain an inscribed ellipse for each bounding box, then the interior mask (channel 0) is obtained by setting the values of pixels on the edge of or inside the ellipses to 1, and setting the values of other pixels to 0. Then, the boundary mask (channel 1) is obtained by setting the values of pixels on the edge of or within the inner width w of the ellipses to 1, and setting the rest to 0. Similarly, the boundary on the interior mask (channel 2) is generated by setting the values of pixels on the edge of or within the inner width w of the area of the elliptical overlap to 1.
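The quoted procedure can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the "inner width w" band is approximated by shrinking both semi-axes of the ellipse by w, and the overlap boundary is taken as the part of a pairwise overlap that does not lie in both shrunken ellipses.

```python
def inside_ellipse(x, y, box, shrink=0.0):
    """True if pixel (x, y) lies on or inside the ellipse inscribed in
    box = (x1, y1, x2, y2), with both semi-axes reduced by `shrink`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    a = (x2 - x1) / 2.0 - shrink
    b = (y2 - y1) / 2.0 - shrink
    if a <= 0 or b <= 0:
        return False
    return ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0


def multimodal_masks(height, width, boxes, w=2):
    """Return (interior, boundary, overlap_boundary) masks as 0/1 grids."""
    interior = [[0] * width for _ in range(height)]
    boundary = [[0] * width for _ in range(height)]
    overlap_boundary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            full = [inside_ellipse(x, y, b) for b in boxes]
            core = [inside_ellipse(x, y, b, shrink=w) for b in boxes]
            # Channel 0: on the edge of or inside any ellipse.
            if any(full):
                interior[y][x] = 1
            # Channel 1: inside an ellipse but outside its shrunken copy
            # (assumed approximation of the inner band of width w).
            if any(f and not c for f, c in zip(full, core)):
                boundary[y][x] = 1
            # Channel 2: inner band of the overlap area of two ellipses.
            for i in range(len(boxes)):
                for j in range(i + 1, len(boxes)):
                    if full[i] and full[j] and not (core[i] and core[j]):
                        overlap_boundary[y][x] = 1
    return interior, boundary, overlap_boundary


# Hypothetical example: two overlapping 12x12 boxes in a 16x24 image.
boxes = [(2, 2, 14, 14), (10, 2, 22, 14)]
interior, boundary, overlap_boundary = multimodal_masks(16, 24, boxes, w=2)
```

Note that the corner pixel of a box, such as (2, 2), falls outside the inscribed ellipse and stays 0 in the interior mask, which is how this annotation discards some of the background that a bounding box would include.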

Boundary tracing algorithm:

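(3) Because the method relies on semantic segmentation, its performance depends heavily on the segmentation model, so the paper proposes a multi-scale pooling segmentation (MSP-Seg) model: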

multi-scale pooling utilizes four pooling kernels with sizes 1 × 1, 3 × 3, 5 × 5, and 7 × 7 to simultaneously conduct average pooling operations on the previous feature maps generated by residual blocks on skip connections
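A minimal sketch of the multi-scale pooling idea on a single 2D feature map, in plain Python. The stride (1), the border handling (windows clipped at the edge) and returning the four results as a list are assumptions made for illustration; the notes do not say how the paper pads or merges the pooled maps.

```python
def avg_pool_same(feature, k):
    """Stride-1 average pooling with a k x k window centred on each pixel.
    Windows are clipped at the border, so the output keeps the input size."""
    h, w = len(feature), len(feature[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [feature[yy][xx]
                    for yy in range(max(0, y - r), min(h, y + r + 1))
                    for xx in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def multi_scale_pool(feature, kernel_sizes=(1, 3, 5, 7)):
    """Apply average pooling at every kernel size to the same feature map."""
    return [avg_pool_same(feature, k) for k in kernel_sizes]


# Toy 5x5 feature map with a single activation in the centre.
feature = [[0.0] * 5 for _ in range(5)]
feature[2][2] = 1.0
pooled = multi_scale_pool(feature)
```

The 1 × 1 branch leaves the map unchanged, while larger kernels spread the same activation over wider neighbourhoods, so the branches together describe each location at several spatial scales.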

3. Pipeline

Training

1. Train the segmentation model using the multimodal annotations converted from bounding boxes

Testing

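1. Convert the three heatmaps output by the segmentation model into an instance-aware segmentation map through a pixel-level logic operation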

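2. Run a contour tracing operation on the segmentation map to obtain each object's contour, and create each object's bounding box as the circumscribed quadrilateral of its contour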
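The two test-time steps can be sketched roughly as below. Several parts are assumptions rather than the paper's method: the 0.5 threshold, the combination rule (keep interior pixels that are not on an overlap boundary; the boundary heatmap is left unused by this simplified rule), the use of a 4-connected flood fill in place of the paper's contour tracing algorithm, and an axis-aligned box as the circumscribed quadrilateral. All function names are hypothetical.

```python
def binarize(heatmap, threshold=0.5):
    """Turn a heatmap of scores into a 0/1 mask (threshold is an assumption)."""
    return [[1 if v >= threshold else 0 for v in row] for row in heatmap]


def instance_map(interior, overlap_boundary):
    """Assumed rule: interior AND NOT overlap boundary, so that touching
    objects are split along the seam between them."""
    return [[i & (1 - o) for i, o in zip(ri, ro)]
            for ri, ro in zip(interior, overlap_boundary)]


def boxes_from_mask(mask):
    """Flood-fill each 4-connected region and return its (x1, y1, x2, y2) box.
    This stands in for contour tracing; it is not the paper's algorithm."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            stack = [(x0, y0)]
            seen[y0][x0] = True
            x1 = x2 = x0
            y1 = y2 = y0
            while stack:
                x, y = stack.pop()
                x1, x2 = min(x1, x), max(x2, x)
                y1, y2 = min(y1, y), max(y2, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            boxes.append((x1, y1, x2, y2))
    return boxes


# Toy heatmaps: one merged blob whose seam is marked in the overlap channel.
H, W = 6, 10
interior_hm = [[0.9 if 1 <= y <= 4 and 1 <= x <= 8 else 0.1 for x in range(W)]
               for y in range(H)]
overlap_hm = [[0.8 if 1 <= y <= 4 and x == 4 else 0.0 for x in range(W)]
              for y in range(H)]

mask = instance_map(binarize(interior_hm), binarize(overlap_hm))
detections = boxes_from_mask(mask)
merged = boxes_from_mask(binarize(interior_hm))
```

In this toy case the interior channel alone yields one merged box, while removing the seam pixels yields one box per object, with no anchors and no NMS step involved.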

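4. Thoughts

The idea is very good, but the amount of work done is limited and the results are not very convincing (the baselines used for comparison appear selectively chosen)

1. It is unclear how the boundary mask initially output by the segmentation model differs from the boundary finally recovered by the boundary tracing algorithm

3. The meaning of the three annotation modalities is also not explained very clearly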

5. Summary

A one-stage, anchor-free, NMS-free detector built on semantic segmentation

GitHub paper list: https://github.com/zhiAung/Paper
