Basic Concepts in Object Detection: IOU, NMS, and SoftNMS

When reposting, please credit the original source: https://blog.csdn.net/ouyangfushu/article/details/87438585

Author: SyGoing

QQ: 2446799425

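Any discussion of object detection inevitably involves IOU and NMS. Whatever detection algorithm you use, however sophisticated, it ends up relying on these concepts. They are simple, but it is still worth getting the fundamentals right.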

I. IOU

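IOU (intersection over union) is the ratio of the intersection area of two candidate detection boxes to the area of their union.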
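As a concrete illustration of the definition, here is a minimal Python sketch. The function name `iou`, the tuple layout `(x1, y1, x2, y2)`, and the example boxes are assumptions made for this illustration; the `+ 1` follows the inclusive-pixel convention used by the code in this article.

```python
def iou(a, b):
    """IOU of two boxes given as (x1, y1, x2, y2) with inclusive pixel corners."""
    # Width and height of the intersection rectangle, clamped at zero
    # so that non-overlapping boxes give an empty intersection.
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    # Union = sum of both areas minus the part counted twice.
    return inter / (area_a + area_b - inter)


# Two 10x10 boxes sharing a 5x5 corner: 25 / (100 + 100 - 25), about 0.143
print(iou((0, 0, 9, 9), (5, 5, 14, 14)))
```

Identical boxes give 1.0 and disjoint boxes give 0.0, which are the two extremes of the measure.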

The idea is straightforward, and so is the implementation. Here is a C++ version:

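#include <algorithm>

struct Bbox
{
    int x1;
    int y1;
    int x2;
    int y2;
    float score;  // detection confidence, used by nms() below
    float area;   // (x2 - x1 + 1) * (y2 - y1 + 1), used by nms() below
};

float IOU(const Bbox &a, const Bbox &b)
{
    // Corners of the intersection rectangle
    float xx1 = std::max(a.x1, b.x1);
    float yy1 = std::max(a.y1, b.y1);
    float xx2 = std::min(a.x2, b.x2);
    float yy2 = std::min(a.y2, b.y2);

    // Width and height of the intersection, clamped at zero when the boxes do not overlap
    float iw = std::max(0.f, xx2 - xx1 + 1);
    float ih = std::max(0.f, yy2 - yy1 + 1);
    float inter = iw * ih;

    // Union = area(a) + area(b) - intersection
    float areaA = (a.x2 - a.x1 + 1) * (a.y2 - a.y1 + 1);
    float areaB = (b.x2 - b.x1 + 1) * (b.y2 - b.y1 + 1);
    return inter / (areaA + areaB - inter);
}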

II. NMS

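Non-maximum suppression is applied to the candidate boxes predicted by the detection model whose confidence score exceeds a set threshold, yielding the final detections. The algorithm is an iterate-traverse-eliminate procedure. The traditional NMS algorithm proceeds as follows: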

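(1) Sort all boxes by score and select the highest-scoring box.

(2) Traverse the remaining boxes; if a box's overlap (IOU) with the current highest-scoring box is greater than a given threshold, delete that box.

(3) Pick the highest-scoring box among those not yet processed and repeat the above until no boxes remain to process.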
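The three steps above can be traced on a tiny example. This is a minimal pure-Python sketch, not a reference implementation; the names `iou` and `nms`, the box layout `(x1, y1, x2, y2)`, and the sample boxes and scores are assumptions made for illustration.

```python
def iou(a, b):
    """IOU of two (x1, y1, x2, y2) boxes with inclusive pixel corners."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, thresh):
    """Return the indices of the kept boxes, highest score first."""
    # Step 1: order the box indices by score, highest first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Step 2: delete every remaining box that overlaps `best` too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
        # Step 3: the loop continues with the next-highest remaining score.
    return keep


boxes = [(0, 0, 9, 9), (1, 1, 10, 10), (20, 20, 29, 29)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, 0.5))  # [0, 2]
```

Box 1 overlaps box 0 with an IOU of 81/119 (about 0.68), so it is deleted at a threshold of 0.5, while the distant box 2 survives.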

1. C++ implementation:

// The easiest implementation to follow, and the closest to the textbook algorithm.
// Assumes Bbox carries float members `score` and `area` in addition to x1, y1, x2, y2.
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Orders boxes by ascending score
bool cmpScore(const Bbox &lsh, const Bbox &rsh) {
    return lsh.score < rsh.score;
}

void nms(std::vector<Bbox> &boundingBox_, const float overlap_threshold, const std::string &modelname) {
    if (boundingBox_.empty()) {
        return;
    }
    std::sort(boundingBox_.begin(), boundingBox_.end(), cmpScore);
    float IOU = 0;
    float maxX = 0;
    float maxY = 0;
    float minX = 0;
    float minY = 0;
    std::vector<int> vPick;
    int nPick = 0;
    std::multimap<float, int> vScores;  // score -> box index, ordered by score
    const int num_boxes = boundingBox_.size();
    vPick.resize(num_boxes);
    for (int i = 0; i < num_boxes; ++i) {
        vScores.insert(std::pair<float, int>(boundingBox_[i].score, i));
    }
    while (vScores.size() > 0) {
        // Pick the highest-scoring box that is still unprocessed
        int last = vScores.rbegin()->second;
        vPick[nPick] = last;
        nPick += 1;
        // Compare every remaining box (including `last` itself) against `last`
        for (std::multimap<float, int>::iterator it = vScores.begin(); it != vScores.end();) {
            int it_idx = it->second;
            maxX = std::max(boundingBox_.at(it_idx).x1, boundingBox_.at(last).x1);
            maxY = std::max(boundingBox_.at(it_idx).y1, boundingBox_.at(last).y1);
            minX = std::min(boundingBox_.at(it_idx).x2, boundingBox_.at(last).x2);
            minY = std::min(boundingBox_.at(it_idx).y2, boundingBox_.at(last).y2);
            // maxX and maxY are reused as the intersection width and height
            maxX = ((minX - maxX + 1) > 0) ? (minX - maxX + 1) : 0;
            maxY = ((minY - maxY + 1) > 0) ? (minY - maxY + 1) : 0;
            // IOU first holds the intersection area, then the overlap ratio
            IOU = maxX * maxY;
            if (!modelname.compare("Union")) {
                IOU = IOU / (boundingBox_.at(it_idx).area + boundingBox_.at(last).area - IOU);
            }
            else if (!modelname.compare("Min")) {
                IOU = IOU / ((boundingBox_.at(it_idx).area < boundingBox_.at(last).area) ? boundingBox_.at(it_idx).area : boundingBox_.at(last).area);
            }
            // `last` has ratio 1 with itself, so it is erased here as well
            if (IOU > overlap_threshold) {
                it = vScores.erase(it);
            }
            else {
                it++;
            }
        }
    }
    // Keep only the picked boxes, in descending order of score
    vPick.resize(nPick);
    std::vector<Bbox> tmp_;
    tmp_.resize(nPick);
    for (int i = 0; i < nPick; i++) {
        tmp_[i] = boundingBox_[vPick[i]];
    }
    boundingBox_ = tmp_;
}

2. Python implementation of NMS:

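# Traditional NMS, Python version
import numpy as np

def nms_cpu(dets, thresh):
    # dets: array of shape (N, 5) with columns x1, y1, x2, y2, score
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    scores = dets[:, 4]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    # Sort by score in descending order and keep the indices
    order = scores.argsort()[::-1]
    # keep holds the indices of the boxes retained in the end
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        # Intersection over union gives the IOU
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # inds holds the positions of all windows whose IOU with window i is at most thresh;
        # every other window is absorbed by window i in this round
        inds = np.where(iou <= thresh)[0]
        # iou was computed over order[1:], so its positions are offset by one
        # relative to order; add 1 to skip window i at position 0
        order = order[inds + 1]
    return keep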

III. SoftNMS

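Most object detection algorithms today still use traditional non-maximum suppression, and it seems to work well enough. The SoftNMS authors, however, question traditional NMS, arguing that it can discard valid detections. The detector has in fact found the object, but NMS simply removes every box whose IOU exceeds the threshold, even when the removed box is correct and its IOU is only slightly above the predefined threshold. Simply raising the threshold to keep such boxes, on the other hand, lets more false positives through and lowers precision.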

As per the design of the algorithm, if an object lies within the predefined overlap threshold, it leads to a miss.

The suppression step of traditional NMS can be expressed as follows:
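In the notation of the SoftNMS paper, with M the current highest-scoring box, b_i another box with score s_i, and N_t the IOU threshold, this hard rescoring rule can be written as below. The symbols b_i, s_i and N_t follow the paper and are assumptions relative to the text above.

```latex
s_i =
\begin{cases}
s_i, & \mathrm{iou}(M, b_i) < N_t \\
0,   & \mathrm{iou}(M, b_i) \ge N_t
\end{cases}
```

A box is either left untouched or has its score set to zero, which is why a correct detection just above N_t is lost entirely.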

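To address this problem, the authors propose SoftNMS. SoftNMS only changes what happens when the IOU exceeds the threshold: instead of removing the box, it decays its score by a coefficient. The paper offers two concrete options, a linear penalty applied to boxes whose IOU is above the threshold, or a Gaussian penalty; in both cases the score is multiplied by a coefficient between 0 and 1.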

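1) A linear function of the IOU. This function is not continuous: the score changes abruptly at the threshold.

2) A Gaussian function of the IOU. This function is continuous, with no abrupt jump. Implementations in practice mostly use the Gaussian form.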

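The smaller the IOU, the lighter the penalty (the larger the coefficient); the larger the IOU, the heavier the penalty (the smaller the coefficient).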
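The two penalty functions, as given in the SoftNMS paper, can be written as follows. Here b_i is a box with score s_i, N_t is the IOU threshold, and σ is the width parameter of the Gaussian; these symbols follow the paper and are assumptions relative to the text above.

```latex
% Linear penalty: discontinuous at iou = N_t
s_i =
\begin{cases}
s_i, & \mathrm{iou}(M, b_i) < N_t \\
s_i \,\bigl(1 - \mathrm{iou}(M, b_i)\bigr), & \mathrm{iou}(M, b_i) \ge N_t
\end{cases}

% Gaussian penalty: continuous in the IOU
s_i = s_i \, e^{-\frac{\mathrm{iou}(M, b_i)^2}{\sigma}}, \quad \forall\, b_i \notin D
```

For example, at an IOU of 0.5 with N_t = 0.3 the linear coefficient is 0.5, while the Gaussian coefficient with σ = 0.5 is e^{-0.5}, about 0.61.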

Algorithm pseudocode:

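Here D is the set of detections to keep in the end, and B is the set of detection boxes (initially the boxes output by the network; during SoftNMS it acts as a working set that is emptied step by step). M is the highest-scoring box in each iteration of the while loop; at each step M is added to D (D becomes D ∪ M) and removed from B.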
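The loop over B, D and M described above can be sketched in pure Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the default values of `nt` and `sigma`, and the `score_thresh` cutoff that discards boxes whose decayed score has become negligible are all choices made for this example.

```python
import math


def iou(a, b):
    """IOU of two (x1, y1, x2, y2) boxes with inclusive pixel corners."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def soft_nms(boxes, scores, method="gaussian", nt=0.3, sigma=0.5,
             score_thresh=0.001):
    """Return (index, decayed score) pairs in the order the boxes were selected."""
    scores = list(scores)                # work on a copy of the scores
    remaining = list(range(len(boxes)))  # B: boxes still to process
    kept = []                            # D: final detections
    while remaining:
        m = max(remaining, key=lambda i: scores[i])  # M: current best box
        remaining.remove(m)
        kept.append((m, scores[m]))
        for i in remaining:
            ov = iou(boxes[m], boxes[i])
            if method == "linear":
                weight = 1.0 - ov if ov >= nt else 1.0
            elif method == "gaussian":
                weight = math.exp(-(ov * ov) / sigma)
            else:  # any other value falls back to traditional hard NMS
                weight = 0.0 if ov >= nt else 1.0
            scores[i] *= weight
        # Assumed cutoff: drop boxes whose score has decayed to almost nothing.
        remaining = [i for i in remaining if scores[i] >= score_thresh]
    return kept


boxes = [(0, 0, 9, 9), (1, 1, 10, 10), (20, 20, 29, 29)]
scores = [0.9, 0.8, 0.7]
for idx, s in soft_nms(boxes, scores):
    print(idx, round(s, 3))
```

With the Gaussian penalty, box 1 is kept but its score is decayed because it overlaps box 0 heavily, so it is selected after the non-overlapping box 2. Passing `method="hard"` with `nt=0.5` removes box 1 altogether, reproducing traditional NMS.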

---- From the paper "Improving Object Detection With One Line of Code"

For details, see the paper and the source code:

https://arxiv.org/pdf/1704.04503.pdf
github:https://github.com/bharatsingh430/soft-nms