Semantic Segmentation with the COCO Dataset

To use the COCO dataset for segmentation, you first need to set up the COCO API; see my previous post.

Some notes on how I understand the COCO dataset:

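The images are the same for every task; only the annotations differ (the JSON files that hold the label information). There are four kinds of annotation files: captions, instances, person_keypoints, and stuff, and each kind comes in a train and a val version. The first three listed below are the ones we use most often:

instances: instance segmentation
person_keypoints: keypoint detection
stuff: semantic segmentation
captions: a one-sentence description of the image (rarely used)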
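To make the file layout concrete, here is a minimal sketch that lists the annotation files one would expect to find. It assumes the 2017 release naming scheme "<kind>_<split>2017.json" inside an "annotations" directory; the helper name annotation_path and its parameters are hypothetical, so adjust them to match your own download.

```python
from pathlib import PurePosixPath

# The four annotation kinds and two splits described above.
KINDS = ("captions", "instances", "person_keypoints", "stuff")
SPLITS = ("train", "val")


def annotation_path(kind, split, root="annotations", year="2017"):
    """Return the expected path of one annotation file.

    Assumption: files are named "<kind>_<split><year>.json" under `root`.
    """
    if kind not in KINDS:
        raise ValueError(f"unknown annotation kind: {kind}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split}")
    return str(PurePosixPath(root) / f"{kind}_{split}{year}.json")


# Every kind shares the same images; only these JSON files differ.
for kind in KINDS:
    for split in SPLITS:
        print(annotation_path(kind, split))
```

For semantic segmentation, the files of interest would be the two whose kind is "stuff".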

Instance segmentation was covered in my previous post, so here I will mainly go over a few points about semantic segmentation.

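For instance segmentation, category_id ranges from 1 to 90, but only 80 of those values are actually used, one per class. The function _cat_id_to_real_id shows how each category_id maps to a contiguous class index from 1 to 80: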

# Code taken from Mask RCNN
def _cat_id_to_real_id(readId):
  """Note coco has 80 classes, but the catId ranges from 1 to 90!"""
  cat_id_to_real_id = \
    {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, 11: 11, 13: 12, 14: 13, 15: 14, 16: 15, 17: 16,
     18: 17, 19: 18, 20: 19, 21: 20, 22: 21, 23: 22, 24: 23, 25: 24, 27: 25, 28: 26, 31: 27, 32: 28, 33: 29, 34: 30,
     35: 31, 36: 32, 37: 33, 38: 34, 39: 35, 40: 36, 41: 37, 42: 38, 43: 39, 44: 40, 46: 41, 47: 42, 48: 43, 49: 44,
     50: 45, 51: 46, 52: 47, 53: 48, 54: 49, 55: 50, 56: 51, 57: 52, 58: 53, 59: 54, 60: 55, 61: 56, 62: 57, 63: 58,
     64: 59, 65: 60, 67: 61, 70: 62, 72: 63, 73: 64, 74: 65, 75: 66, 76: 67, 77: 68, 78: 69, 79: 70, 80: 71, 81: 72,
     82: 73, 84: 74, 85: 75, 86: 76, 87: 77, 88: 78, 89: 79, 90: 80}
  return cat_id_to_real_id[readId]
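
The same table can also be generated instead of typed out by hand. The sketch below is not from the Mask RCNN code. It reads the ten unused ids directly off the table above and numbers the remaining ones in order. The names UNUSED_IDS and build_cat_id_to_real_id are hypothetical.

```python
# Ids in 1..90 that never appear as keys in the hand-written table above.
UNUSED_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}


def build_cat_id_to_real_id():
    """Map each used category_id (1..90) to a contiguous index (1..80)."""
    used_ids = [i for i in range(1, 91) if i not in UNUSED_IDS]
    return {cat_id: real_id for real_id, cat_id in enumerate(used_ids, start=1)}


mapping = build_cat_id_to_real_id()
print(len(mapping))             # 80
print(mapping[13], mapping[90])  # 12 80
```

If the annotation file is already loaded, the sorted list of category ids found in its "categories" section should give the same used_ids, which avoids hard-coding the gaps at all.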

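Since what I need is semantic segmentation, I use the stuff annotations. The first thing to understand is the difference between things and stuff. The CVPR 2018 paper "COCO-Stuff: Thing and Stuff Classes in Context" describes it as follows: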

Defining things and stuff. The literature provides definitions for several aspects of stuff and things, including: (1) Shape: Things have characteristic shapes (car, cat, phone), whereas stuff is amorphous (sky, grass, water) [21, 59, 28, 51, 55, 39, 17, 14]. (2) Size: Things occur at characteristic sizes with little variance, whereas stuff regions are highly variable in size [21, 2, 27]. (3) Parts: Thing classes have identifiable parts [56, 19], whereas stuff classes do not (e.g. a piece of grass is still grass, but a wheel is not a car). (4) Instances: Stuff classes are typically not countable [2] and have no clearly defined instances [14, 25, 53]. (5) Texture: Stuff classes are typically highly textured [21, 27, 51, 14]. Finally, a few classes can be interpreted as both stuff and things, depending on the image conditions (e.g. a large number of people is sometimes considered a crowd).

How the COCO-Stuff labels are made up:

contains 172 classes: 80 thing, 91 stuff, and 1 class unlabeled. The 80 thing classes are the same as in COCO [35]. The 91 stuff classes are curated by an expert annotator. The class unlabeled is used in two situations: if a label does not belong to any of the 171 predefined classes, or if the annotator cannot infer the label of a pixel.

The stuff category_id values run from 92 to 182, which gives 91 classes in total.
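
For training a semantic segmentation model, these ids usually need to be shifted to a contiguous range. The following is a minimal sketch of that step. Starting the labels at 0 and using 255 as the ignore value are assumptions of this example and not something the dataset prescribes. The name stuff_id_to_label is hypothetical.

```python
STUFF_MIN_ID = 92   # first stuff category_id
STUFF_MAX_ID = 182  # last stuff category_id
IGNORE_LABEL = 255  # assumption: value used for anything that is not stuff

NUM_STUFF_CLASSES = STUFF_MAX_ID - STUFF_MIN_ID + 1


def stuff_id_to_label(category_id):
    """Map a stuff category_id (92..182) to a training label (0..90).

    Any id outside the stuff range is mapped to IGNORE_LABEL.
    """
    if STUFF_MIN_ID <= category_id <= STUFF_MAX_ID:
        return category_id - STUFF_MIN_ID
    return IGNORE_LABEL


print(NUM_STUFF_CLASSES)  # 91
print(stuff_id_to_label(92), stuff_id_to_label(182), stuff_id_to_label(1))  # 0 90 255
```

The ignore value should match whatever the loss function in your training code is configured to skip.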
