1. Introduction
The MS COCO dataset is a very large, widely used dataset that supports tasks such as object detection, image segmentation, and image captioning.
Dataset address: link
Address of the paper describing the dataset: link
One thing to note: the dataset's object categories come in two counts, 80 and 91, where object80 is a subset of stuff91. The stuff categories that are not objects refer to materials and regions without clear boundaries, such as sky and grass.
For object detection, we can use the 80 object categories (object80) as the classification labels.
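As a quick sanity check, here is a minimal sketch (assuming pycocotools is installed and the validation annotation file from section 2 is in place) showing that the 80 object category ids are not contiguous: they range from 1 to 90, with gaps left by the unused ids of the original 91 categories.
from pycocotools.coco import COCO

coco = COCO("./annotations/instances_val2017.json")
cat_ids = coco.getCatIds()            # 80 ids in the range 1..90, with gaps for unused ids
cats = coco.loadCats(cat_ids)
print(len(cat_ids))                   # 80
print([c["name"] for c in cats[:5]]) # ['person', 'bicycle', 'car', 'motorcycle', 'airplane']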
Comparing MS COCO with PASCAL VOC, COCO is significantly larger and has far more ground-truth instances per category, so a model pre-trained on COCO is often used to initialize one's own model. (Pre-training on COCO takes a long time.)
2. Dataset download
The test set is not split off here because its file structure is actually the same as the validation set's, so you can skip downloading the test set and use the validation set for testing instead.
3. Inspecting the validation set and annotation files
3.1 Viewing with Python's json library
The training set is fairly large, so it was not downloaded; here we mainly look at the validation set and its annotation file.
import json

json_path = "./annotations/instances_val2017.json"

# load the annotation file; json.load returns a dict
with open(json_path, "r") as f:
    json_file = json.load(f)

print(json_file["info"])
Setting a breakpoint, we can see the following:
json_file is a dictionary with five keys: info, licenses, images, annotations, and categories:
| key | value type | value |
| --- | --- | --- |
| info | dict | description, url, version, year, contributor, date_created |
| licenses | list | len = 8 |
| images | list | 5000 dicts; each dict = {license, file_name, coco_url, height, width, date_captured, flickr_url, id} |
| annotations | list | 36871 dicts; each dict = {segmentation, area, iscrowd, image_id, bbox, category_id, id} |
| categories | list | len = 80; each dict = {supercategory, name, id} |
Here:
images is a list (one element per image); each element is a dict holding the information of one image, including its file name, width, height, and so on.
annotations is a list (one element per annotated object); each element is a dict holding the annotation of one object, including the segmentation information and the detection box bbox (the first two numbers are the x and y coordinates of the top-left corner, the last two are the width and height of the box). image_id is the id of the image the object belongs to, category_id is the id of the object's category, and iscrowd indicates whether this is a single object: 0 for a single object, 1 for a collection of objects.
categories is a list (one element per detection category); each element is a dict describing one category, including the category id, the category name, and the supercategory (i.e., the parent class).
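To make the table concrete, a quick sketch (continuing from the json snippet in 3.1) that prints the first element of each list:
print(json_file["images"][0])       # one image info dict: file_name, height, width, id, ...
print(json_file["annotations"][0])  # one object annotation: segmentation, bbox, category_id, ...
print(json_file["categories"][0])   # one category dict: supercategory, id, name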
3.2. Viewing with the official API (pycocotools)
3.2.1 Viewing object detection annotations
import os

from pycocotools.coco import COCO
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt

json_path = "./annotations/instances_val2017.json"
img_path = "./val2017"

# load coco data
coco = COCO(annotation_file=json_path)

# get all image index info
ids = list(sorted(coco.imgs.keys()))
print("number of images: {}".format(len(ids)))

# get all coco class labels
coco_classes = dict([(v["id"], v["name"]) for k, v in coco.cats.items()])

# iterate over the first three images
for img_id in ids[:3]:
    # get the ids of all annotations belonging to this image
    ann_ids = coco.getAnnIds(imgIds=img_id)
    # load the annotation dicts from their ids
    targets = coco.loadAnns(ann_ids)

    # get image file name
    path = coco.loadImgs(img_id)[0]['file_name']

    # read image
    img = Image.open(os.path.join(img_path, path)).convert('RGB')
    draw = ImageDraw.Draw(img)
    # draw boxes on the image; bbox is [x, y, w, h] with (x, y) the top-left corner
    for target in targets:
        x, y, w, h = target["bbox"]
        x1, y1, x2, y2 = x, y, int(x + w), int(y + h)
        draw.rectangle((x1, y1, x2, y2))
        draw.text((x1, y1), coco_classes[target["category_id"]])

    # show image
    plt.imshow(img)
    plt.show()
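As an aside, pycocotools also ships a drawing helper: COCO.showAnns renders the segmentation of the given annotations onto the current matplotlib axes. A minimal sketch, reusing coco, img, and targets from the loop above:
plt.imshow(img)
coco.showAnns(targets)  # draws the segmentation of each annotation on the current axes
plt.show()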
3.2.2 Viewing segmentation annotations
code:
import os
import random

import numpy as np
from pycocotools.coco import COCO
from pycocotools import mask as coco_mask
from PIL import Image
import matplotlib.pyplot as plt

random.seed(0)

json_path = "./annotations/instances_val2017.json"
img_path = "./val2017"

# random palette (index 0 stays black for the background)
pallette = [0, 0, 0] + [random.randint(0, 255) for _ in range(255 * 3)]

# load coco data
coco = COCO(annotation_file=json_path)

# get all image index info
ids = list(sorted(coco.imgs.keys()))
print("number of images: {}".format(len(ids)))

# get all coco class labels
coco_classes = dict([(v["id"], v["name"]) for k, v in coco.cats.items()])

# iterate over the first three images
for img_id in ids[:3]:
    # get the ids of all annotations belonging to this image
    ann_ids = coco.getAnnIds(imgIds=img_id)
    # load the annotation dicts from their ids
    targets = coco.loadAnns(ann_ids)

    # get image file name
    path = coco.loadImgs(img_id)[0]['file_name']

    # read image
    img = Image.open(os.path.join(img_path, path)).convert('RGB')
    img_w, img_h = img.size

    masks = []
    cats = []
    for target in targets:
        cats.append(target["category_id"])  # get object class id
        polygons = target["segmentation"]   # get object polygons
        # convert the polygons to RLE and decode them into a binary mask
        rles = coco_mask.frPyObjects(polygons, img_h, img_w)
        mask = coco_mask.decode(rles)
        if len(mask.shape) < 3:
            mask = mask[..., None]
        mask = mask.any(axis=2)
        masks.append(mask)

    cats = np.array(cats, dtype=np.int32)
    if masks:
        masks = np.stack(masks, axis=0)
    else:
        # no annotations on this image: nothing to draw
        continue

    # merge all instance masks into a single segmentation map
    # with its corresponding categories
    target = (masks * cats[:, None, None]).max(axis=0)
    # discard overlapping instances
    target[masks.sum(0) > 1] = 255

    target = Image.fromarray(target.astype(np.uint8))
    target.putpalette(pallette)
    plt.imshow(target)
    plt.show()
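Note that the manual frPyObjects/decode step above handles the polygon case; pycocotools also provides COCO.annToMask, which converts a single annotation to a binary mask and handles both polygon and RLE segmentations (RLE is used when iscrowd=1). A one-line sketch, reusing coco and targets from above:
mask = coco.annToMask(targets[0])  # numpy array of shape (img_h, img_w) with values 0/1
print(mask.shape, mask.sum())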
3.2.3 Viewing keypoint annotations
code:
import numpy as np
from pycocotools.coco import COCO

json_path = "./annotations/person_keypoints_val2017.json"
coco = COCO(json_path)
img_ids = list(sorted(coco.imgs.keys()))

# iterate over the person keypoint annotations of the first five images
# (note: not every image contains a person)
for img_id in img_ids[:5]:
    idx = 0
    img_info = coco.loadImgs(img_id)[0]
    ann_ids = coco.getAnnIds(imgIds=img_id)
    anns = coco.loadAnns(ann_ids)
    for ann in anns:
        xmin, ymin, w, h = ann['bbox']
        # print the person's bbox info
        print(f"[image id: {img_id}] person {idx} bbox: [{xmin:.2f}, {ymin:.2f}, {xmin + w:.2f}, {ymin + h:.2f}]")
        keypoints_info = np.array(ann["keypoints"]).reshape([-1, 3])
        visible = keypoints_info[:, 2]
        keypoints = keypoints_info[:, :2]
        # print the keypoints and their visibility flags
        print(f"[image id: {img_id}] person {idx} keypoints: {keypoints.tolist()}")
        print(f"[image id: {img_id}] person {idx} keypoints visible: {visible.tolist()}")
        idx += 1
output:
loading annotations into memory...
Done (t=0.34s)
creating index...
index created!
[image id: 139] person 0 bbox: [412.80, 157.61, 465.85, 295.62]
[image id: 139] person 0 keypoints: [[427, 170], [429, 169], [0, 0], [434, 168], [0, 0], [441, 177], [446, 177], [437, 200], [430, 206], [430, 220], [420, 215], [445, 226], [452, 223], [447, 260], [454, 257], [455, 290], [459, 286]]
[image id: 139] person 0 keypoints visible: [1, 2, 0, 2, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
[image id: 139] person 1 bbox: [384.43, 172.21, 399.55, 207.95]
[image id: 139] person 1 keypoints: [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]]
[image id: 139] person 1 keypoints visible: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
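For reference, the third value of each keypoint triple is a visibility flag: v=0 means not labeled (in which case x=y=0), v=1 labeled but not visible, and v=2 labeled and visible. A small sketch that draws only the visible keypoints, reusing img_info, keypoints, and visible from the loop above (the image directory is assumed to be ./val2017 as before):
import os
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt

img = Image.open(os.path.join("./val2017", img_info["file_name"])).convert("RGB")
draw = ImageDraw.Draw(img)
for x, y in keypoints[visible == 2]:  # keep only labeled-and-visible keypoints
    draw.ellipse((x - 2, y - 2, x + 2, y + 2), fill=(255, 0, 0))
plt.imshow(img)
plt.show()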
3.3. Interpreting the official COCO API code
The three code snippets above visualize bbox, segmentation, and keypoint annotations respectively, relying on a handful of official APIs. Let's walk through the official API code, taking the bbox visualization as an example.
coco = COCO(annotation_file=json_path)
This line creates a COCO object. annotation_file is the path to the JSON annotation file; the object is mainly used to load the JSON file and look up annotations.
In the initialization function (__init__):
Four empty dictionaries are created: dataset, anns, cats, and imgs, along with two defaultdicts: imgToAnns and catToImgs.
It then checks whether the path is non-empty; if so, it loads the annotation file with the json library. The loaded dataset is a dict.
Finally the createIndex method is called.
createIndex fills in anns, cats, and imgs after checking whether the annotations, images, and categories keys exist in the dataset dict:
For dataset["annotations"], a list of all objects where each element is a dict: imgToAnns maps each image id to the list of all annotations on that image,
and anns maps each annotation id to the dict of that object's annotation information.
For dataset["images"], a list of dicts for all images: imgs maps each image id to that image's information dict.
For dataset["categories"], a list of category dicts: cats maps each category id to that category's information.
catToImgs maps each category id to the ids of the images that contain that category.
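A short sketch of these index dictionaries in action (assuming coco was created as above):
img_id = list(coco.imgs.keys())[0]
print(coco.imgs[img_id]["file_name"])  # imgs: image id -> image info dict
print(len(coco.imgToAnns[img_id]))     # imgToAnns: image id -> list of that image's annotations
cat_id = coco.getCatIds()[0]
print(coco.cats[cat_id]["name"])       # cats: category id -> category info dict
print(len(coco.catToImgs[cat_id]))     # catToImgs: category id -> ids of images containing it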
ids = list(sorted(coco.imgs.keys()))
ids is the sorted list of all image ids.
coco_classes = dict([(v["id"], v["name"]) for k, v in coco.cats.items()])
coco_classes extracts the category id and category name from each category dict to build an {id: name} mapping, e.g. {1: 'person', 2: 'bicycle', 3: 'car', ...}.
I feel my understanding below this level is still not deep enough; I'll come back to it when I've learned more.
4. Evaluating object detection mAP on the validation set
4.1 Prediction result output format
Object Detection Prediction Format
Suppose we have prediction results (the results used here were obtained from the original blogger's training).
We save them as a predict_results.json file; the expected format is sketched below.
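For reference, the COCO detection results format is a JSON list with one entry per predicted box: image_id, category_id, bbox as [x, y, width, height] in pixels, and a confidence score. A minimal sketch with illustrative values (the real file comes from your model's predictions):
import json

predictions = [
    {"image_id": 139, "category_id": 1, "bbox": [412.8, 157.6, 53.1, 138.0], "score": 0.98},
    {"image_id": 139, "category_id": 62, "bbox": [384.4, 172.2, 15.1, 35.7], "score": 0.55},
]
with open("./predict/predict_results.json", "w") as f:
    json.dump(predictions, f)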
Then execute the following code to compare the predictions with the ground truth:
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# accumulate predictions from all images
# load the coco2017 validation set annotation file
coco_true = COCO(annotation_file="./annotations/instances_val2017.json")
# load the network's predictions on the coco2017 validation set
coco_pre = coco_true.loadRes('./predict/predict_results.json')

coco_evaluator = COCOeval(cocoGt=coco_true, cocoDt=coco_pre, iouType="bbox")
coco_evaluator.evaluate()
coco_evaluator.accumulate()
coco_evaluator.summarize()
Get the output:
loading annotations into memory...
Done (t=0.71s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.79s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=19.72s).
Accumulating evaluation results...
DONE (t=3.82s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.233
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.415
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.233
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.104
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.262
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.323
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.216
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.319
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.327
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.145
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.361
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.463
The next post covers the evaluation metric for object detection: CoCo数据集-目标检测指标MAP_SL1029_的博客-CSDN博客
5. Reference blogs and videos
MS COCO数据集介绍以及pycocotools简单使用_太阳花的小绿豆的博客-CSDN博客
COCO数据集介绍以及pycocotools简单使用_哔哩哔哩_bilibili
CoCo dataset official website: COCO - Common Objects in Context (cocodataset.org)
CoCo official API address: cocodataset/cocoapi: COCO API - Dataset @ http://cocodataset.org/ (github.com)