COCO 2014/2017 dataset

MS COCO stands for Microsoft Common Objects in Context. It is a dataset released by a Microsoft team for image recognition tasks.

The releases contain both labeled and unlabeled data:

2014: training set + validation set + test set
2015: test set
2017: training set + validation set + test set + unlabeled

Introduction

The COCO dataset is a large, richly annotated dataset for object detection, segmentation, and captioning. It targets scene understanding: the images are taken from complex everyday scenes, and the objects in them are located with precise segmentation masks. The original paper describes 91 object categories, 328,000 images, and 2,500,000 labeled instances, which made it the largest semantic segmentation dataset at the time. The released dataset uses 80 object categories and contains more than 330,000 images, over 200,000 of which are labeled, with more than 1.5 million object instances in total.


80 categories

person

bicycle, car, motorbike, aeroplane, bus, train, truck, boat

traffic light, fire hydrant, stop sign, parking meter, bench

bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe

backpack, umbrella, handbag, tie, suitcase

frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket

bottle, wine glass, cup, fork, knife, spoon, bowl

banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake

chair, sofa, pottedplant, bed, diningtable, toilet, tvmonitor

laptop, mouse, remote, keyboard, cell phone

microwave, oven, toaster, sink, refrigerator

book, clock, vase, scissors, teddy bear, hair drier, toothbrush

Note that a few names above (motorbike, aeroplane, sofa, pottedplant, diningtable, tvmonitor) use Pascal VOC-style spellings found in some label files. The official COCO annotation files call them motorcycle, airplane, couch, potted plant, dining table, and tv.
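
In the annotation files, the categories are stored as records under the "categories" key. The following is a minimal sketch, assuming each record has `id` and `name` fields and that the IDs are not contiguous (in the official files the 80 IDs fall in the range 1 to 90). It builds a lookup from category ID to name and to a contiguous index, which is what most training code needs. The helper name `build_category_maps` and the three hand-written sample records are illustrative only.

```python
def build_category_maps(categories):
    """Return (id_to_name, id_to_index) for a COCO-style category list."""
    # Sort by id so the contiguous index is stable across runs.
    ordered = sorted(categories, key=lambda c: c["id"])
    id_to_name = {c["id"]: c["name"] for c in ordered}
    id_to_index = {c["id"]: i for i, c in enumerate(ordered)}
    return id_to_name, id_to_index


# Hand-written sample, not the full list of 80 categories.
sample_categories = [
    {"id": 13, "name": "stop sign", "supercategory": "outdoor"},
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 2, "name": "bicycle", "supercategory": "vehicle"},
]

id_to_name, id_to_index = build_category_maps(sample_categories)
print(id_to_name[13])   # name stored for category id 13
print(id_to_index[13])  # position of id 13 after sorting by id
```

With a real file, the same function can be called on `annos["categories"]` once the JSON has been loaded.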

COCO dataset annotation files (.json)

COCO provides 5 types of annotations: object detection, keypoint detection, instance segmentation, panoptic segmentation, and image captioning. Each type is stored in its own JSON file, and each JSON file is essentially a dictionary.

Reading the file

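import json

filedir = "instances_val2014.json"
# Load the whole annotation file into a Python dict;
# the with-block closes the file handle afterwards.
with open(filedir, encoding="utf-8") as f:
    annos = json.load(f)
print(type(annos))  # <class 'dict'>
print(len(annos))  # 5
print(annos.keys())   # the top-level keys
print(annos["info"])   # the value stored under each key (these can be very long)
print(annos["images"])
print(annos["licenses"])
print(annos["annotations"])
print(annos["categories"])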
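
Printing "images" or "annotations" in full floods the terminal, because those lists hold many thousands of entries. A minimal sketch of a more compact overview, reporting only the type and size of each top-level value, could look like the following. The helper name `summarize` and the tiny in-memory `sample` dictionary are assumptions made for illustration, and the real file would be loaded with `json` as above.

```python
def summarize(annos):
    """Map each top-level key to a short description of its value."""
    summary = {}
    for key, value in annos.items():
        if isinstance(value, (list, dict)):
            summary[key] = f"{type(value).__name__} with {len(value)} items"
        else:
            summary[key] = repr(value)
    return summary


# Tiny stand-in for a loaded annotation file.
sample = {
    "info": {"year": 2014, "version": "1.0"},
    "images": [{"id": 16744}],
    "licenses": [],
    "annotations": [],
    "categories": [],
}

for key, text in summarize(sample).items():
    print(key, "->", text)
```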

Every annotation file contains the following top-level keys:

{
    "info" : info,
    "images" : [image],
    "annotations" : [annotation],
    "licenses" : [license],
}
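
Since every annotation file is expected to carry these four keys, a quick sanity check after loading can catch a truncated or wrong file early. The sketch below assumes the file has already been parsed into a dictionary. `REQUIRED_KEYS` and `missing_keys` are hypothetical names chosen for this example.

```python
REQUIRED_KEYS = ("info", "images", "annotations", "licenses")


def missing_keys(annos, required=REQUIRED_KEYS):
    """Return the required top-level keys that are absent from annos."""
    return [key for key in required if key not in annos]


# A deliberately incomplete dictionary to show the check in action.
partial = {"info": {}, "images": []}
print(missing_keys(partial))
```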

The value stored under "info" is:

{
    'description': 'This is stable 1.0 version of the 2014 MS COCO dataset.',
    'url': 'http://mscoco.org',
    'version': '1.0',
    'year': 2014,
    'contributor': 'Microsoft COCO group',
    'date_created': '2015-01-27 09:11:52.357475'
}

Each entry in the "images" list looks like this:

{
    'license': 3,
    'file_name': 'COCO_val2014_000000016744.jpg',
    'coco_url': 'http://mscoco.org/images/16744',
    'height': 335,
    'width': 500,
    'date_captured': '2013-11-20 14:29:03',
    'flickr_url': 'http://farm3.staticflickr.com/2393/2228750191_11de3ec047_z.jpg',
    'id': 16744
},
..... (further entries in the same format follow)
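
Because "images" is a flat list, looking up a record by its `id` is easier with an index. The sketch below builds one and also reconstructs the file name from the `id`. The naming pattern (prefix, split name, 12-digit zero-padded id) is inferred only from the single val2014 sample above and may not hold for other releases. The helper names `index_images` and `expected_file_name` are illustrative.

```python
def index_images(images):
    """Build a lookup from image id to the full image record."""
    return {img["id"]: img for img in images}


def expected_file_name(image_id, split="val2014"):
    """File name pattern inferred from the val2014 sample record."""
    return f"COCO_{split}_{image_id:012d}.jpg"


# The sample record shown above, trimmed to the fields used here.
sample_images = [
    {
        "license": 3,
        "file_name": "COCO_val2014_000000016744.jpg",
        "height": 335,
        "width": 500,
        "id": 16744,
    }
]

by_id = index_images(sample_images)
record = by_id[16744]
print(record["width"], record["height"])
print(expected_file_name(16744) == record["file_name"])
```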

The value stored under "licenses" is:

[
    {'url': 'http://creativecommons.org/licenses/by-nc-sa/2.0/', 'id': 1, 'name': 'Attribution-NonCommercial-ShareAlike License'},
    {'url': 'http://creativecommons.org/licenses/by-nc/2.0/', 'id': 2, 'name': 'Attribution-NonCommercial License'},
    {'url': 'http://creativecommons.org/licenses/by-nc-nd/2.0/', 'id': 3, 'name': 'Attribution-NonCommercial-NoDerivs License'},
    {'url': 'http://creativecommons.org/licenses/by/2.0/', 'id': 4, 'name': 'Attribution License'},
    {'url': 'http://creativecommons.org/licenses/by-sa/2.0/', 'id': 5, 'name': 'Attribution-ShareAlike License'},
    {'url': 'http://creativecommons.org/licenses/by-nd/2.0/', 'id': 6, 'name': 'Attribution-NoDerivs License'},
    {'url': 'http://flickr.com/commons/usage/', 'id': 7, 'name': 'No known copyright restrictions'},
    {'url': 'http://www.usa.gov/copyright.shtml', 'id': 8, 'name': 'United States Government Work'}
]
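
The `license` field of an image record refers to the `id` of one of these license entries. A minimal sketch of resolving that reference follows, using two of the entries listed above. The helper name `license_name_for` is hypothetical, and returning `None` for an unknown id is a design choice of this example rather than part of the dataset format.

```python
def license_name_for(image, licenses):
    """Return the license name for an image record, or None if unknown."""
    by_id = {lic["id"]: lic for lic in licenses}
    lic = by_id.get(image["license"])
    return lic["name"] if lic is not None else None


# Two of the license entries from the listing above.
sample_licenses = [
    {"url": "http://creativecommons.org/licenses/by-nc-nd/2.0/", "id": 3,
     "name": "Attribution-NonCommercial-NoDerivs License"},
    {"url": "http://creativecommons.org/licenses/by/2.0/", "id": 4,
     "name": "Attribution License"},
]

image = {"id": 16744, "license": 3}
print(license_name_for(image, sample_licenses))
```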

Although every JSON file has the "info", "images", "annotations", and "licenses" keys, the structure of the entries under "annotations" differs from task to task.
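
To make the difference concrete, the sketch below guesses the task from the fields of a single annotation entry. It assumes the field names of the official COCO format: detection entries carry a `bbox` given as [x, y, width, height], keypoint entries add `keypoints`, and caption entries carry a `caption` string. The two sample entries use made-up values, and the helper names `guess_annotation_kind` and `xywh_to_xyxy` are illustrative.

```python
def guess_annotation_kind(annotation):
    """Guess which task an annotation entry belongs to from its fields."""
    if "caption" in annotation:
        return "caption"
    if "keypoints" in annotation:
        return "keypoints"
    if "bbox" in annotation:
        return "instances"
    return "unknown"


def xywh_to_xyxy(bbox):
    """Convert [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


# Made-up entries that only mimic the shape of real annotations.
detection = {"id": 1, "image_id": 16744, "category_id": 1,
             "bbox": [10.0, 20.0, 30.0, 40.0], "area": 1200.0, "iscrowd": 0}
caption = {"id": 2, "image_id": 16744, "caption": "a placeholder caption"}

print(guess_annotation_kind(detection), xywh_to_xyxy(detection["bbox"]))
print(guess_annotation_kind(caption))
```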

Download links

Images:
2014 Train images: http://images.cocodataset.org/zips/train2014.zip
2014 Val images: http://images.cocodataset.org/zips/val2014.zip
2014 Test images: http://images.cocodataset.org/zips/test2014.zip
2015 Test images: http://images.cocodataset.org/zips/test2015.zip
2017 Train images: http://images.cocodataset.org/zips/train2017.zip
2017 Val images: http://images.cocodataset.org/zips/val2017.zip
2017 Test images: http://images.cocodataset.org/zips/test2017.zip
2017 Unlabeled images: http://images.cocodataset.org/zips/unlabeled2017.zip

Annotations:
2014 Train/Val annotations: http://images.cocodataset.org/annotations/annotations_trainval2014.zip
2014 Testing Image info: http://images.cocodataset.org/annotations/image_info_test2014.zip
2015 Testing Image info: http://images.cocodataset.org/annotations/image_info_test2015.zip
2017 Train/Val annotations: http://images.cocodataset.org/annotations/annotations_trainval2017.zip
2017 Stuff Train/Val annotations: http://images.cocodataset.org/annotations/stuff_annotations_trainval2017.zip
2017 Testing Image info: http://images.cocodataset.org/annotations/image_info_test2017.zip
2017 Unlabeled Image info: http://images.cocodataset.org/annotations/image_info_unlabeled2017.zip
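
The image archives listed above all follow the same URL pattern, so the links can be generated rather than copied. The sketch below builds an image archive URL from a split name and includes a small download helper based on the standard library. The function names `image_zip_url` and `download` are hypothetical. The archives are very large, so the download call is left commented out.

```python
import urllib.request
from pathlib import Path

BASE_URL = "http://images.cocodataset.org"


def image_zip_url(split):
    """Build the image archive URL for a split such as 'val2017'."""
    return f"{BASE_URL}/zips/{split}.zip"


def download(url, dest_dir="."):
    """Download url into dest_dir unless the file is already there."""
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


print(image_zip_url("val2017"))
# download(image_zip_url("val2017"))  # uncomment to fetch the archive
```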

References (with thanks)
https://blog.csdn.net/weixin_42419002/article/details/100156688
https://blog.csdn.net/qq_43211132/article/details/106850843
https://blog.csdn.net/u013832707/article/details/93710810