Foreword
Open Images V4 is a dataset of about 9 million (9M) images that Google released in 2018. It is divided into a training set, a validation set, and a test set, and it is annotated with image-level labels, object bounding boxes, and visual relationships.
- Image-level labels: nearly 20,000 different categories, some labeled manually and some labeled by machine.
- Object bounding boxes: about 15.44 million (15.44M) boxes for 600 categories on about 1.91 million (1.91M) images, which makes it the largest existing dataset with object location annotations.
The box categories are fine-grained (human head, face, beard, hand, etc.) and close to everyday life (from musical instruments, computers, and books to kitchen utensils). The boxes are also accurate: professional annotators at Google manually drew 90% of the boxes in the training set, and other methods generated the remaining 10% semi-automatically. These boxes were manually verified to have IoU > 0.7 with a perfect box on the object, and in practice they are accurate (average IoU ~ 0.82) [1].
This dataset can save days or even months of work on many computer vision tasks. For example, if we want to build an object detector for one or several object classes, we can download only the images of those classes together with their annotations and start training. (This article focuses on downloading data for object detection.)
Preparation
- Ubuntu
- Disk space
- Python
- awscli
Structure
The entire Open Images V4 dataset is about 560 GB, including the training, validation, and test sets. The repository shown in the figure contains image files (.zip) and annotation files (.csv), listed in the following table.
Let's first look at the annotation files (.csv) and their data format:
Class Names:
- class-descriptions-boxable.csv: maps the class identifiers (MIDs) used in the dataset to human-readable names, for example /m/011k07 for Tortoise, /m/011q46kg for Container, /m/012074 for Magpie, etc.
/m/011k07 | Tortoise |
/m/011q46kg | Container |
/m/012074 | Magpie |
/m/0120dh | Sea turtle |
/m/01226z | Football |
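As a minimal sketch of how this file can be used, the snippet below builds a name-to-MID lookup. It assumes the file is a headerless two-column CSV (MID, name) like the rows above; `load_class_map` is a hypothetical helper name, and the sample is in-memory text rather than the real file.

```python
import csv
import io

def load_class_map(csv_file):
    """Return {human-readable name: MID} from a class-descriptions file object."""
    return {row[1]: row[0] for row in csv.reader(csv_file) if len(row) >= 2}

# In-memory stand-in for class-descriptions-boxable.csv, using rows shown above.
sample = io.StringIO(
    "/m/011k07,Tortoise\n"
    "/m/011q46kg,Container\n"
    "/m/012074,Magpie\n"
)
name_to_mid = load_class_map(sample)
print(name_to_mid["Magpie"])
```

With the real file, `open('class-descriptions-boxable.csv', newline='')` would take the place of the `io.StringIO` object.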
Boxes:
- train-annotations-bbox.csv: bounding box annotations of the object instances in the training images.
- validation-annotations-bbox.csv: bounding box annotations of the object instances in the validation images.
- test-annotations-bbox.csv: bounding box annotations of the object instances in the test images.
Each of these files contains the columns shown in the following table.
ImageID | Source | LabelName | Confidence | XMin | XMax | YMin | YMax | IsOccluded | IsTruncated | IsGroupOf | IsDepiction | IsInside |
000026e7ee790996 | freeform | /m/07j7r | 1 | 0.071905 | 0.145346 | 0.206591 | 0.391306 | 0 | 1 | 1 | 0 | 0 |
000026e7ee790996 | freeform | /m/07j7r | 1 | 0.439756 | 0.572466 | 0.264153 | 0.435122 | 0 | 1 | 1 | 0 | 0 |
000026e7ee790996 | freeform | /m/07j7r | 1 | 0.668455 | 1 | 0 | 0.552825 | 0 | 1 | 1 | 0 | 0 |
000062a39995e348 | freeform | /m/015p6 | 1 | 0.205719 | 0.849912 | 0.154144 | 1 | 0 | 0 | 0 | 0 | 0 |
000062a39995e348 | freeform | /m/05s2s | 1 | 0.137133 | 0.377634 | 0 | 0.884185 | 1 | 1 | 0 | 0 | 0 |
0000c64e1253d68f | freeform | /m/07yv9 | 1 | 0 | 0.97385 | 0 | 0.043342 | 0 | 1 | 1 | 0 | 0 |
0000c64e1253d68f | freeform | /m/0k4j | 1 | 0 | 0.513534 | 0.321356 | 0.689661 | 0 | 1 | 0 | 0 | 0 |
ImageID: the image this box belongs to. An ID can appear one or more times, once for each box on the image.
Source: how the box was made. ① freeform and xclick are boxes drawn manually. ② activemil is a box generated with an enhanced version of the method cited on the official site.
LabelName: the MID of the object class the box belongs to, as listed in class-descriptions-boxable.csv.
Confidence: a dummy value, always 1.
XMin, XMax, YMin, YMax: the coordinates of the box, in normalized image coordinates. XMin is in [0,1], where 0 is the leftmost pixel and 1 is the rightmost pixel in the image. The Y coordinate goes from the top pixel (0) to the bottom pixel (1).
The last five columns are attributes. For each of them, the value 1 indicates presence, 0 indicates absence, and -1 indicates unknown.
IsOccluded: the object is occluded by another object in the image.
IsTruncated: the object extends beyond the image boundary.
IsGroupOf: the box spans a group of objects (for example, a bed of flowers or a crowd of people). Annotators were asked to use this label for groups of more than 5 instances that heavily occlude each other and are physically touching.
IsDepiction: the object is a depiction (for example, a cartoon or drawing of the object rather than a real physical instance).
IsInside: the picture was taken from inside the object (for example, inside a car or inside a building).
- test-images.csv: lists the test image paths.
- validation-images.csv: lists the validation image paths.
image_name | image_url |
e0c995e9359596dd.jpg | https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/e0c995e9359596dd.jpg |
110487ec7e9be60a.jpg | https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/110487ec7e9be60a.jpg |
90596bf3313e72e3.jpg | https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/90596bf3313e72e3.jpg |
4b3c6afd44adbe59.jpg | https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/4b3c6afd44adbe59.jpg |
69248ebbbea5aa0c.jpg | https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/69248ebbbea5aa0c.jpg |
The official website also describes Image Labels, Visual relationships, and other annotations. Since I could not find the corresponding files, they are not covered here.
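To make the box columns described above concrete, here is a small sketch that reads one annotation row and scales its normalized coordinates to pixels. It assumes the bbox CSV starts with a header row naming the columns as in the table above; `to_pixel_box` and the 1024x768 image size are made up for illustration.

```python
import csv
import io

def to_pixel_box(row, width, height):
    """Scale normalized [0, 1] coordinates to pixels: (left, top, right, bottom)."""
    return (float(row["XMin"]) * width, float(row["YMin"]) * height,
            float(row["XMax"]) * width, float(row["YMax"]) * height)

# One row from the table above, in CSV form, preceded by the assumed header.
sample = io.StringIO(
    "ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,"
    "IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside\n"
    "000026e7ee790996,freeform,/m/07j7r,1,0.668455,1,0,0.552825,0,1,1,0,0\n"
)
for row in csv.DictReader(sample):
    # The real image size must be read from the image file; this one is invented.
    box = to_pixel_box(row, width=1024, height=768)
    print(row["ImageID"], row["LabelName"], box)
```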
As shown above, the dataset offered on the official website is packaged by training set, validation set, and test set. Sometimes we don't need that much data, or we can't store that much data, or our task only requires a few categories. In those cases we need to download by category. Taking object detection as an example, there are 600 categories with bounding boxes, and part of their name list is shown below. Thanks to Sunita Nayak of Learn OpenCV for providing this table. Space is limited, so only the top 100 are listed here. For the full list, please visit https://www.learnopencv.com/fast-image-downloader-for-open-images-v4/
class | train | validation | test |
Person | 1034721 | 13274 | 40861 |
Wheel | 340639 | 11394 | 34604 |
Car | 248075 | 9381 | 28737 |
Human hair | 234057 | 8594 | 26301 |
Clothing | 1438128 | 8527 | 26531 |
Human arm | 208982 | 8341 | 25162 |
Human head | 201633 | 7865 | 25080 |
Footwear | 744474 | 7189 | 21205 |
Human body | 175244 | 6769 | 20246 |
Man | 1418594 | 5654 | 17514 |
Human face | 1037710 | 5170 | 15536 |
Flower | 345296 | 5089 | 15040 |
Mammal | 156154 | 4349 | 13479 |
Human nose | 60142 | 4341 | 12718 |
Human eye | 77233 | 4304 | 13034 |
Tire | 122615 | 4181 | 13177 |
Human hand | 75307 | 4123 | 12505 |
Human leg | 71479 | 4093 | 13334 |
Sports equipment | 44900 | 3951 | 11992 |
Plant | 267913 | 3808 | 11579 |
Tree | 1051344 | 3209 | 10148 |
Auto part | 13586 | 2898 | 8845 |
Woman | 767337 | 2865 | 9047 |
Food | 88422 | 2736 | 8331 |
Land vehicle | 81108 | 2689 | 8480 |
Human mouth | 44197 | 2505 | 7424 |
Girl | 197155 | 2420 | 7479 |
Vehicle | 50959 | 2105 | 7064 |
Dog | 28675 | 1930 | 5818 |
Fruit | 26236 | 1905 | 6215 |
Window | 503467 | 1650 | 5091 |
Airplane | 21285 | 1027 | 3272 |
Fashion accessory | 91024 | 1026 | 3164 |
Baked goods | 23010 | 1020 | 2907 |
Building | 178634 | 984 | 2915 |
Bird | 47921 | 943 | 2751 |
Boat | 79113 | 903 | 2672 |
Human ear | 17774 | 870 | 2611 |
Bicycle wheel | 59521 | 733 | 2018 |
Table | 85691 | 714 | 2279 |
Snack | 37374 | 708 | 2173 |
Book | 41280 | 698 | 2147 |
Furniture | 38527 | 646 | 1893 |
Dessert | 27407 | 645 | 2092 |
Boy | 87555 | 600 | 2031 |
Dress | 52999 | 567 | 1581 |
Fish | 23195 | 564 | 1422 |
Vehicle registration plate | 7852 | 512 | 1570 |
Chair | 132483 | 511 | 1535 |
Vegetable | 18621 | 496 | 1679 |
Fast food | 24991 | 492 | 1599 |
Drink | 40323 | 482 | 1427 |
Helmet | 16502 | 440 | 1275 |
Toy | 70963 | 437 | 1205 |
Bicycle | 40161 | 403 | 1158 |
Jeans | 78473 | 396 | 1433 |
Horse | 13368 | 392 | 1144 |
Cat | 15183 | 381 | 1095 |
Bottle | 40188 | 340 | 979 |
Strawberry | 7944 | 326 | 774 |
Cake | 5784 | 326 | 878 |
Suit | 110848 | 321 | 857 |
Houseplant | 22834 | 319 | 825 |
Sports uniform | 19396 | 315 | 1135 |
Truck | 12135 | 311 | 969 |
Rose | 12053 | 309 | 899 |
Dairy | 8146 | 308 | 970 |
Flowerpot | 22760 | 302 | 659 |
Roller skates | 5476 | 295 | 723 |
Animal | 17442 | 290 | 882 |
Tableware | 41086 | 285 | 936 |
Bread | 3846 | 277 | 911 |
Ball | 6845 | 266 | 902 |
Glasses | 57946 | 262 | 890 |
Palm tree | 42026 | 253 | 620 |
Paddle | 6951 | 253 | 699 |
House | 136152 | 246 | 822 |
Seafood | 3063 | 226 | 689 |
Sculpture | 34533 | 221 | 653 |
Tomato | 6254 | 216 | 722 |
Salad | 3088 | 213 | 605 |
Insect | 8981 | 210 | 717 |
Hat | 13245 | 201 | 557 |
Carnivore | 3501 | 200 | 625 |
Human foot | 2237 | 199 | 467 |
Monkey | 3026 | 195 | 543 |
Wine | 15400 | 193 | 388 |
Shelf | 22899 | 191 | 563 |
Cabinetry | 9191 | 188 | 451 |
Aircraft | 1898 | 186 | 556 |
Drawer | 4414 | 184 | 448 |
Cookie | 4158 | 184 | 636 |
Sandal | 2938 | 181 | 393 |
Musical instrument | 16503 | 178 | 525 |
Orange | 6195 | 175 | 839 |
Juice | 2838 | 174 | 512 |
Motorcycle | 13382 | 173 | 530 |
Lemon | 1756 | 171 | 425 |
Cattle | 11603 | 170 | 450 |
Door | 19256 | 165 | 524 |
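With this table we can download only the categories we need, and we also know how many images each category has. For example, if we want to download Human head, a script can do this for us. Sunita Nayak has already written such a script, so we can use it directly.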
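If you want to check such numbers for a category yourself, the sketch below counts the distinct images per class in a bbox annotation file. It is only an illustration under assumptions: the CSV is assumed to have a header row with `ImageID` and `LabelName` columns, `images_per_class` is a hypothetical helper, and it is not confirmed that the table above was produced this way.

```python
import csv
import io
from collections import defaultdict

def images_per_class(bbox_csv):
    """Count distinct ImageIDs per LabelName in a bbox annotation file object."""
    images = defaultdict(set)
    for row in csv.DictReader(bbox_csv):
        images[row["LabelName"]].add(row["ImageID"])
    return {label: len(ids) for label, ids in images.items()}

# A few rows from the annotation table earlier in this article (trimmed columns).
sample = io.StringIO(
    "ImageID,Source,LabelName,Confidence\n"
    "000026e7ee790996,freeform,/m/07j7r,1\n"
    "000026e7ee790996,freeform,/m/07j7r,1\n"
    "000026e7ee790996,freeform,/m/07j7r,1\n"
    "000062a39995e348,freeform,/m/05s2s,1\n"
    "0000c64e1253d68f,freeform,/m/07yv9,1\n"
)
counts = images_per_class(sample)
print(counts)
```

Three boxes on the same image count as one image here; replace the set with a plain counter if you want box counts instead.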
Download
- Install the AWS Command Line Interface (CLI), a unified tool for managing AWS services
sudo pip3 install awscli
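- Download the four .csv files for boxes and class names (I recommend opening the links directly and downloading them with Xunlei; in my own tests this was much faster)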
wget https://storage.googleapis.com/openimages/2018_04/class-descriptions-boxable.csv
wget https://storage.googleapis.com/openimages/2018_04/train/train-annotations-bbox.csv
wget https://storage.googleapis.com/openimages/2018_04/validation/validation-annotations-bbox.csv
wget https://storage.googleapis.com/openimages/2018_04/test/test-annotations-bbox.csv
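- Put the four files in the same folder as the script and run the script. Class names that consist of two words are joined with an underscore. An error may occur at this point; if it does, first run pip3 install tqdm and then run the script again.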
python3 downloadOI.py --classes 'Cheese,Ice_cream,Cookie' --mode train
To get the script downloadOI.py, visit the author's GitHub. Thanks again to the author.
https://github.com/spmallick/learnopencv/blob/master/downloadOpenImages/downloadOI.py
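The underscore convention for `--classes` can be sketched as follows. This is an illustration, not the script's exact code: `resolve_classes` is a hypothetical helper, and the small dictionary stands in for the full mapping read from class-descriptions-boxable.csv.

```python
def resolve_classes(classes_arg, name_to_mid):
    """Split a --classes string and look up each MID; underscores stand for spaces."""
    resolved = []
    for token in classes_arg.split(","):
        token = token.strip()
        name = token.replace("_", " ")
        if name not in name_to_mid:
            raise KeyError("Unknown class name: %r" % name)
        resolved.append((token, name_to_mid[name]))
    return resolved

# Stand-in mapping built from rows shown earlier in this article.
name_to_mid = {
    "Tortoise": "/m/011k07",
    "Sea turtle": "/m/0120dh",
    "Football": "/m/01226z",
}
print(resolve_classes("Sea_turtle,Football", name_to_mid))
```

Names are matched case-sensitively in this sketch, so `sea_turtle` would raise a `KeyError`.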
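In addition, you can exclude certain types of images by explicitly setting the following optional arguments to 0 on the command line.
--occluded=0 excludes instances that are occluded.
--truncated=0 excludes instances that are truncated at the image boundary.
--groupOf=0 excludes instances that represent a group of objects. These usually contain 5 or more objects of the same class that touch or occlude each other, such as a bag of apples.
--depiction=0 excludes instances that are sketches or cartoons rather than pictures of real physical objects.
--inside=0 excludes instances photographed from inside the object, such as from inside a car.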
python3 downloadOI.py --classes 'Cheese,Ice_cream,Cookie' --mode train --groupOf=0 --inside=0
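The effect of these flags can be sketched as a simple predicate over the five attribute columns. This is a simplified illustration: `keep_annotation` and the dictionary layout are assumptions, and the mapping from flag names such as `--groupOf` to column names such as `IsGroupOf` is done by hand here.

```python
def keep_annotation(attrs, flags):
    """Return False if a flag set to 0 excludes an attribute marked present (1).

    attrs: attribute name -> '1', '0' or '-1' as read from the CSV
    flags: attribute name -> 1 (include, the default) or 0 (exclude)
    """
    for name, value in attrs.items():
        if flags.get(name, 1) == 0 and int(value) > 0:
            return False
    return True

# Equivalent of --groupOf=0 --inside=0
flags = {"IsGroupOf": 0, "IsInside": 0}
group_box = {"IsOccluded": "0", "IsTruncated": "1", "IsGroupOf": "1",
             "IsDepiction": "0", "IsInside": "0"}
single_box = {"IsOccluded": "1", "IsTruncated": "1", "IsGroupOf": "0",
              "IsDepiction": "0", "IsInside": "0"}
print(keep_annotation(group_box, flags), keep_annotation(single_box, flags))
```

With a `> 0` check like this one, rows whose attribute is unknown (-1) are kept rather than excluded.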
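Results
Wait... wait... wait. After a whole afternoon it had downloaded 7,170 of 49,308 files, so I estimate it needs another day and night. The downloaded files include the images of the requested classes and one .txt file per image, which holds the coordinates of all boxes of that class in the image. By default the script writes the .txt files in the format class, XMin, XMax, YMin, YMax, as the following code shows:
with open('%s/%s/%s.txt'%(run_mode,class_name,line_parts[0]),'a') as f:
    f.write(','.join([class_name, line_parts[4], line_parts[5], line_parts[6], line_parts[7] ])+'\n')
To match the YOLO label structure, <object-class> <x_center> <y_center> <width> <height>, this has to change. Another article by Sunita Nayak shows how; the following code needs some adjustment before it can be used. See the appendix.
with open('labels/%s.txt'%(lineParts[0]),'a') as f:
    f.write(' '.join([str(ind),str((float(lineParts[5]) + float(lineParts[4]))/2), str((float(lineParts[7]) + float(lineParts[6]))/2), str(float(lineParts[5])-float(lineParts[4])),str(float(lineParts[7])-float(lineParts[6]))])+'\n')
This is the content of one .txt file, in proper YOLO format: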
2 0.7209375 0.5066665 0.556875 0.608333
5 0.146875 0.7058335 0.115 0.401667
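After importing the results into the yolo_mask annotation tool, we can see that the downloaded annotations conform to the YOLO format. Success.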
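The conversion between the two label layouts is plain arithmetic on the normalized coordinates. The sketch below shows it in both directions; `to_yolo` and `from_yolo` are hypothetical helper names, and the numbers come from the first label line shown above.

```python
def to_yolo(xmin, xmax, ymin, ymax):
    """Convert corner coordinates to YOLO (x_center, y_center, width, height)."""
    return ((xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin)

def from_yolo(x_center, y_center, width, height):
    """Convert YOLO values back to (xmin, xmax, ymin, ymax)."""
    return (x_center - width / 2, x_center + width / 2,
            y_center - height / 2, y_center + height / 2)

# First label line above: class 2, followed by the four YOLO values.
yolo_box = (0.7209375, 0.5066665, 0.556875, 0.608333)
corners = from_yolo(*yolo_box)
print([round(v, 6) for v in corners])
print([round(v, 7) for v in to_yolo(*corners)])
```

All values stay normalized to [0, 1], so no image size is needed for this step.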
References
https://storage.googleapis.com/openimages/web/factsfigures.html
https://www.learnopencv.com/fast-image-downloader-for-open-images-v4/
https://www.learnopencv.com/training-yolov3-deep-learning-based-custom-object-detector/
https://blog.csdn.net/wulala789/article/details/80646618
Appendix (code quoted from the author, with modifications)
#Author : Sunita Nayak, Big Vision LLC
#### Usage example: python3 downloadOI.py --classes 'Ice_cream,Cookie' --mode train
import argparse
import csv
import subprocess
import os
from tqdm import tqdm
import multiprocessing
from multiprocessing import Pool as thread_pool

cpu_count = multiprocessing.cpu_count()

parser = argparse.ArgumentParser(description='Download Class specific images from OpenImagesV4')
parser.add_argument("--mode", help="Dataset category - train, validation or test", required=True)
parser.add_argument("--classes", help="Names of object classes to be downloaded", required=True)
parser.add_argument("--nthreads", help="Number of threads to use", required=False, type=int, default=cpu_count*2)
parser.add_argument("--occluded", help="Include occluded images", required=False, type=int, default=1)
parser.add_argument("--truncated", help="Include truncated images", required=False, type=int, default=1)
parser.add_argument("--groupOf", help="Include groupOf images", required=False, type=int, default=1)
parser.add_argument("--depiction", help="Include depiction images", required=False, type=int, default=1)
parser.add_argument("--inside", help="Include inside images", required=False, type=int, default=1)
args = parser.parse_args()

run_mode = args.mode
threads = args.nthreads

classes = []
for class_name in args.classes.split(','):
    classes.append(class_name)

# Map human-readable class names to MIDs, e.g. 'Tortoise' -> '/m/011k07'
with open('./class-descriptions-boxable.csv', mode='r') as infile:
    reader = csv.reader(infile)
    dict_list = {rows[1]:rows[0] for rows in reader}

# Start from empty output folders (this deletes results of earlier runs)
subprocess.run(['rm', '-rf', 'labels'])
subprocess.run([ 'mkdir', 'labels'])
subprocess.run(['rm', '-rf', 'JPEGImages'])
subprocess.run([ 'mkdir', 'JPEGImages'])

pool = thread_pool(threads)
commands = []
cnt = 0

for ind in range(0, len(classes)):
    class_name = classes[ind]
    print("Class "+str(ind) + " : " + class_name)
    # Images of each class are saved under JPEGImages/<class_name>/
    os.makedirs('JPEGImages/'+class_name, exist_ok=True)
    mid = dict_list[class_name.replace('_', ' ')]
    command = "grep "+ mid + " ./" + run_mode + "-annotations-bbox.csv"
    class_annotations = subprocess.run(command.split(), stdout=subprocess.PIPE).stdout.decode('utf-8')
    class_annotations = class_annotations.splitlines()
    for line in class_annotations:
        line_parts = line.split(',')
        # grep matches substrings, so keep only rows whose LabelName is exactly this MID
        if line_parts[2] != mid:
            continue
        #IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside
        if (args.occluded==0 and int(line_parts[8])>0):
            print("Skipped %s" % line_parts[0])
            continue
        if (args.truncated==0 and int(line_parts[9])>0):
            print("Skipped %s" % line_parts[0])
            continue
        if (args.groupOf==0 and int(line_parts[10])>0):
            print("Skipped %s" % line_parts[0])
            continue
        if (args.depiction==0 and int(line_parts[11])>0):
            print("Skipped %s" % line_parts[0])
            continue
        if (args.inside==0 and int(line_parts[12])>0):
            print("Skipped %s" % line_parts[0])
            continue
        cnt = cnt + 1
        command = 'aws s3 --no-sign-request --only-show-errors cp s3://open-images-dataset/'+run_mode+'/'+line_parts[0]+'.jpg '+ 'JPEGImages'+'/'+class_name+'/'+line_parts[0]+'.jpg'
        commands.append(command)
        # One YOLO line per box: <class index> <x_center> <y_center> <width> <height>
        with open('labels/%s.txt'%(line_parts[0]),'a') as f:
            f.write(' '.join([str(ind), str((float(line_parts[5]) + float(line_parts[4]))/2), str((float(line_parts[7]) + float(line_parts[6]))/2), str(float(line_parts[5])-float(line_parts[4])), str(float(line_parts[7])-float(line_parts[6]))])+'\n')

print("Annotation Count : "+str(cnt))
# An image with several boxes yields the same command several times; download it once
commands = list(set(commands))
print("Number of images to be downloaded : "+str(len(commands)))

list(tqdm(pool.imap(os.system, commands), total = len(commands) ))
pool.close()
pool.join()
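As a possible variation on the script above, the download command can be built as an argument list and run with `subprocess.run` instead of passing a string to `os.system`, which avoids shell quoting issues. This is only a sketch: `build_download_command` and `download` are hypothetical helpers, the bucket path and the two `aws` options are the ones already used in the script, and the image ID is borrowed from the annotation table purely for illustration.

```python
import subprocess

def build_download_command(mode, image_id, out_dir):
    """Build the aws s3 cp command for one image as an argument list."""
    src = "s3://open-images-dataset/%s/%s.jpg" % (mode, image_id)
    dst = "%s/%s.jpg" % (out_dir, image_id)
    return ["aws", "s3", "--no-sign-request", "--only-show-errors", "cp", src, dst]

def download(command):
    """Run one command and return its exit code (0 means success)."""
    return subprocess.run(command).returncode

cmd = build_download_command("train", "000026e7ee790996", "JPEGImages/Cookie")
print(" ".join(cmd))
# download(cmd) would start the transfer; it is not called here.
```

Lists are not hashable, so de-duplicating such commands with `set` would require converting them to tuples first.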