Preparing a custom dataset for YOLOv3 (1): quickly and accurately download Open Images V4 images by object class (single or multiple classes), save the annotations in YOLOv3 format, and verify them with Yolo_mark (Python script included).

Foreword


        Open Images V4 is a dataset of about 9 million (9M) images that Google released in 2018. It is divided into a training set, a validation set, and a test set, and is annotated with image-level labels, object bounding boxes, and visual relationships. Image-level labels: nearly 20,000 categories are labeled, some by humans and some by machines. Object bounding boxes: about 1.91 million (1.91M) images contain about 15.44 million (15.44M) bounding boxes over 600 categories, which makes it the largest existing dataset with object location annotations. Its features include fine-grained classes (such as human head, face, beard, and hand), everyday objects (from musical instruments, computers, and books to kitchen utensils), and accurate boxes: professional annotators at Google drew 90% of the training-set boxes by hand, and the remaining 10% were generated semi-automatically. Those boxes were manually verified to have IoU > 0.7 with a perfect box on the object, and in practice they are accurate (average IoU ~ 0.82) [1]. This dataset can save days or even months of work on many computer vision tasks. For example, to build a detector for one or several object classes, we can download only the images of those classes with their annotations and start training. (This article focuses on downloading data for object detection.)

               


Preparation


  • Ubuntu
  • Disk space
  • Python
  • awscli

Structure


    The entire Open Images V4 dataset is about 560 GB, including the training, validation, and test sets. It consists of image archives (.zip) and annotation files (.csv).

                            

    Let's first look at the annotation files (.csv) and their data format:

    Class Names:

/m/011k07 Tortoise  
/m/011q46kg Container
/m/012074 Magpie  
/m/0120dh Sea turtle  
/m/01226z Football  
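
As a minimal sketch of how this file can be used, the snippet below loads it into a dictionary that maps a display name to its MID. It assumes the file on disk is a header-less, comma-separated file with two columns (MID, display name), which is how the script in the appendix reads it. The function name load_class_map is hypothetical.

```python
import csv
import io

def load_class_map(csv_file):
    """Return {display_name: MID} from rows of the form 'MID,DisplayName'."""
    return {row[1]: row[0] for row in csv.reader(csv_file) if len(row) >= 2}

# Small in-memory sample built from the rows shown above. In practice, pass
# open('class-descriptions-boxable.csv', newline='') instead.
sample = io.StringIO(
    "/m/011k07,Tortoise\n"
    "/m/0120dh,Sea turtle\n"
    "/m/01226z,Football\n"
)
class_map = load_class_map(sample)
print(class_map["Sea turtle"])
```

The lookup by display name is what later lets a class such as Sea_turtle on the command line be translated into the MID used in the box files.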

 

    Boxes:

        This type of file contains the columns shown in the following table.

ImageID Source LabelName Confidence XMin XMax YMin YMax IsOccluded IsTruncated IsGroupOf IsDepiction IsInside
000026e7ee790996 freeform /m/07j7r 1 0.071905 0.145346 0.206591 0.391306 0 1 1 0 0
000026e7ee790996 freeform /m/07j7r 1 0.439756 0.572466 0.264153 0.435122 0 1 1 0 0
000026e7ee790996 freeform /m/07j7r 1 0.668455 1 0 0.552825 0 1 1 0 0
000062a39995e348 freeform /m/015p6 1 0.205719 0.849912 0.154144 1 0 0 0 0 0
000062a39995e348 freeform /m/05s2s 1 0.137133 0.377634 0 0.884185 1 1 0 0 0
0000c64e1253d68f freeform /m/07yv9 1 0 0.97385 0 0.043342 0 1 1 0 0
0000c64e1253d68f freeform /m/0k4j 1 0 0.513534 0.321356 0.689661 0 1 0 0 0
  • ImageID: The image this box belongs to. An ID can appear one or more times, once for each box on that image.
  • Source: Indicates how the box was made: freeform and xclick are boxes drawn manually; activemil are boxes generated semi-automatically (using an enhanced version of the method referenced on the official website).
  • LabelName: The MID of the object class the box belongs to, as listed in class-descriptions-boxable.csv.
  • Confidence: A dummy value, always 1.
  • XMin, XMax, YMin, YMax: The coordinates of the box, in normalized image coordinates. XMin is in [0,1], where 0 is the leftmost pixel and 1 is the rightmost pixel in the image. The Y coordinates go from the top pixel (0) to the bottom pixel (1).
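
To make the normalized coordinates concrete, here is a small sketch that scales a box back to pixel coordinates. The function name box_to_pixels and the 1024x768 image size are made up for illustration, and multiplying by the image width and height is a common approximation rather than something the dataset documentation prescribes.

```python
def box_to_pixels(xmin, xmax, ymin, ymax, img_width, img_height):
    """Scale normalized [0, 1] box coordinates to integer pixel coordinates."""
    left = round(xmin * img_width)
    right = round(xmax * img_width)
    top = round(ymin * img_height)
    bottom = round(ymax * img_height)
    return left, top, right, bottom

# Hypothetical 1024x768 image with a box covering the lower middle part
print(box_to_pixels(0.25, 0.75, 0.5, 1.0, 1024, 768))  # (256, 384, 768, 768)
```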

   The last five columns are attributes. For each of them, the value 1 indicates presence, 0 indicates absence, and -1 indicates unknown.

  • IsOccluded: Indicates that the object is occluded by another object in the image.
  • IsTruncated: Indicates that the object extends beyond the image boundary.
  • IsGroupOf: Indicates that the box spans a group of objects (for example, a bed of flowers or a crowd of people). Annotators were asked to use this tag for cases with more than 5 instances that heavily occlude each other and are physically touching.
  • IsDepiction: Indicates that the object is a depiction (for example, a cartoon or drawing of the object, rather than a real physical instance).
  • IsInside: Indicates a picture taken from the inside of the object (for example, inside a car or inside a building).
  • test-images.csv: lists the paths of the test images.
  • validation-images.csv: lists the paths of the validation images, for example:
image_name image_url                  
e0c995e9359596dd.jpg https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/e0c995e9359596dd.jpg
110487ec7e9be60a.jpg https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/110487ec7e9be60a.jpg
90596bf3313e72e3.jpg https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/90596bf3313e72e3.jpg
4b3c6afd44adbe59.jpg https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/4b3c6afd44adbe59.jpg
69248ebbbea5aa0c.jpg https://requestor-proxy.figure-eight.com/figure_eight_datasets/open-images/validation/69248ebbbea5aa0c.jpg

        The official website also describes Image Labels, Visual relationships, and so on. Since I did not find the corresponding files, they are not covered here.

        As shown above, the dataset on the official website is packaged by training, validation, and test set. Sometimes we don't need that much data, can't store it all, or our task only requires a few classes, so we need to download by class. Taking object detection as an example, 600 classes have bounding boxes, and their names and counts are listed below. Thanks to SUNITA NAYAK, the author at Learn OpenCV, for providing this table. Space is limited, so only the top 100 are listed here; for the full list, see https://www.learnopencv.com/fast-image-downloader-for-open-images-v4/

class train validation test
Person 1034721 13274 40861
Wheel 340639 11394 34604
Car 248075 9381 28737
Human hair 234057 8594 26301
Clothing 1438128 8527 26531
Human arm 208982 8341 25162
Human head 201633 7865 25080
Footwear 744474 7189 21205
Human body 175244 6769 20246
Man 1418594 5654 17514
Human face 1037710 5170 15536
Flower 345296 5089 15040
Mammal 156154 4349 13479
Human nose 60142 4341 12718
Human eye 77233 4304 13034
Tire 122615 4181 13177
Human hand 75307 4123 12505
Human leg 71479 4093 13334
Sports equipment 44900 3951 11992
Plant 267913 3808 11579
Tree 1051344 3209 10148
Auto part 13586 2898 8845
Woman 767337 2865 9047
Food 88422 2736 8331
Land vehicle 81108 2689 8480
Human mouth 44197 2505 7424
Girl 197155 2420 7479
Vehicle 50959 2105 7064
Dog 28675 1930 5818
Fruit 26236 1905 6215
Window 503467 1650 5091
Airplane 21285 1027 3272
Fashion accessory 91024 1026 3164
Baked goods 23010 1020 2907
Building 178634 984 2915
Bird 47921 943 2751
Boat 79113 903 2672
Human ear 17774 870 2611
Bicycle wheel 59521 733 2018
Table 85691 714 2279
Snack 37374 708 2173
Book 41280 698 2147
Furniture 38527 646 1893
Dessert 27407 645 2092
Boy 87555 600 2031
Dress 52999 567 1581
Fish 23195 564 1422
Vehicle registration plate 7852 512 1570
Chair 132483 511 1535
Vegetable 18621 496 1679
Fast food 24991 492 1599
Drink 40323 482 1427
Helmet 16502 440 1275
Toy 70963 437 1205
Bicycle 40161 403 1158
Jeans 78473 396 1433
Horse 13368 392 1144
Cat 15183 381 1095
Bottle 40188 340 979
Strawberry 7944 326 774
Cake 5784 326 878
Suit 110848 321 857
Houseplant 22834 319 825
Sports uniform 19396 315 1135
Truck 12135 311 969
Rose 12053 309 899
Dairy 8146 308 970
Flowerpot 22760 302 659
Roller skates 5476 295 723
Animal 17442 290 882
Tableware 41086 285 936
Bread 3846 277 911
Ball 6845 266 902
Glasses 57946 262 890
Palm tree 42026 253 620
Paddle 6951 253 699
House 136152 246 822
Seafood 3063 226 689
Sculpture 34533 221 653
Tomato 6254 216 722
Salad 3088 213 605
Insect 8981 210 717
Hat 13245 201 557
Carnivore 3501 200 625
Human foot 2237 199 467
Monkey 3026 195 543
Wine 15400 193 388
Shelf 22899 191 563
Cabinetry 9191 188 451
Aircraft 1898 186 556
Drawer 4414 184 448
Cookie 4158 184 636
Sandal 2938 181 393
Musical instrument 16503 178 525
Orange 6195 175 839
Juice 2838 174 512
Motorcycle 13382 173 530
Lemon 1756 171 425
Cattle 11603 170 450
Door 19256 165 524

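        With this table, we can download only the classes we need, and we also know how many images each class has. For example, to download Human head, we could write a script for the process. SUNITA NAYAK has already implemented such a script, so we can use it directly.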
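
If you want to check such counts yourself, the following sketch counts boxes and distinct images per class in an annotation file. It assumes the *-annotations-bbox.csv file is comma-separated and starts with the header row shown earlier (ImageID, Source, LabelName, ...). The function name count_boxes is hypothetical, and the sample rows are taken from the excerpt above.

```python
import csv
import io
from collections import Counter

def count_boxes(bbox_csv, wanted_mids):
    """Return {MID: (box_count, image_count)} for the requested MIDs."""
    boxes = Counter()
    images = {mid: set() for mid in wanted_mids}
    for row in csv.DictReader(bbox_csv):
        mid = row["LabelName"]
        if mid in images:
            boxes[mid] += 1
            images[mid].add(row["ImageID"])
    return {mid: (boxes[mid], len(images[mid])) for mid in wanted_mids}

# In practice, pass open('train-annotations-bbox.csv', newline='') instead.
sample = io.StringIO(
    "ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,"
    "IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside\n"
    "000026e7ee790996,freeform,/m/07j7r,1,0.071905,0.145346,0.206591,0.391306,0,1,1,0,0\n"
    "000026e7ee790996,freeform,/m/07j7r,1,0.439756,0.572466,0.264153,0.435122,0,1,1,0,0\n"
    "000062a39995e348,freeform,/m/015p6,1,0.205719,0.849912,0.154144,1,0,0,0,0,0\n"
)
counts = count_boxes(sample, ["/m/07j7r", "/m/015p6"])
print(counts)
```

Reading the file row by row keeps memory use low, which matters because the training annotation file is large.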


Download


  • Install the AWS Command Line Interface (CLI), the unified tool for managing AWS services
sudo pip3 install awscli
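  • Download the four .csv files for boxes and class names (I recommend opening the links directly and using a download manager such as Xunlei; in my own tests this was much faster)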
wget https://storage.googleapis.com/openimages/2018_04/class-descriptions-boxable.csv
 
wget https://storage.googleapis.com/openimages/2018_04/train/train-annotations-bbox.csv
 
wget https://storage.googleapis.com/openimages/2018_04/validation/validation-annotations-bbox.csv
 
wget https://storage.googleapis.com/openimages/2018_04/test/test-annotations-bbox.csv
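
If wget is not available, the same four files can be fetched from Python. This is only a sketch using the URLs above and the standard library; the helper names missing_files and download_missing are hypothetical, and the download call is left commented out because the annotation files can be large.

```python
import os
import urllib.request

BASE = "https://storage.googleapis.com/openimages/2018_04"
FILES = [
    "class-descriptions-boxable.csv",
    "train/train-annotations-bbox.csv",
    "validation/validation-annotations-bbox.csv",
    "test/test-annotations-bbox.csv",
]

def missing_files(folder="."):
    """Return (url, local_path) pairs for files not yet present in folder."""
    pairs = []
    for rel in FILES:
        local = os.path.join(folder, os.path.basename(rel))
        if not os.path.exists(local):
            pairs.append((BASE + "/" + rel, local))
    return pairs

def download_missing(folder="."):
    """Download every file from FILES that is not already in folder."""
    for url, local in missing_files(folder):
        print("Downloading", url)
        urllib.request.urlretrieve(url, local)

# download_missing(".")  # uncomment to start the download
```

Skipping files that already exist makes it safe to rerun after an interrupted download of the other files, although a partially downloaded file would need to be deleted by hand first.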
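  • Put the four files in the same folder as the script and run the script. Class names made of two words are joined with an underscore. An error may occur here; if it does, first

       run pip3 install tqdm and then run the script again.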

python3 downloadOI.py --classes 'Cheese,Ice_cream,Cookie' --mode train

   To get the script downloadOI.py, visit the author's GitHub. Thanks again to the author.

https://github.com/spmallick/learnopencv/blob/master/downloadOpenImages/downloadOI.py

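     In addition, you can exclude certain types of images by explicitly setting the following optional arguments to 0 on the command line.

        --occluded=0      excludes instances that are occluded.

        --truncated=0     excludes instances that are truncated at the image boundary.

        --groupOf=0       excludes instances that represent a group of objects. These usually contain 5 or more objects of the same class that touch or occlude each other, such as a bag of apples.

        --depiction=0     excludes instances that are sketches or cartoons rather than pictures of real physical objects.

        --inside=0        excludes instances photographed from inside the object, such as inside a car.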

python3 downloadOI.py --classes 'Cheese,Ice_cream,Cookie' --mode train --groupOf=0 --inside=0
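
The effect of these flags on a single annotation row can be sketched as follows. This mirrors the checks in the script in the appendix, but the function name keep_box and its keyword arguments are hypothetical. It assumes a row split on commas, with the five attribute columns at positions 8 to 12.

```python
def keep_box(row, occluded=1, truncated=1, group_of=1, depiction=1, inside=1):
    """Return False if the row has an attribute set (> 0) whose flag is 0."""
    flags = [occluded, truncated, group_of, depiction, inside]
    attrs = [int(value) for value in row[8:13]]
    return not any(flag == 0 and attr > 0 for flag, attr in zip(flags, attrs))

row = ("000026e7ee790996,freeform,/m/07j7r,1,0.071905,0.145346,"
       "0.206591,0.391306,0,1,1,0,0").split(",")
print(keep_box(row))              # True: nothing is excluded by default
print(keep_box(row, group_of=0))  # False: this row has IsGroupOf=1
```

Because the test is "greater than 0", rows whose attribute is -1 (unknown) are kept even when the matching flag is set to 0.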

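Results


     Wait... wait... and wait. After a whole afternoon, 7170 files out of 49308 had been downloaded, so I estimate it will take another day and night. The downloaded files consist of the images of the requested classes and their corresponding txt files, which contain the coordinates of all boxes of those classes in each image. By default, the script writes the txt files in the format class, XMin, XMax, YMin, YMax, as the following code shows: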

 with open('%s/%s/%s.txt'%(run_mode,class_name,line_parts[0]),'a') as f:
     f.write(','.join([class_name, line_parts[4], line_parts[5], line_parts[6], line_parts[7] ])+'\n')

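 To match the YOLO label format, <object-class> <x_center> <y_center> <width> <height>, we need to make some changes. They are shown in another article by SUNITA NAYAK; the following code needs some adjustment before it can be used. See the appendix.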

with open('labels/%s.txt'%(lineParts[0]),'a') as f:
    f.write(' '.join([str(ind),str((float(lineParts[5]) + float(lineParts[4]))/2), str((float(lineParts[7]) + float(lineParts[6]))/2), str(float(lineParts[5])-float(lineParts[4])),str(float(lineParts[7])-float(lineParts[6]))])+'\n')
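
The conversion in the line above is easier to read as a small standalone function. This is a sketch of the same arithmetic (center = mean of min and max, size = max minus min); the function name to_yolo is hypothetical and the input values are made up.

```python
def to_yolo(class_index, xmin, xmax, ymin, ymax):
    """Convert normalized corner coordinates to a YOLO label line."""
    x_center = (xmin + xmax) / 2
    y_center = (ymin + ymax) / 2
    width = xmax - xmin
    height = ymax - ymin
    return "%d %s %s %s %s" % (class_index, x_center, y_center, width, height)

# A box covering the lower middle part of the image, class index 0
print(to_yolo(0, 0.25, 0.75, 0.5, 1.0))  # 0 0.5 0.75 0.5 0.5
```

Since the Open Images coordinates are already normalized to [0,1], no division by the image size is needed.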





 

                                       

        This is the content of one of the txt files, in proper YOLO format:

        2 0.7209375 0.5066665 0.556875 0.608333
        5 0.146875 0.7058335 0.115 0.401667

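        Loading the images and labels into the Yolo_mark annotation tool shows that the downloaded annotations conform to the YOLO format. Success.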
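
Besides the visual check, the label files can also be checked programmatically. The sketch below validates a single label line; the function name check_yolo_line, the tolerance, and the assumed total of 6 classes are illustrative choices, not part of YOLO or Yolo_mark.

```python
def check_yolo_line(line, num_classes):
    """Return True if line looks like '<class> <x_center> <y_center> <width> <height>'."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        cls = int(parts[0])
        x, y, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return False
    if not 0 <= cls < num_classes:
        return False
    # The box must lie inside the normalized image (small tolerance for rounding).
    eps = 1e-6
    return (w > 0 and h > 0
            and x - w / 2 >= -eps and x + w / 2 <= 1 + eps
            and y - h / 2 >= -eps and y + h / 2 <= 1 + eps)

# The two label lines shown above, assuming 6 classes were downloaded
print(check_yolo_line("2 0.7209375 0.5066665 0.556875 0.608333", 6))  # True
print(check_yolo_line("5 0.146875 0.7058335 0.115 0.401667", 6))      # True
print(check_yolo_line("2 0.9 0.5 0.4 0.2", 6))                        # False
```

Running this over every line of every file in the labels folder catches malformed lines before training starts.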

         


References


https://storage.googleapis.com/openimages/web/factsfigures.html

https://www.learnopencv.com/fast-image-downloader-for-open-images-v4/

https://www.learnopencv.com/training-yolov3-deep-learning-based-custom-object-detector/

https://blog.csdn.net/wulala789/article/details/80646618


Appendix (code quoted from the original author, with modifications)


#Author : Sunita Nayak, Big Vision LLC

#### Usage example: python3 downloadOI.py --classes 'Ice_cream,Cookie' --mode train

import argparse
import csv
import subprocess
import os
from tqdm import tqdm
import multiprocessing
from multiprocessing import Pool as thread_pool

cpu_count = multiprocessing.cpu_count()

parser = argparse.ArgumentParser(description='Download Class specific images from OpenImagesV4')
parser.add_argument("--mode", help="Dataset category - train, validation or test", required=True)
parser.add_argument("--classes", help="Names of object classes to be downloaded", required=True)
parser.add_argument("--nthreads", help="Number of threads to use", required=False, type=int, default=cpu_count*2)
parser.add_argument("--occluded", help="Include occluded images", required=False, type=int, default=1)
parser.add_argument("--truncated", help="Include truncated images", required=False, type=int, default=1)
parser.add_argument("--groupOf", help="Include groupOf images", required=False, type=int, default=1)
parser.add_argument("--depiction", help="Include depiction images", required=False, type=int, default=1)
parser.add_argument("--inside", help="Include inside images", required=False, type=int, default=1)

args = parser.parse_args()

run_mode = args.mode

threads = args.nthreads

classes = []
for class_name in args.classes.split(','):
    classes.append(class_name)

with open('./class-descriptions-boxable.csv', mode='r') as infile:
    reader = csv.reader(infile)
    dict_list = {rows[1]:rows[0] for rows in reader}

subprocess.run(['rm', '-rf', 'labels'])
subprocess.run([ 'mkdir', 'labels'])

subprocess.run(['rm', '-rf', 'JPEGImages'])
subprocess.run([ 'mkdir', 'JPEGImages'])

pool = thread_pool(threads)
commands = []
cnt = 0

for ind in range(0, len(classes)):
    
    class_name = classes[ind]
    print("Class "+str(ind) + " : " + class_name)
    
    # Create the per-class image folder that the download command below writes to
    os.makedirs(os.path.join('JPEGImages', class_name), exist_ok=True)

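    # Match the MID as a whole CSV field (",MID,") so it cannot match a longer MID with the same prefix
    command = "grep ,"+dict_list[class_name.replace('_', ' ')] + ", ./" + run_mode + "-annotations-bbox.csv"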
    class_annotations = subprocess.run(command.split(), stdout=subprocess.PIPE).stdout.decode('utf-8')
    class_annotations = class_annotations.splitlines()

    for line in class_annotations:

        line_parts = line.split(',')
        
        #IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside
        if (args.occluded==0 and int(line_parts[8])>0):
            print("Skipped %s" % line_parts[0])
            continue
        if (args.truncated==0 and int(line_parts[9])>0):
            print("Skipped %s" % line_parts[0])
            continue
        if (args.groupOf==0 and int(line_parts[10])>0):
            print("Skipped %s" % line_parts[0])
            continue
        if (args.depiction==0 and int(line_parts[11])>0):
            print("Skipped %s" % line_parts[0])
            continue
        if (args.inside==0 and int(line_parts[12])>0):
            print("Skipped %s" % line_parts[0])
            continue

        cnt = cnt + 1

        command = 'aws s3 --no-sign-request --only-show-errors cp s3://open-images-dataset/'+run_mode+'/'+line_parts[0]+'.jpg '+ 'JPEGImages'+'/'+class_name+'/'+line_parts[0]+'.jpg'
        commands.append(command)
        

        with open('labels/%s.txt'%(line_parts[0]),'a') as f:
            f.write(' '.join([str(ind), str((float(line_parts[5]) + float(line_parts[4]))/2), str((float(line_parts[7]) + float(line_parts[6]))/2), str(float(line_parts[5])-float(line_parts[4])), str(float(line_parts[7])-float(line_parts[6]))])+'\n')

print("Annotation Count : "+str(cnt))
commands = list(set(commands))
print("Number of images to be downloaded : "+str(len(commands)))

list(tqdm(pool.imap(os.system, commands), total = len(commands) ))

pool.close()
pool.join()	

 


Origin blog.csdn.net/sinat_35907936/article/details/88911770