Faster-ILOD / maskrcnn_benchmark: training your own VOC dataset, with a summary of problems

1. Train your own labeled VOC dataset

This post uses Faster-ILOD to train my own labeled dataset, which has already been converted to VOC format; the standard VOC2007 dataset trained successfully beforehand. For the installation of Faster-ILOD and maskrcnn_benchmark and the problems encountered there, see my earlier post.
The following files mainly need to be modified:

1.maskrcnn_benchmark/config/paths_catalog.py

Change the corresponding paths to point to your own dataset:

        "voc_2007_train": {
            "data_dir": "/data/taxi_data/VOC2007_all",
            "split": "train"
        },
        "voc_2007_val": {
            "data_dir": "/data/taxi_data/VOC2007_all",
            "split": "val"
        },
        "voc_2007_test": {
            "data_dir": "/data/taxi_data/VOC2007_all",
            "split": "test"
        },
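Before launching training, it can save time to verify that each registered split actually has its list file under `ImageSets/Main`. A minimal sketch (the `DATASETS` dict below mirrors the entries added to paths_catalog.py; adjust `data_dir` to your own path):

```python
import os

# Hypothetical mirror of the entries added to paths_catalog.py
DATASETS = {
    "voc_2007_train": {"data_dir": "/data/taxi_data/VOC2007_all", "split": "train"},
    "voc_2007_val":   {"data_dir": "/data/taxi_data/VOC2007_all", "split": "val"},
    "voc_2007_test":  {"data_dir": "/data/taxi_data/VOC2007_all", "split": "test"},
}

def missing_split_files(datasets):
    """Return the dataset names whose ImageSets/Main/<split>.txt is absent."""
    missing = []
    for name, attrs in datasets.items():
        txt = os.path.join(attrs["data_dir"], "ImageSets", "Main",
                           attrs["split"] + ".txt")
        if not os.path.isfile(txt):
            missing.append(name)
    return missing

print(missing_split_files(DATASETS))
```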

2.maskrcnn_benchmark/data/voc.py

Modify it to contain the categories in your own dataset. In my case the goal is to distinguish taxis from private cars.

    # CLASSES = ("__background__ ", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow",
    #            "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor")
    CLASSES = ("__background__ ", "Taxi","Other Car")
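maskrcnn_benchmark's `PascalVOCDataset` builds a name-to-index map directly from this tuple with `dict(zip(...))` (note that the trailing space in `"__background__ "` comes from the original code and is kept here), so the two new classes land at indices 1 and 2:

```python
CLASSES = ("__background__ ", "Taxi", "Other Car")

# Same construction PascalVOCDataset uses: class name -> contiguous label id
class_to_ind = dict(zip(CLASSES, range(len(CLASSES))))

print(class_to_ind)  # background stays at index 0
```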

3. Modify configs/e2e_faster_rcnn_R_50_C4_1x.yaml

Modify the number of classes and the class-name lists. I trained both categories at once in the first step:

    NUM_CLASSES: 3  # total classes, including background
    NAME_OLD_CLASSES: []
    # NAME_NEW_CLASSES: ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog",
    #                    "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
    NAME_NEW_CLASSES: ["Taxi", "Other Car"]
    NAME_EXCLUDED_CLASSES: []
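`NUM_CLASSES` must account for the background class plus all old and new classes; the invariant can be sketched as a quick sanity check (values copied from the yaml above):

```python
# Values from e2e_faster_rcnn_R_50_C4_1x.yaml
NUM_CLASSES = 3
NAME_OLD_CLASSES = []
NAME_NEW_CLASSES = ["Taxi", "Other Car"]

# The +1 accounts for the __background__ class
assert NUM_CLASSES == 1 + len(NAME_OLD_CLASSES) + len(NAME_NEW_CLASSES)
print("config consistent")
```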

4. Run

python tools/train_first_step.py --config-file="./configs/e2e_faster_rcnn_R_50_C4_1x.yaml"

2. Problems encountered

1. Missing per-class files such as Taxi_train.txt

Because incremental learning processes each class separately, the original VOC download contains per-class txt files (train, test, etc. for every class). But when converting a self-labeled dataset to VOC format, only train.txt, test.txt, trainval.txt and val.txt are produced; the per-class txt files are missing and have to be generated.
Looking at the per-class txt files in the original dataset, each row has two columns: the image name and a number whose value is 1, -1 or 0. Here -1 means the class is absent from that image, 1 means it is present, and 0 marks a difficult sample (my own interpretation; corrections welcome).
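The row format can be captured in a tiny parser (a sketch; note that the official files pad positive labels with an extra space, which matters in problem 2 below, so `split()` with no argument is used to collapse repeated whitespace):

```python
def parse_class_line(line):
    """Parse one line of a per-class file, e.g. car_train.txt, into (image_id, flag).

    flag:  1 -> class present, -1 -> absent, 0 -> difficult instance.
    split() with no argument ignores the extra padding space
    that the official VOC files put before positive labels.
    """
    parts = line.split()
    return parts[0], int(parts[1])

print(parse_class_line("000005 -1"))
print(parse_class_line("000007  1"))
```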
The per-class files are generated from the xml annotations together with train.txt, test.txt and val.txt; the code is below.
Note that I only write the names of images in which the class is present. This later caused an error when reading the data, so I changed the reading code (problem 2 below); alternatively, you can modify this code to keep the output consistent with the original dataset format.

    blocks = {}
    dir_path = "D:/achenf/data/jiaotong_data/1taxi_train_data/VOC/VOC2007/"
    # Read train.txt / test.txt / val.txt
    f = open("D:/achenf/data/jiaotong_data/1taxi_train_data/VOC/VOC2007/ImageSets/Main/train.txt", 'r')
    # Classes contained in the dataset
    classes = ['Taxi', 'Other Car']
    # Per-class file to write
    txt_path = dir_path + "by_classes/" + "Other Car_train.txt"
    fp_w = open(txt_path, 'w')
    for line in f.readlines():
        if not line:
            break
        n = line[:-1]
        xmlpath = dir_path + "Annotations/" + n + '.xml'
        fp = open(xmlpath)
        xmllines = fp.readlines()
        ind_start = []
        ind_end = []
        lines_id_start = xmllines[:]
        lines_id_end = xmllines[:]
        # Change to the class currently being processed: Other Car / Taxi
        classes1 = '    <name>Other Car</name>\n'
        # Collect the line indices of every "<object>" opening tag ...
        while "  <object>\n" in lines_id_start:
            a = lines_id_start.index("  <object>\n")
            ind_start.append(a)
            lines_id_start[a] = "delete"
        # ... and of every "</object>" closing tag
        while "  </object>\n" in lines_id_end:
            b = lines_id_end.index("  </object>\n")
            ind_end.append(b)
            lines_id_end[b] = "delete"
        # blocks[k] holds the k-th object block if its class is in `classes`
        # (assumes <name> is on the line right after <object>)
        for k in range(len(ind_start)):
            blocks[k] = []
            for j in range(len(classes)):
                if classes[j] in xmllines[ind_start[k] + 1]:
                    a = ind_start[k]
                    for o in range(ind_end[k] - ind_start[k] + 1):
                        blocks[k].append(xmllines[a + o])
                    break
        # If any object block belongs to the target class, record the image name
        if len(ind_start) > 0:
            flag = False
            for k in range(len(ind_start)):
                if classes1 in blocks[k]:
                    flag = True
            if flag:
                fp_w.write(n + " " + "1" + '\n')
        fp.close()
    fp_w.close()
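The line-matching above breaks if the XML indentation differs from mine. As an alternative sketch, `xml.etree.ElementTree` can parse the annotations structurally, and writing every image with 1/-1 keeps the output consistent with the original VOC format (function and argument names here are my own):

```python
import os
import xml.etree.ElementTree as ET

def write_class_split(ann_dir, ids, class_name, out_path):
    """Write '<id>  1' if class_name occurs in the annotation, '<id> -1' otherwise.

    The extra space before '1' mimics the padding in the official VOC files.
    """
    with open(out_path, "w") as out:
        for img_id in ids:
            tree = ET.parse(os.path.join(ann_dir, img_id + ".xml"))
            names = [obj.findtext("name") for obj in tree.iter("object")]
            flag = " 1" if class_name in names else "-1"
            out.write("%s %s\n" % (img_id, flag))
```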

This produces the following files:

2. IndexError: x[2] == '0' is out of range

When each line is split on a single space, a positive line from the original dataset (which has an extra padding space before the 1) decomposes into ['000000', '', '1'], while a line from my generated file decomposes into ['000000', '1'], so x[2] does not exist. Since my data has no difficult samples, I simply comment out those lines:

    for i in range(len(buff)):
        x = buff[i]
        x = x.split(' ')
        if x[1] == '-1':
            pass
        # elif x[2] == '0':  # include difficult level object
        #     if self.is_train:
        #         pass
        #     else:
        #         img_ids_per_category.append(x[0])
        #         self.ids.append(x[0])
        else:
            img_ids_per_category.append(x[0])
            self.ids.append(x[0])
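Instead of commenting the branch out, the loop can be made tolerant of both layouts by splitting on arbitrary whitespace. A sketch with the same logic as the original branch (difficult objects are skipped when training, kept when testing); the function name is my own:

```python
def collect_ids(lines, is_train=True):
    """Collect image ids from per-class lines in either the official
    3-token form ('000000', '', '1') or the generated 2-token form."""
    ids = []
    for line in lines:
        x = line.split()              # collapses repeated spaces
        if x[1] == '-1':              # class absent from this image
            continue
        if x[1] == '0' and is_train:  # difficult object: skip when training
            continue
        ids.append(x[0])
    return ids

print(collect_ids(["000001  1", "000002 -1", "000003 0"]))
```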

Also, my xml files have no difficult tag, so the line `difficult = int(obj.find("difficult").text) == 1` raises an error when it is reached. I simply set difficult to False:

    for obj in target.iter("object"):
        difficult = False
        # difficult = int(obj.find("difficult").text) == 1
        if not self.keep_difficult and difficult:
            continue
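If you want to keep difficult handling while tolerating a missing tag, `ElementTree.findtext` returns its default when the node is absent. A sketch (the helper name is my own):

```python
import xml.etree.ElementTree as ET

def is_difficult(obj):
    """Read <difficult> from an <object> node; default to 0 when the tag is missing."""
    return int(obj.findtext("difficult", default="0")) == 1

obj = ET.fromstring("<object><name>Taxi</name></object>")
print(is_difficult(obj))  # no tag -> False
```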

3.ValueError: invalid literal for int() with base 10: ‘0.0’

The failing line is `bndbox = tuple(map(lambda x: x - TO_REMOVE, list(map(int, box))))`: a string like '0.0' cannot be converted to int directly, so convert it to float first and then to int. The fixed line is `bndbox = tuple(map(lambda x: x - TO_REMOVE, list(map(int, map(float, box)))))`.
A new thing learned along the way: the map() function can batch-convert data types.
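The fix in miniature: `int('0.0')` raises ValueError, but going through float first works, and chained `map` calls apply the conversion to every coordinate:

```python
box = ["0.0", "12.0", "127.0", "255.0"]
TO_REMOVE = 1  # maskrcnn_benchmark shifts VOC's 1-based pixel coordinates

# str -> float -> int, then subtract TO_REMOVE, as in the corrected line
bndbox = tuple(map(lambda x: x - TO_REMOVE, map(int, map(float, box))))
print(bndbox)  # (-1, 11, 126, 254)
```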

You can refer to: Use map() in Python3 to convert data types in batches, such as str to float

4. Pytorch reports CUDA error: device-side assert triggered

Reference: https://blog.csdn.net/veritasalice/article/details/111917185
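In my understanding, this assert is commonly triggered by a label id falling outside [0, NUM_CLASSES-1], e.g. when NUM_CLASSES does not match the dataset's classes. A hedged sketch of a quick range check you could run on target labels before debugging on the GPU (the helper name is my own):

```python
def labels_in_range(labels, num_classes):
    """True if every label id is a valid class index for the model head."""
    return all(0 <= l < num_classes for l in labels)

print(labels_in_range([0, 1, 2], 3))
print(labels_in_range([0, 3], 3))  # 3 is out of range when NUM_CLASSES == 3
```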


Origin blog.csdn.net/chenfang0529/article/details/125085896