Training YOLOv7 on BDD100K: a 2D box detection model for autonomous driving environment perception

Dataset selection

There are many datasets related to autonomous driving; what is needed here is object detection. The commonly used ones are KITTI, nuScenes, BDD100K, COCO, etc. I chose nuScenes at first and found several problems after training: the amount of data per class is unbalanced, so some categories such as cars are easy to train while the important traffic-light class is too sparse to be usable. In addition, since YOLOv7's ground truth is a 2D box, the 3D boxes provided by that dataset must be converted, with a possible loss of accuracy. I therefore restricted the choice to datasets with annotated 2D boxes. BDD100K meets the requirements in data volume, label types, and data distribution, so it was finally selected.

BDD100K dataset introduction and download

Official website: https://bair.berkeley.edu/blog/2018/05/30/bdd/

Paper: https://arxiv.org/pdf/1805.04687.pdf

Dataset: https://bdd-data.berkeley.edu/

[Figure: sample BDD100K frames under different weather, scene, and time-of-day conditions]

Recently, the Berkeley AI Lab released the largest and most diverse open-source video dataset in the computer-vision field: the BDD100K dataset. It consists of 100,000 videos, each about 40 seconds long at 720p and 30 fps, for a total of more than 1,100 hours. Each video sequence also includes GPS location, IMU data, and timestamps; the GPS/IMU information recorded by a mobile phone shows rough driving trajectories. The videos were collected in different locations across the United States. As shown in the figure above, the dataset covers different weather conditions, including sunny, overcast, and rainy, as well as different times of day and night. The figure below compares BDD100K with the current mainstream datasets.
[Figure: comparison of BDD100K with other mainstream datasets]

A key frame is extracted at the 10th second of each video and annotated. The annotations are divided into the following levels: image tagging, road object bounding boxes, drivable areas, lane markings, and full-frame instance segmentation.

[Figure: the annotation levels of BDD100K]

This article focuses on the road object detection labels. BDD100K annotates bounding boxes for common objects on the road in the 100,000 keyframe images, which makes it possible to study the distribution of objects and their locations. The bar chart below shows the number of instances of each object class.

[Figure: number of instances per object class]

Dataset download:

Visit the BDD dataset website https://bdd-data.berkeley.edu/, register an account, click Download, and download the data you need. Here, choose the 100K images and the labels.

[Figure: the BDD100K download page]

Label format conversion

See the attachment for the code.

The label format of the BDD100K dataset is shown below; it is the JSON format generated by Scalabel. We use the COCO label format as an intermediate representation for the conversion, that is, first convert the BDD labels to the COCO format, and then convert the COCO format to the YOLO format.

[Figure: an example of a BDD100K Scalabel label file]
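For reference, a single image entry in the detection label file looks roughly like this (abridged, with illustrative values):

{
    "name": "b1c66a42-6f7d68ca.jpg",
    "attributes": {"weather": "overcast", "scene": "city street", "timeofday": "daytime"},
    "labels": [
        {
            "category": "traffic light",
            "attributes": {"occluded": false, "truncated": false, "trafficLightColor": "green"},
            "box2d": {"x1": 1000.7, "y1": 281.9, "x2": 1016.8, "y2": 312.2},
            "id": 0
        },
        {
            "category": "car",
            "attributes": {"occluded": false, "truncated": false, "trafficLightColor": "none"},
            "box2d": {"x1": 45.2, "y1": 254.6, "x2": 357.8, "y2": 487.9},
            "id": 1
        }
    ]
}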

BDD to COCO

The core code is as follows.

# Iterate over all labeled images and build the COCO "images" and "annotations" lists
for i in tqdm(labeled_images):
    counter += 1
    image = dict()
    image['file_name'] = i['name']
    image['height'] = 720        # all BDD100K frames are 1280 x 720
    image['width'] = 1280

    image['id'] = counter

    empty_image = True

    for label in i['labels']:
        annotation = dict()
        category = label['category']
        # split traffic lights into per-color classes, e.g. "tl_green", "tl_red"
        if category == "traffic light":
            color = label['attributes']['trafficLightColor']
            category = "tl_" + color
        if category in id_dict.keys():
            empty_image = False
            annotation["iscrowd"] = 0
            annotation["image_id"] = image['id']
            x1 = label['box2d']['x1']
            y1 = label['box2d']['y1']
            x2 = label['box2d']['x2']
            y2 = label['box2d']['y2']
            # COCO boxes are [x, y, width, height] with (x, y) the top-left corner
            annotation['bbox'] = [x1, y1, x2 - x1, y2 - y1]
            annotation['area'] = float((x2 - x1) * (y2 - y1))
            annotation['category_id'] = id_dict[category]
            annotation['ignore'] = 0
            annotation['id'] = label['id']
            # a rectangular polygon so the annotation also carries a segmentation field
            annotation['segmentation'] = [[x1, y1, x1, y2, x2, y2, x2, y1]]
            annotations.append(annotation)

    # skip images that contain no objects of the selected categories
    if empty_image:
        continue

    images.append(image)
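The loop above relies on some scaffolding that the attached script provides: loading the BDD JSON into labeled_images, a category-name-to-id mapping id_dict, and writing the collected lists out as a COCO JSON. A minimal sketch of that scaffolding, assuming 13 classes (the BDD detection classes with traffic lights split by color, matching the nc = 13 used later; file names are illustrative), is:

import json
from tqdm import tqdm

# Category names -> ids (an assumption; the real mapping comes from data/bdd100k.names).
names = ['person', 'rider', 'car', 'bus', 'truck', 'bike', 'motor',
         'tl_green', 'tl_red', 'tl_yellow', 'tl_none', 'traffic sign', 'train']
id_dict = {name: idx + 1 for idx, name in enumerate(names)}

# load the BDD100K detection labels
with open('bdd100k_labels_images_train.json') as f:
    labeled_images = json.load(f)

images, annotations = [], []
counter = 0

# ... the loop shown above fills `images` and `annotations` here ...

# assemble and save the COCO-format label file
coco = {
    'type': 'instances',
    'images': images,
    'annotations': annotations,
    'categories': [{'id': cid, 'name': name} for name, cid in id_dict.items()],
}
with open('bdd100k_labels_images_det_coco_train.json', 'w') as f:
    json.dump(coco, f)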

COCO to YOLO

Modify the config in coco2yolo.py and run it.

if __name__ == '__main__':

    config = {
        "datasets": "COCO",
        "img_path": "E:/CODE/deyi/dataset/bdd100k/iamges/100k/train",
        "label": "E:/CODE/deyi/dataset/bdd100k/labels_coco/bdd100k_labels_images_det_coco_train.json",
        "img_type": ".jpg",
        "manipast_path": "./",
        "output_path": "E:/CODE/deyi/dataset/bdd100k/labels_yolo/train",
        "cls_list": "data/bdd100k.names",
    }

    # config = {
    #     "datasets": "COCO",
    #     "img_path": "E:/CODE/deyi/dataset/bdd100k/iamges/100k/val",
    #     "label": "E:/CODE/deyi/dataset/bdd100k/labels_coco/bdd100k_labels_images_det_coco_val.json",
    #     "img_type": ".jpg",
    #     "manipast_path": "./",
    #     "output_path": "E:/CODE/deyi/dataset/bdd100k/labels_yolo/val",
    #     "cls_list": "data/bdd100k.names",
    # }

    main(config)
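For context, the core of the COCO-to-YOLO step (whatever script performs it) is turning each COCO [x, y, w, h] pixel box into a class index plus center/size coordinates normalized by the image dimensions, written one line per object into a .txt file named after the image. A minimal sketch of that conversion, not the attached coco2yolo.py itself, is:

def coco_box_to_yolo_line(category_id, bbox, img_w, img_h):
    """Convert a COCO [x, y, w, h] pixel box to a YOLO label line.

    YOLO format: "<class> <x_center> <y_center> <width> <height>",
    all coordinates normalized to [0, 1]. YOLO class indices start at 0,
    while the COCO category ids built above start at 1.
    """
    x, y, w, h = bbox
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return f"{category_id - 1} {x_center:.6f} {y_center:.6f} {w / img_w:.6f} {h / img_h:.6f}"


# e.g. a 100 x 50 px box with top-left corner (640, 360) in a 1280 x 720 image
print(coco_box_to_yolo_line(3, [640, 360, 100, 50], 1280, 720))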

The overall file structure and the scripts used are shown in the figures below.

[Figures: overall file structure and the conversion scripts used]

After the format conversion is complete, first use check_label_image.py to check that each image and its label are consistent, then use exportfiletxt.py to export the image paths of the training and validation sets to the corresponding txt files, and finally arrange the YOLO-format labels and the dataset images into the training folders in the layout YOLOv7 expects.
[Figure: label/image consistency check and path export]
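exportfiletxt.py is included in the attachment; conceptually it only has to walk the image folders and write one image path per line. A minimal sketch under that assumption (the paths and output file names are illustrative):

from pathlib import Path

def export_image_list(img_dir, out_txt, img_type='.jpg'):
    """Write the absolute path of every image in img_dir to out_txt, one per line."""
    img_dir = Path(img_dir)
    paths = sorted(p for p in img_dir.iterdir() if p.suffix == img_type)
    with open(out_txt, 'w') as f:
        f.writelines(str(p.resolve()) + '\n' for p in paths)

export_image_list('E:/CODE/deyi/dataset/bdd100k/images/100k/train', 'bdd100k_train.txt')
export_image_list('E:/CODE/deyi/dataset/bdd100k/images/100k/val', 'bdd100k_val.txt')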

YOLOv7 uses the same data storage layout as YOLOv5: put all images in an images folder, put all label .txt files in a labels folder at the same level as images, and index them with the txt files exported by the aforementioned exportfiletxt.py. The cache files in the figure below are generated during training; the remaining files and folder structure are shown in the figure.
[Figure: the images/labels folder structure and the index txt files]
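In plain text, the expected layout looks roughly like this (directory names assumed from the paths used above):

bdd100k/
├── images/
│   └── 100k/
│       ├── train/        # *.jpg training images
│       └── val/          # *.jpg validation images
├── labels/
│   └── 100k/
│       ├── train/        # one *.txt YOLO label per training image
│       └── val/
├── bdd100k_train.txt     # one image path per line (from exportfiletxt.py)
└── bdd100k_val.txt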

Refer to the ./data/coco.yaml file to write bdd100k.yaml.

[Figure: the bdd100k.yaml data config]
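A bdd100k.yaml modeled on coco.yaml might look like the following; the paths are illustrative, and the class names and their order are an assumption that must match the mapping used during label conversion:

# train and val image list files produced by exportfiletxt.py
train: E:/CODE/deyi/dataset/bdd100k/bdd100k_train.txt
val: E:/CODE/deyi/dataset/bdd100k/bdd100k_val.txt

# number of classes
nc: 13

# class names
names: ['person', 'rider', 'car', 'bus', 'truck', 'bike', 'motor',
        'tl_green', 'tl_red', 'tl_yellow', 'tl_none', 'traffic sign', 'train']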

Select the model config you want to use under the ./cfg/training folder and modify it, for example yolov7.yaml: change nc to 13. This is mainly to adjust the number of label categories nc; the model structure can be modified further if necessary.
[Figure: the modified cfg/training/yolov7.yaml]
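After the change, the parameter block at the top of cfg/training/yolov7.yaml looks like this:

# parameters
nc: 13  # number of classes (was 80 for COCO)
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple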

Finally, modify the parameters in train.py and run it, or use the corresponding command line.

    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='', help='initial weights path')
    parser.add_argument('--cfg', type=str, default='cfg/training/yolov7.yaml', help='model.yaml path')
    parser.add_argument('--data', type=str, default='data/bdd100k.yaml', help='data.yaml path')
    parser.add_argument('--hyp', type=str, default='data/hyp.scratch.p5.yaml', help='hyperparameters path')
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--batch-size', type=int, default=4, help='total batch size for all GPUs')
    parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
    parser.add_argument('--rect', action='store_true', help='rectangular training')
    parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
    parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
    parser.add_argument('--notest', action='store_true', help='only test final epoch')
    parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
    parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
    parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
    parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
    parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
    parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
    parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
    parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
    parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
    parser.add_argument('--workers', type=int, default=0, help='maximum number of dataloader workers')
    parser.add_argument('--project', default='runs/train', help='save to project/name')
    parser.add_argument('--entity', default=None, help='W&B entity')
    parser.add_argument('--name', default='bdd100k', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--quad', action='store_true', help='quad dataloader')
    parser.add_argument('--linear-lr', action='store_true', help='linear LR')
    parser.add_argument('--label-smoothing', type=float, default=0.0, help='Label smoothing epsilon')
    parser.add_argument('--upload_dataset', action='store_true', help='Upload dataset as W&B artifact table')
    parser.add_argument('--bbox_interval', type=int, default=-1, help='Set bounding-box image logging interval for W&B')
    parser.add_argument('--save_period', type=int, default=-1, help='Log model after every "save_period" epoch')
    parser.add_argument('--artifact_alias', type=str, default="latest", help='version of dataset artifact to be used')
    opt = parser.parse_args()

Alternatively, enter the following command on the command line.

python train.py --workers 0 --device 0 --batch-size 4 --data data/bdd100k.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
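If training is interrupted, the --resume flag defined above will pick up the most recent run:

python train.py --resume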

Reference links

The code for this article can be downloaded here: BDD100K dataset labels to COCO and then to YOLO program.

[1] YOLOv5 training on autonomous driving datasets and conversion to TensorRT

[2] Completely solving the problem of YOLOv5 training not finding labels

[3] Training YOLOv4/YOLOv5 on the nuScenes dataset, and converting nuScenes to COCO format

[4] https://github.com/williamhyin/yolov5s_bdd100k


Original post: blog.csdn.net/qq_37214693/article/details/126708738