Working with KITTI LiDAR data (data preprocessing + dataset production + training)

Table of contents

1. Introduction
2. Dataset Introduction
2.1 Collection Area
2.2 Acquisition Platform
3. LiDAR Data Location
4. Meaning of the LiDAR Data Labels
5. Data Preprocessing and Training
5.1 Configure OpenPCDet
5.2 Data Preprocessing
5.2.1 Dataset Directory Layout
5.2.2 Dataset Format Conversion
5.3 Training


1. Introduction

Work on LiDAR perception is inseparable from datasets. Annotating LiDAR data is expensive, so there are not many open-source datasets to choose from. Because so many sensors are involved, understanding how the KITTI dataset is organized makes it much easier to use. This article consolidates information about the dataset, including label meanings and a training walkthrough, and will be updated over time.

2. Dataset Introduction

The KITTI dataset was jointly created by the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago. It is a public dataset collected from real traffic scenes using a fully instrumented acquisition vehicle. It contains rich and diverse sensor data (stereo cameras, a 64-beam LiDAR, and a GPS/IMU integrated navigation and positioning system, which together cover most needs for image, point cloud, and localization data), a large number of ground-truth annotations (including 2D and 3D detection bounding boxes and tracking tracklets), and official development tools.

2.1 Collection Area

The data was collected in and around Karlsruhe, Germany. A schematic of the collection area is shown below:

[Figure: KITTI dataset collection area]

2.2 Acquisition Platform

A schematic of the acquisition platform is shown below:

[Figure: sensor layout of the KITTI acquisition platform]

As the figure shows, the mounting positions of the GPS/IMU and LiDAR are given relative to the camera coordinate system, which explains why the XYZ coordinates of objects in the annotation files described below are also expressed in the camera coordinate system.
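Concretely, a LiDAR point can be mapped into the rectified camera coordinate system using the calibration matrices stored in the calib files (covered in the next section). A minimal sketch of the idea (my own illustration, not from the KITTI devkit), assuming the standard calib keys R0_rect (3x3) and Tr_velo_to_cam (3x4):

import numpy as np

def lidar_to_camera(pts_lidar, Tr_velo_to_cam, R0_rect):
    # pts_lidar: (N, 3) points in the LiDAR frame
    # Tr_velo_to_cam: (3, 4) LiDAR-to-camera transform; R0_rect: (3, 3) rectification
    pts_h = np.hstack([pts_lidar, np.ones((pts_lidar.shape[0], 1))])  # homogeneous (N, 4)
    return (R0_rect @ (Tr_velo_to_cam @ pts_h.T)).T                   # (N, 3) in camera frame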

3. LiDAR Data Location

The point cloud data is stored as .bin files in the velodyne folder under the training and testing directories, and the label files are stored in the label_2 folder under the training directory (the testing directory has no labels). The calibration files are stored in the calib folder; they are needed because the LiDAR, cameras, and other sensors are related by coordinate transformations. If you only use the LiDAR data, the remaining folders may not be needed.
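A minimal loading sketch (not part of the original post): each point in a velodyne .bin file is four consecutive float32 values (x, y, z, intensity).

import numpy as np

# Each KITTI .bin file is a flat float32 array: x, y, z, reflectance per point
points = np.fromfile("training/velodyne/000000.bin", dtype=np.float32).reshape(-1, 4)
print(points.shape)  # (N, 4)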

Tips: if you are training on your own dataset in KITTI format, the point clouds are sometimes stored in pcd format. For code that converts pcd files to bin files, see the earlier blog post "Point cloud in pcd format to point cloud in bin format".
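If you don't have that post handy, here is a minimal sketch of the same idea (my own illustration, not the linked code), assuming Open3D is installed and the pcd file carries no intensity channel, so intensity is zero-padded:

import numpy as np
import open3d as o3d

def pcd_to_bin(pcd_path, bin_path):
    pcd = o3d.io.read_point_cloud(pcd_path)                    # reads x, y, z
    xyz = np.asarray(pcd.points, dtype=np.float32)             # (N, 3)
    intensity = np.zeros((xyz.shape[0], 1), dtype=np.float32)  # KITTI bins expect 4 channels
    np.hstack([xyz, intensity]).astype(np.float32).tofile(bin_path)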

4. Meaning of the LiDAR Data Labels

The label files are stored in the dataset_directory/training/label_2 folder, with one txt file per frame. Opening the first label file, each line describes one object, as in the example below.
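For reference, a label line from the first training frame (000000.txt) looks like this; treat the exact values as illustrative:

Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01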

Column 1 (string): the object category (type). There are 9 categories: Car, Van, Truck, Pedestrian, Person_sitting, Cyclist, Tram, Misc, and DontCare. The DontCare label marks regions that were not annotated, for example because the object is too far from the LiDAR. To prevent regions that contain real objects but were left unlabeled from being counted as false positives during evaluation (mainly when computing precision), the evaluation script automatically ignores predictions in DontCare regions.

Column 2 (floating point number): whether the object is truncated (truncated). The value ranges from 0 (not truncated) to 1 (truncated) and indicates the fraction of the object that lies outside the image boundary.

Column 3 (integer): whether the object is occluded (occluded). The integers 0, 1, 2, and 3 denote the degree of occlusion: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown.

Column 4 (radians): the observation angle (alpha) of the object, in the range -pi to pi (unit: rad). It describes the object's orientation relative to the ray from the camera origin to the object center: imagine rotating the object about the camera's y-axis until its center lies on the camera's z-axis; alpha is then the angle between the object's heading direction and the camera's x-axis, as shown in the figure.

Columns 5~8 (floating point numbers): the 2D bounding box (bbox) of the object. The four numbers are xmin, ymin, xmax, and ymax (unit: pixels), the coordinates of the top-left and bottom-right corners of the box.

Columns 9~11 (floating point numbers): the 3D dimensions of the object: height, width, and length (unit: meters).

Columns 12~14 (floating point numbers): the 3D location of the object: x, y, and z (unit: meters). Note that this point is given in the camera coordinate system, and in the KITTI convention it is the center of the bottom face of the 3D box.

Column 15 (radians): the spatial orientation of the 3D object (rotation_y), in the range -pi to pi (unit: rad). It is the global orientation angle of the object in the camera coordinate system: the angle between the object's heading direction and the x-axis of the camera coordinate system, as shown in the figure above.

Column 16 (floating point number): the detection confidence (score). On first reading you may notice the label files only have 15 columns: this column does not appear in the training labels. It is used only in detection result files submitted for evaluation, since a confidence value cannot be assigned during manual annotation.
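Putting the columns together, a minimal parsing sketch for one label line (my own illustration, not from the KITTI devkit):

def parse_kitti_label(line):
    f = line.split()
    return {
        "type": f[0],                                    # Car, Pedestrian, DontCare, ...
        "truncated": float(f[1]),                        # 0 (none) to 1 (fully truncated)
        "occluded": int(f[2]),                           # 0 visible .. 3 unknown
        "alpha": float(f[3]),                            # observation angle, rad
        "bbox": [float(v) for v in f[4:8]],              # xmin, ymin, xmax, ymax (pixels)
        "dimensions": [float(v) for v in f[8:11]],       # height, width, length (m)
        "location": [float(v) for v in f[11:14]],        # x, y, z in camera frame (m)
        "rotation_y": float(f[14]),                      # global orientation, rad
        "score": float(f[15]) if len(f) > 15 else None,  # only in result files
    }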

5. Data Preprocessing and Training

5.1 Configure OpenPCDet

This article uses the OpenPCDet framework as the example; it is an open-source library that integrates many 3D detection algorithms.

# Clone the OpenPCDet repository
git clone https://github.com/open-mmlab/OpenPCDet.git
cd OpenPCDet

# Install dependencies and build the library (run from the repo root)
pip install -r requirements.txt
python setup.py develop

5.2 Data Preprocessing

5.2.1 Dataset Directory Layout

Organize the dataset according to the following layout.

OpenPCDet
├── data
│   ├── kitti
│   │   │── ImageSets
│   │   │── training
│   │   │   ├──calib & velodyne & label_2 & image_2 & (optional: planes) & (optional: depth_2)
│   │   │── testing
│   │   │   ├──calib & velodyne & image_2
├── pcdet
├── tools

5.2.2 Dataset Format Conversion

The raw dataset cannot be used directly: OpenPCDet first requires it to be converted into its own info format, so run the following command from the repository root.

python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml
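If the conversion succeeds, data/kitti should now also contain (per the OpenPCDet documentation) kitti_infos_train.pkl, kitti_infos_val.pkl, kitti_infos_trainval.pkl, kitti_infos_test.pkl, kitti_dbinfos_train.pkl, and a gt_database folder used for ground-truth sampling augmentation.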

5.3 Training

First pick the model you want to train: the model configs live under tools/cfgs/kitti_models in the OpenPCDet folder.

# Run the training; replace xxx.yaml with your model's config file (run from the tools directory)
python train.py --cfg_file cfgs/kitti_models/xxx.yaml
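For example, to train PointPillars (the pointpillar.yaml config ships with OpenPCDet; --batch_size and --epochs are optional overrides accepted by train.py, so adjust them to your GPU memory):

cd tools
python train.py --cfg_file cfgs/kitti_models/pointpillar.yaml --batch_size 4 --epochs 80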


Original post: blog.csdn.net/weixin_52514564/article/details/129203689