[BEV perception] 3. BEV open-source datasets

1 KITTI

KITTI


1.1 How to collect KITTI data?

The data are collected by vehicle-mounted sensors such as cameras and LiDAR.

Annotations are provided only for objects within the roughly 90-degree field of view of the front camera.

1.2 What is the scale of KITTI data?

A total of 14,999 images with their corresponding point clouds: 7,481 form the training set and 7,518 the test set.

1.3 What targets does KITTI mark?

The annotated categories are cars, pedestrians, and cyclists, with 80,256 labeled objects in total.

1.4 Transformation matrix

y = P_rect^(i) · R_rect^(0) · T_velo^cam · x

- x: a point in homogeneous LiDAR coordinates (x, y, z, 1).
- y: the corresponding pixel coordinates.
- T_velo^cam: the transformation from the LiDAR coordinate system to the camera coordinate system.
- R_rect^(0): the rectification (distortion-correction) rotation of the reference camera.
- P_rect^(i): the intrinsic projection matrix of camera i, which maps from the camera coordinate system to the pixel coordinate system.

Chaining these matrices maps point-cloud coordinates to pixel coordinates; with the inverse matrices and a known depth, pixels can be mapped back into the point cloud as well.
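The chained projection above can be sketched in a few lines of NumPy. The matrix values below are illustrative placeholders, not taken from a real KITTI calibration file; the axis-swap in T_velo^cam reflects the usual convention that the LiDAR's forward x-axis becomes the camera's z-axis.

```python
import numpy as np

# Illustrative calibration matrices (NOT from a real KITTI calib file).
P_rect = np.array([[721.5,   0.0, 609.6, 44.9],   # camera intrinsics, 3x4
                   [  0.0, 721.5, 172.9,  0.2],
                   [  0.0,   0.0,   1.0,  0.003]])
R_rect = np.eye(4)                                 # rectification rotation (identity here)
T_velo_cam = np.array([[0.0, -1.0,  0.0, 0.0],     # LiDAR -> camera axis swap:
                       [0.0,  0.0, -1.0, 0.0],     # cam x = -lidar y, cam y = -lidar z,
                       [1.0,  0.0,  0.0, 0.0],     # cam z =  lidar x
                       [0.0,  0.0,  0.0, 1.0]])

def project_lidar_to_image(pts_velo):
    """Apply y = P_rect @ R_rect @ T_velo_cam @ x to Nx3 LiDAR points."""
    n = pts_velo.shape[0]
    x = np.hstack([pts_velo, np.ones((n, 1))])   # homogeneous Nx4
    y = (P_rect @ R_rect @ T_velo_cam @ x.T).T   # Nx3, last column is depth
    return y[:, :2] / y[:, 2:3]                  # perspective divide -> pixel (u, v)

pts = np.array([[10.0, 0.0, -1.0]])  # one point 10 m ahead of the LiDAR
uv = project_lidar_to_image(pts)
print(uv)
```

Note the perspective divide at the end: the matrix product yields homogeneous coordinates, and dividing by the depth component produces the final pixel position.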

1.5 Label file

Each txt file corresponds to one scene (frame), numbered with six digits.

The content of the file is as follows; each line describes one object. Take the 000000.txt file and its corresponding image as an example:

Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01

| Parameter | Value | Type | Range | Notes |
| --- | --- | --- | --- | --- |
| Object category | Pedestrian | string | | e.g. Car, Pedestrian, Cyclist |
| Truncation | 0.00 | continuous | [0.0, 1.0] | Truncation means only part of the object appears in the image |
| Occlusion | 0 | integer | {0, 1, 2, 3} | 0 not occluded, 1 partially occluded, 2 severely occluded, 3 occlusion unknown |
| Observation angle (alpha) | -0.20 | radians | [-π, π] | The angle between the annotated object and the camera |
| 2D box | 712.40 143.00 810.73 307.92 | pixel coordinates | | The 2D bounding box; the first two values are the top-left corner, the last two the bottom-right corner |
| 3D box | 1.89 0.48 1.20 1.84 1.47 8.41 | metres | | The first three values are height h, width w, length l; the last three are the centre position in camera coordinates. Together with the rotation they form the 7 quantities of a 3D detection |
| rotation_y | 0.01 | radians | [-π, π] | Yaw of the object around the camera's Y axis. (A confidence score appears only as a 16th field in prediction files produced at test time) |

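A line of this format can be split into named fields with plain string handling. A minimal sketch, using the field order described in the table above (the helper name `parse_kitti_label` is made up for illustration):

```python
# Parse one line of a KITTI label file into named fields.
# Training labels have 15 values per line; a 16th "score" field
# appears only in prediction/result files.
def parse_kitti_label(line):
    v = line.split()
    return {
        "type": v[0],
        "truncated": float(v[1]),
        "occluded": int(v[2]),
        "alpha": float(v[3]),                       # observation angle, [-pi, pi]
        "bbox": [float(x) for x in v[4:8]],         # 2D box: left, top, right, bottom
        "dimensions": [float(x) for x in v[8:11]],  # height, width, length (m)
        "location": [float(x) for x in v[11:14]],   # centre x, y, z in camera coords (m)
        "rotation_y": float(v[14]),                 # yaw around the camera Y axis
    }

obj = parse_kitti_label(
    "Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 "
    "1.89 0.48 1.20 1.84 1.47 8.41 0.01")
print(obj["type"], obj["dimensions"])
```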

2 nuScenes

nuScenes

The vehicle carries 6 cameras, a roof-mounted LiDAR, and 5 millimetre-wave radars. The dataset provides image data, point-cloud data, and object annotations, together with the transformation matrices between sensor frames.

2.1 nuScenes Vs KITTI


2.2 Annotate files

| - nuScenes
| - | - maps: map data used for subsequent planning tasks; not used in object detection.
| - | - samples: the extracted keyframes, which are annotated.
| - | - sweeps: the remaining frames that were not extracted as keyframes; not annotated.
| - | - v1.0-*: JSON annotation files, where * is one of the train, val, test, mini folders.
| - | - | - attribute.json: describes the attributes of an instance.
| - | - | - calibrated_sensor.json: calibration data of each sensor (LiDAR/radar/camera) mounted on the vehicle; essentially a transformation matrix.
| - | - | - category.json: taxonomy of object categories.
| - | - | - ego_pose.json: the pose of the ego vehicle at a specific moment.
| - | - | - instance.json: an object instance.
| - | - | - log.json: information about the log from which the data was extracted.
| - | - | - map.json: map data stored as binary segmentation masks; used in planning rather than object detection.
| - | - | - sample.json: samples; indicates which frames are keyframes.
| - | - | - sample_annotation.json: 3D bounding boxes, i.e. object information in the keyframes.
| - | - | - sample_data.json: all sensor data, including frames other than keyframes.
| - | - | - scene.json: scene data.
| - | - | - sensor.json: the types of sensors.
| - | - | - visibility.json: the visibility of each instance.
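The JSON tables above link to each other through tokens: each record in sample_annotation.json carries a `sample_token` pointing at a keyframe in sample.json. A minimal sketch of that join, using made-up stand-in records that mimic the schema (in practice you would `json.load()` the files from the v1.0-* folder, or use the official nuscenes-devkit):

```python
# Stand-in records mimicking (a subset of) the nuScenes schema; the token
# values and numbers below are invented for illustration.
samples = [{"token": "sample_001", "timestamp": 1532402927647951}]
annotations = [
    {"token": "ann_001", "sample_token": "sample_001",
     "translation": [373.2, 1130.4, 0.8],   # box centre in global coords (m)
     "size": [0.6, 0.7, 1.7]},              # width, length, height (m)
    {"token": "ann_002", "sample_token": "sample_999",
     "translation": [0.0, 0.0, 0.0],
     "size": [1.0, 1.0, 1.0]},
]

def boxes_for_sample(sample_token):
    """Collect all 3D boxes annotated on one keyframe via the token join."""
    return [a for a in annotations if a["sample_token"] == sample_token]

boxes = boxes_for_sample(samples[0]["token"])
print(len(boxes))
```

The same token-join pattern connects the other tables as well, e.g. sample_data.json back to calibrated_sensor.json and ego_pose.json.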



Origin blog.csdn.net/guai7guai11/article/details/132090269