3 BEV Open Source Datasets
1 KITTI
1.1 How to collect KITTI data?
The data is collected by vehicle-mounted sensors, including on-board cameras and a LiDAR.
Annotations are provided only within the camera's 90-degree front field of view.
1.2 What is the scale of KITTI data?
A total of 14,999 images with their corresponding point clouds: 7,481 form the training set and 7,518 the test set.
1.3 What targets does KITTI mark?
The target categories include cars, pedestrians, and cyclists, for a total of 80,256 labeled objects.
1.4 Transformation matrix
$$y = P_{rect}^{(i)} \, R_{rect}^{(0)} \, T_{velo}^{cam} \, x$$

- $x$: point cloud coordinates (x, y, z) in the LiDAR frame.
- $y$: pixel coordinates.
- $T_{velo}^{cam}$: transformation from the LiDAR coordinate system to the camera coordinate system.
- $R_{rect}^{(0)}$: rectification (distortion correction) matrix of the reference camera.
- $P_{rect}^{(i)}$: intrinsics of camera $i$, projecting from the camera coordinate system to the pixel coordinate system.
This chain of matrices enables bidirectional conversion between point cloud coordinates and pixel coordinates.
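A minimal sketch of the forward direction of this projection in Python (assuming NumPy and that the three matrices have already been read from the KITTI calibration files; the function name is illustrative):

```python
import numpy as np

def project_velo_to_image(x_velo, T_velo_cam, R_rect_0, P_rect_i):
    """Project LiDAR points (N, 3) into pixel coordinates (N, 2).

    Implements y = P_rect^(i) @ R_rect^(0) @ T_velo^cam @ x in
    homogeneous coordinates, followed by the perspective divide.
    """
    n = x_velo.shape[0]
    # Homogeneous LiDAR coordinates: (N, 4)
    x_h = np.hstack([x_velo, np.ones((n, 1))])
    # Pad the 3x3 rectification matrix to 4x4
    R = np.eye(4)
    R[:3, :3] = R_rect_0
    # (3,4) @ (4,4) @ (4,4) @ (4,N) -> (3,N)
    y_h = P_rect_i @ R @ T_velo_cam @ x_h.T
    # Perspective divide: (u, v) = (x/z, y/z)
    return (y_h[:2] / y_h[2]).T
```

With identity calibration matrices this reduces to plain pinhole projection, which makes the perspective divide easy to check.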
1.5 Label file
Each txt file corresponds to one frame, numbered with six digits.
Each line of the file describes one object.
Take the 000000.txt file and its corresponding image as an example:
Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01
| Field | Example value | Value type | Range | Remark |
|---|---|---|---|---|
| Target category | Pedestrian | string | | e.g. Car, Pedestrian, Cyclist |
| Truncation | 0.00 | continuous value | [0.0, 1.0] | Truncation means only part of the target appears in the image |
| Occlusion | 0 | integer, discrete value | {0, 1, 2, 3} | 0 no occlusion, 1 partial occlusion, 2 severe occlusion, 3 occlusion unknown |
| Observation angle (alpha) | -0.20 | radians | [-π, π] | Angle between the marked object and the camera viewing direction |
| 2D label | 712.40 143.00 810.73 307.92 | pixel coordinates | | 2D bounding box: the first two values are the upper-left corner, the last two the lower-right corner |
| 3D label | 1.89 0.48 1.20 1.84 1.47 8.41 | metres | | The first three values are height h, width w, length l; the last three are the center position (x, y, z) in camera coordinates |
| Rotation (rotation_y) | 0.01 | radians | [-π, π] | Yaw around the camera Y axis; together with the 3D label it forms the 7 quantities of a 3D detection. Prediction files append a 16th field holding the confidence score, which ground-truth label files do not contain |
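The table above maps directly onto field positions in a label line; a small parser sketch (the helper name and dict keys are illustrative, not part of any official toolkit):

```python
def parse_kitti_label(line):
    """Parse one line of a KITTI label file into a dict.

    Ground-truth files have 15 fields; prediction files append a
    16th field with the confidence score.
    """
    f = line.split()
    obj = {
        "type": f[0],
        "truncated": float(f[1]),
        "occluded": int(f[2]),
        "alpha": float(f[3]),                        # observation angle, [-pi, pi]
        "bbox": [float(v) for v in f[4:8]],          # 2D box: left, top, right, bottom
        "dimensions": [float(v) for v in f[8:11]],   # h, w, l in metres
        "location": [float(v) for v in f[11:14]],    # box center (x, y, z), camera frame
        "rotation_y": float(f[14]),                  # yaw around the camera Y axis
    }
    if len(f) == 16:                                 # predictions only
        obj["score"] = float(f[15])
    return obj
```

Running it on the 000000.txt example line recovers the values discussed in the table.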
2 nuScenes
nuScenes is collected with 6 cameras, a roof-mounted LiDAR, and 5 millimeter-wave radars. It provides image data, point cloud data, and target annotations together with the calibration (transformation) matrices.
2.1 nuScenes vs. KITTI
2.2 Annotation files
| - nuScenes
| - | - maps: used for downstream planning tasks; not used in object detection.
| - | - samples: extracted keyframes, annotated.
| - | - sweeps: the remaining frames not extracted as keyframes, unannotated.
| - | - v1.0-*: JSON annotation files; * is one of the train, val, test, mini folders.
| - | - | - attribute.json: describes the attributes of an instance.
| - | - | - calibrated_sensor.json: calibration data of a specific sensor (LiDAR/radar/camera) mounted on the vehicle; effectively a transformation matrix.
| - | - | - category.json: taxonomy of object categories.
| - | - | - ego_pose.json: the pose of the ego vehicle at a specific moment.
| - | - | - instance.json: an instance of an object.
| - | - | - log.json: information about the log from which the data was extracted.
| - | - | - map.json: map data stored as binary segmentation masks; used in planning, not commonly used in object detection.
| - | - | - sample.json: samples; indicates which frames are keyframes.
| - | - | - sample_annotation.json: 3D bounding boxes, i.e. target information in keyframes.
| - | - | - sample_data.json: all sensor data, including frames other than keyframes.
| - | - | - scene.json: scene data.
| - | - | - sensor.json: the kind of sensor.
| - | - | - visibility.json: the visibility of an instance.
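Each of these JSON files is a flat list of records keyed by a unique token, so loading one and indexing it by token is enough to cross-reference tables. A minimal sketch (the `load_table` helper is illustrative, not the nuscenes-devkit API; the real devkit wraps this pattern):

```python
import json
import os

def load_table(dataroot, version, table_name):
    """Load one nuScenes annotation table (a JSON list of records)
    and index it by its unique token for O(1) lookup."""
    path = os.path.join(dataroot, version, table_name + ".json")
    with open(path) as f:
        records = json.load(f)
    return {rec["token"]: rec for rec in records}
```

For example, `load_table(root, "v1.0-mini", "sample")` would return keyframe records that can be joined with `sample_annotation` records through their token references.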