Summary of Autonomous Driving Datasets

1. nuScenes

Dataset link: nuScenes (https://www.nuscenes.org/)

The nuScenes dataset supports multiple tasks, including detection (2D/3D), tracking, prediction, lidar segmentation, panoptic tasks, planning and control, and so on.

The nuScenes dataset is a large-scale autonomous driving dataset with 3D object annotations, and it is also a benchmark for mainstream algorithm evaluation. Its characteristics are listed below (a short devkit usage sketch follows the list):

● Full sensor suite (1 lidar, 5 radars, 6 cameras, IMU, GPS)

● 1000 scenes of 20s

● 1,400,000 camera images

● 390,000 lidar scans

● Two different cities: Boston and Singapore

● left traffic vs right traffic

● Detailed map information

● 1.4M 3D bounding boxes manually annotated for 23 object classes
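As a minimal sketch of browsing nuScenes with the official nuscenes-devkit (assuming the devkit is installed via pip and the v1.0-mini split has been downloaded; the dataroot path below is a placeholder):

```python
from nuscenes.nuscenes import NuScenes

# Placeholder path to the extracted nuScenes mini split; adjust to your setup.
nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)

# Each scene is a ~20 s log; walk its keyframe samples (annotated at 2 Hz).
scene = nusc.scene[0]
sample_token = scene['first_sample_token']
while sample_token:
    sample = nusc.get('sample', sample_token)
    lidar = nusc.get('sample_data', sample['data']['LIDAR_TOP'])
    print(lidar['filename'], len(sample['anns']), '3D boxes')
    sample_token = sample['next']
```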

2. KITTI

Dataset official website: The KITTI Vision Benchmark Suite (cvlibs.net)

The KITTI dataset was jointly established by the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago in the USA. The dataset is used to evaluate the performance of computer vision technologies such as stereo vision, optical flow, visual odometry, 3D object detection and 3D tracking in vehicle environments. KITTI contains real image data collected from scenes such as urban areas, rural areas, and highways. There are up to 15 vehicles and 30 pedestrians in each image, as well as various degrees of occlusion and truncation. The entire dataset consists of 389 pairs of stereo images and optical flow maps, a 39.2 km visual odometry sequence, and images of over 200k 3D-labeled objects, sampled and synchronized at 10 Hz. Overall, the raw data is categorized as 'Road', 'City', 'Residential', 'Campus' and 'Person'. For 3D object detection, the labels are subdivided into car, van, truck, pedestrian, pedestrian (sitting), cyclist, tram and misc.

Because the amount of data is relatively small, much current algorithm validation is now done on nuScenes instead.
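For reference, each label file in the KITTI 3D object detection split is plain text with one object per line: type, truncation, occlusion, observation angle, 2D box, 3D dimensions, 3D location and yaw. A minimal parsing sketch, assuming the standard label_2 layout (the file path is a placeholder):

```python
def parse_kitti_label_line(line: str) -> dict:
    """Parse one object line from a KITTI label_2 file."""
    f = line.strip().split(' ')
    return {
        'type': f[0],                               # car, van, truck, pedestrian, ...
        'truncated': float(f[1]),                   # 0 (fully visible) .. 1 (truncated)
        'occluded': int(f[2]),                      # 0..3 occlusion level
        'alpha': float(f[3]),                       # observation angle [-pi, pi]
        'bbox_2d': [float(x) for x in f[4:8]],      # left, top, right, bottom (pixels)
        'dimensions': [float(x) for x in f[8:11]],  # height, width, length (m)
        'location': [float(x) for x in f[11:14]],   # x, y, z in camera coordinates (m)
        'rotation_y': float(f[14]),                 # yaw around the camera Y axis
    }

with open('training/label_2/000000.txt') as fh:  # placeholder path
    objects = [parse_kitti_label_line(l) for l in fh if l.strip()]
```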

3. Waymo

  • Year: 2020;

  • Author: Waymo LLC and Google LLC

  • Number of scenes: 1150 scenes in total, mainly collected from San Francisco, Mountain View, Phoenix, etc.;

  • Number of categories: 4 categories in total, namely Vehicles, Pedestrians, Cyclists and Signs;

  • Whether 360° collection: yes;

  • Total data: 2030 segments in total, each segment is 20 seconds long;

  • Total number of annotations: about 12,600,000 3D annotation boxes;

  • Sensor model: 1 mid-range LiDAR, 4 short-range LiDARs, and 5 cameras (front and side); the LiDARs and cameras are synchronized and calibrated;

  • Dataset link: https://waymo.com/open/ ;

Introduction: Waymo is one of the most important datasets in the field of autonomous driving. It is large in scale and is mainly used to support research on autonomous driving perception. Waymo consists of two datasets, the Perception Dataset and the Motion Dataset. The Perception Dataset includes 3D annotations, 2D panoptic segmentation annotations, keypoint annotations, 3D semantic segmentation annotations, etc. The Motion Dataset is mainly used for research on interactive tasks. It contains a total of 103,354 20 s clips, with annotations for different objects and the corresponding 3D map data.
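The Perception Dataset is distributed as TFRecord files of serialized Frame protos. A minimal reading sketch, assuming TensorFlow and the waymo-open-dataset package are installed (the segment filename is a placeholder):

```python
import tensorflow as tf
from waymo_open_dataset import dataset_pb2 as open_dataset

# One ~20 s segment = one TFRecord of serialized Frame protos (placeholder filename).
dataset = tf.data.TFRecordDataset('segment-xxxx_with_camera_labels.tfrecord',
                                  compression_type='')

for data in dataset.take(1):
    frame = open_dataset.Frame()
    frame.ParseFromString(bytearray(data.numpy()))
    print(frame.context.name, frame.timestamp_micros)
    print('3D laser labels in this frame:', len(frame.laser_labels))
```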

4. BDD100K

The BDD100K dataset was released in May 2018 by the Berkeley AI Research lab (BAIR) at UC Berkeley, together with an image annotation system. The BDD100K dataset contains 100,000 HD video clips, each about 40 seconds long at 720p and 30 fps. A keyframe is sampled at the 10th second of each video, yielding 100,000 images with a resolution of 1280x720, which are then annotated (a label-parsing sketch is given after the dataset link below). The dataset contains images from different weather conditions, scenes, and times of day, and is characterized by its large scale and diversity.

Main tasks: video, drivable area, lane lines, semantic segmentation, instance segmentation, panoptic segmentation, MOT, detection, pose estimation, etc.;

Dataset link: Berkeley DeepDrive
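The BDD100K annotations are shipped as JSON. A minimal parsing sketch for the 2D detection labels, assuming the commonly documented layout (one entry per image with a labels list of category/box2d records; field names should be checked against the release you download, and the path is a placeholder):

```python
import json

# Placeholder path to a BDD100K detection label file.
with open('bdd100k/labels/det_20/det_train.json') as fh:
    images = json.load(fh)

for img in images[:5]:
    print(img['name'], img.get('attributes', {}).get('weather'))
    for obj in img.get('labels', []):
        box = obj.get('box2d')
        if box:  # some label types carry no 2D box
            print(' ', obj['category'], box['x1'], box['y1'], box['x2'], box['y2'])
```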


5. Lyft L5 Dataset

  • Year: 2019;

  • Author: Woven Planet Holdings;

  • Number of scenes: a total of 1805 scenes, outdoor;

  • Number of categories: 9 categories in total, including Car, Pedestrian, traffic lights, etc.;

  • Whether 360° collection: yes;

  • Total amount of data: 46,000 images and the corresponding point cloud data;

  • Total number of labels: approximately 1,300,000 3D bounding boxes;

  • Sensor model: including 2 LiDARs, 40 lines and 64 lines respectively, installed on the roof and bumper, with a resolution of 0.2°, collecting about 216,000 points at 10Hz. In addition, it also includes 6 360° cameras and 1 telephoto camera, and the acquisition frequency of the camera is consistent with that of LiDAR.

  • Dataset link: https://level-5.global/data/ ;

Introduction: Lyft L5 is a complete set of L5 autonomous driving datasets, which is said to be "the industry's largest autonomous driving public dataset", covering Prediction Dataset and Perception Dataset. Among them, the Prediction Dataset covers various targets encountered by the autonomous driving test team along the Palo Alto route, such as Cars, Cyclists and Pedestrians. The Perception Dataset covers the real data collected by the LiDARs and cameras of the self-driving fleet, and manually labels a large number of 3D bounding boxes.

6. H3D data set

  • Year: 2019;

  • Author: Honda Research Institute;

  • Number of scenes: a total of 160 scenes, outdoor;

  • Number of categories: 8 categories in total;

  • Whether 360° collection: No;

  • Amount of data: 27,000 images and their corresponding point cloud data;

  • Total number of annotations: about 1,100,000 3D bounding boxes;

  • Sensor model: equipped with a total of 3 Grasshopper 3 cameras, all with a resolution of 1920x1200; the rear-facing camera has an FOV of 80°, while the other 2 cameras have an FOV of 90°. A 64-line Velodyne HDL64E S2 LiDAR is used, plus one ADMA-G GNSS+IMU;

Dataset link: http://usa.honda-ri.com/H3D ;

Introduction: Honda Research Institute released its autonomous driving dataset H3D in March 2019. The dataset includes 3D multi-object detection and tracking data collected with a 3D LiDAR scanner, and contains 160 congested and highly interactive traffic scenes with over 1 million labeled instances in 27,721 frames.

Main tasks include: 3D multi-object detection and tracking.

7. ApolloScape dataset

  • Year: 2019;

  • Author: Baidu Research;

  • Number of scenes: a total of 103 scenes, outdoor;

  • Number of categories: 26 in total, including small vehicles, big vehicles, pedestrian, motorcyclist, etc.;

  • Whether 360° collection: No;

  • Amount of data: 143,906 images and their corresponding point cloud data;

  • Total number of labels: The total number of labels is unknown;

  • Sensor model: 2 VUX-1HA laser scanners, 6 VMX-CS6 cameras (including two front cameras with a resolution of 3384x2710), and an IMU/GNSS device; the laser scanners use two laser beams to scan the surrounding environment and, compared with the commonly used Velodyne HDL64E, can acquire a denser point cloud with higher accuracy (5 mm/3 mm);

Dataset link: http://apolloscape.auto/index.html ;

Introduction: ApolloScape consists of RGB videos and corresponding dense point clouds. It contains more than 140K images, each with pixel-level semantic annotations. The data was collected in China, so compared with some foreign datasets, ApolloScape contains more complex traffic scenes with a larger number of targets of each class; like KITTI, it is also divided into Easy, Moderate, and Hard subsets.

Key tasks include: lane lines, localization, trajectory prediction, detection, tracking, stereo, scene recognition, etc.;


8. Argoverse dataset

  • Year: 2019;

  • Author: Argo AI, etc.;

  • Number of scenes: 113 scenes in total, outdoor, collected in Pittsburgh, Pennsylvania and Miami, Florida in the USA;

  • Number of categories: 15 categories in total, including Vehicle, Pedestrian, Stroller, Animal, etc.;

  • Whether 360° collection: yes;

  • Total data: 44,000 images and the corresponding point cloud data;

  • Total number of labels: not specified; the annotated segments contain 11,052 tracked objects in total (see the introduction below);

  • Sensor model: Similar to KITTI and nuScenes, the Argoverse dataset is equipped with two 32-line LiDAR sensors, model VLP-32. It also includes 7 high-resolution surround-view cameras with a resolution of 1920x1200 and 2 front-facing cameras with a resolution of 2056x2464;

Dataset link: https://www.argoverse.org/ ;

Main tasks: 3D tracking, motion forecasting, and related tasks.

Introduction: The data in Argoverse comes from a subset of the areas where Argo AI's self-driving test vehicles operate in Miami and Pittsburgh, two U.S. cities with different urban driving challenges and local driving habits. It includes recordings of sensor data, or "log segments", across different seasons, weather conditions and times of day to provide a wide range of real-world driving scenarios. It contains 3D tracking annotations for a total of 113 scenes; each segment is 15-30 seconds long, and together they contain 11,052 tracked objects. Among them, 70% of the annotated objects are vehicles, and the rest are pedestrians, bicycles, motorcycles, etc. In addition, Argoverse contains high-definition map data, mainly 290 kilometers of lane maps in Pittsburgh and Miami, with information such as location, connectivity, traffic signals and elevation.


9. Argoverse 2 dataset

Argoverse 2 is a collection of open-source autonomous driving data and high-definition (HD) maps from six U.S. cities: Austin, Detroit, Miami, Pittsburgh, Palo Alto, and Washington, DC. This release builds on the debut of Argoverse ("Argoverse 1"), one of the first data releases to include high-definition maps for machine learning and computer vision research.

Argoverse 2 includes four open source datasets:

Argoverse 2 Sensor Dataset: Contains 1000 3D annotated scenes with lidar, stereo and ring camera images. This dataset improves upon the Argoverse 1 3D tracking dataset;

Argoverse 2 Motion Prediction Dataset: Contains 250,000 scenes with trajectory data for many object types. This dataset improves upon the Argoverse 1 motion prediction dataset;

Argoverse 2 lidar dataset: contains 20,000 unlabeled lidar sequences;

Argoverse 2 Map Changes Dataset: Contains 1000 scenes, 200 of which describe real-world HD map changes!

The Argoverse 2 datasets share a common HD map format that is richer than the HD maps in Argoverse 1. The Argoverse 2 datasets also share a common API that allows users to easily access and visualize data and maps.

10. Occ3D

Produced by Tsinghua University and NVIDIA, the first large-scale occupancy grid benchmark!

Dataset link: Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving (tsinghua-mars-lab.github.io)

The authors generated two 3D occupancy prediction datasets, Occ3D-nuScenes and Occ3D-Waymo. Occ3D-nuScenes contains 600 scenes for training, 150 scenes for validation and 150 scenes for testing, totaling 40,000 frames. It has 16 public classes and an additional Generic Object (GO) class. Each sample covers a range of [-40m, -40m, -1m, 40m, 40m, 5.4m] with a voxel size of [0.4m, 0.4m, 0.4m]. Occ3D-Waymo contains 798 sequences for training and 202 for validation, accumulating 200,000 frames. It has 14 known object classes and an additional GO class. Each sample covers a range of [-80m, -80m, -1m, 80m, 80m, 5.4m] with a very fine voxel size of [0.05m, 0.05m, 0.05m].
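From the ranges and voxel sizes quoted above, the occupancy grid resolution follows directly; a small sketch of that arithmetic (the numbers are taken from the paragraph above, not additional facts):

```python
def occ_grid_shape(pc_range, voxel_size):
    """Voxels per axis = (max - min) / voxel_size for x, y, z."""
    return [round((pc_range[i + 3] - pc_range[i]) / voxel_size[i]) for i in range(3)]

# Occ3D-nuScenes: [-40, -40, -1, 40, 40, 5.4] m at 0.4 m voxels -> [200, 200, 16]
print(occ_grid_shape([-40, -40, -1, 40, 40, 5.4], [0.4, 0.4, 0.4]))

# Occ3D-Waymo: [-80, -80, -1, 80, 80, 5.4] m at 0.05 m voxels -> [3200, 3200, 128]
print(occ_grid_shape([-80, -80, -1, 80, 80, 5.4], [0.05, 0.05, 0.05]))
```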


11. nuPlan

nuPlan is the world's first large-scale planning benchmark for autonomous driving. Although there is a growing body of ML-based motion planners, the lack of established datasets, simulation frameworks, and metrics has limited progress in the field. Existing benchmarks for motion prediction in autonomous driving (Argoverse, Lyft, Waymo) focus on short-term motion prediction of other agents rather than long-term planning for the ego vehicle. This has led previous work to use L2-based metrics for open-loop evaluation, which are not suitable for fairly evaluating long-term planning. This benchmark overcomes these limitations by providing a training framework for developing machine-learning-based planners, a lightweight closed-loop simulator, metrics specific to motion planning, and an interactive tool for visualizing results.

It provides a large-scale dataset containing 1200 hours of human driving data from 4 cities in the US and Asia (Boston, Pittsburgh, Las Vegas, and Singapore). The dataset is automatically labeled using a state-of-the-art offline perception system. In contrast to existing datasets of this size, not only the 3D boxes of detected objects are released, but also 10% of the raw sensor data (120 h).

Dataset link: nuPlan (nuscenes.org)


12. ONCE (One Million Scenes)

● Publisher: Huawei

● Release date: 2021

● Introduction: ONCE (One millioN sCenEs) is a 3D object detection dataset for autonomous driving scenarios. The ONCE dataset consists of 1 million LiDAR scenes and 7 million corresponding camera images. The data was drawn from 144 driving hours, 20 times longer than other available 3D autonomous driving datasets such as nuScenes and Waymo, and was collected over a range of different regions, time periods and weather conditions. It consists of: 1 million LiDAR frames and 7 million camera images; a 200 km² driving area and 144 driving hours; 15k fully annotated scenes with 5 categories (cars, buses, trucks, pedestrians, cyclists); and diverse environments (day/night, sunny/rainy, urban/suburban).

● Download address: https://opendatalab.org.cn/ONCE

● Paper address: https://arxiv.org/pdf/2106.1103

13. Cityscapes

● Publisher: TU Darmstadt · Max Planck Institute for Informatics

● Release time: 2016

● Introduction: Cityscapes is a large-scale database focused on semantic understanding of urban street scenes. It provides semantic, instance and dense pixel annotations for 30 classes divided into 8 categories (flat, human, vehicle, construction, object, nature, sky and void). The dataset consists of approximately 5000 finely annotated images and 20000 coarsely annotated images. The data was captured in 50 cities over several months, during daytime and in good weather conditions. It was originally recorded as video, so the frames were manually selected to have the following features: a large number of dynamic objects, varying scene layouts and varying backgrounds. A minimal loading sketch follows the links below.

● Download address: https://opendatalab.org.cn/CityScapes

● Paper address: https://arxiv.org/pdf/1604.0168
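As referenced in the introduction above, a minimal loading sketch using torchvision's built-in Cityscapes wrapper (assuming the fine annotations have been extracted under a local root directory; the path is a placeholder):

```python
from torchvision.datasets import Cityscapes

# Expects leftImg8bit/ and gtFine/ under this root (placeholder path).
dataset = Cityscapes(root='./data/cityscapes', split='train',
                     mode='fine', target_type='semantic')

image, semantic_mask = dataset[0]  # PIL image and per-pixel class-ID mask
print(len(dataset), image.size, semantic_mask.size)
```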

14. YouTube Driving Dataset

● Publisher: Chinese University of Hong Kong · University of California

● Release time: 2022

● Introduction: First-person-view driving videos were collected from YouTube: 134 videos with a total length of more than 120 hours. The videos cover different driving scenarios with various weather conditions (sunny, rainy, snowy, etc.) and regions (rural and urban areas). One frame is sampled every second, resulting in a dataset of 1.3 million frames. The YouTube driving dataset is divided into a training set with 70% of the data and a test set with 30%, and ACO training is performed on the training set.

● Download address: https://opendatalab.org.cn/YouTube_Driving_Dataset

● Paper address: https://arxiv.org/pdf/2204.02393.pdf

15. A2D2

● Publisher: Audi

● Release time: 2020

● Introduction: Audi released the Audi Autonomous Driving Dataset (A2D2) to support startups and academic researchers working on autonomous driving. Equipping vehicles with multimodal sensor suites, recording large datasets, and labeling them is time-consuming and laborious. The A2D2 dataset removes this high barrier to entry and allows researchers and developers to focus on developing new technologies. The dataset provides 2D semantic segmentation, 3D point clouds, 3D bounding boxes and vehicle bus data.

● Download address: https://opendatalab.org.cn/A2D2

● Paper address: https://arxiv.org/pdf/2004.0632

16. Cam2BEV

● Publisher: RWTH Aachen University

● Release time: 2020

This dataset contains two synthetic, semantically segmented subsets of road scene images that were created for the method described in the paper "A Sim2Real Deep Learning Approach for the Transformation of Images from Multiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird's Eye View". The dataset can be used with the official code implementation of the Cam2BEV method, available on GitHub.

Dataset link: Cam2BEV-OpenDataLab

17. SemanticKITTI

● Publisher: University of Bonn

● Release time: 2019

This is a large-scale dataset based on the KITTI Vision Benchmark, using all sequences provided by the odometry task. Dense annotations are provided for each individual scan of sequences 00-10, which enables the use of multiple sequential scans for semantic scene interpretation, such as semantic segmentation and semantic scene completion. The remaining sequences, namely sequences 11-21, are used as the test set and feature a large number of challenging traffic situations and environment types. Labels for the test set are not provided; an evaluation service is used to score submissions and provide test set results. A minimal reading sketch for one scan and its labels follows the links below.

● Download link: https://opendatalab.org.cn/SemanticKITTI

● Paper address: https://arxiv.org/pdf/1904.0141
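As referenced above, a minimal sketch for reading one SemanticKITTI scan and its per-point labels; the binary layout (float32 x/y/z/remission points, uint32 labels with the semantic class in the lower 16 bits and the instance id in the upper 16 bits) follows the dataset's published format, and the paths are placeholders:

```python
import numpy as np

# Placeholder paths to one scan of sequence 00.
scan = np.fromfile('sequences/00/velodyne/000000.bin', dtype=np.float32).reshape(-1, 4)
label = np.fromfile('sequences/00/labels/000000.label', dtype=np.uint32)

semantic_id = label & 0xFFFF  # lower 16 bits: semantic class
instance_id = label >> 16     # upper 16 bits: instance id

print(scan.shape, np.unique(semantic_id))
```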

18. OpenLane

● Publisher: Shanghai Artificial Intelligence Laboratory · Shanghai Jiao Tong University · SenseTime Research Institute

● Release time: 2022

OpenLane is the first real-world 3D lane dataset and the largest to date. The dataset builds on the public perception dataset Waymo Open Dataset and provides lane and Closest In-Path Object (CIPO) annotations for 1000 road segments. In short, OpenLane has 200K frames and over 880K carefully annotated lanes. The OpenLane dataset is publicly available to help the research community advance 3D perception and autonomous driving technologies.

● Download link: https://opendatalab.org.cn/OpenLane

● Paper address: https://arxiv.org/pdf/2203.11089.pdf

19. OpenLane-V2

● Publisher: Shanghai Artificial Intelligence Laboratory

● Release date: 2023

The world's first road structure perception and reasoning benchmark for autonomous driving. The first task of the dataset is scene structure perception and reasoning, which requires the model to be able to recognize the drivable status of lanes in the surrounding environment. The tasks of this dataset include not only lane centerline and traffic element detection, but also topological relationship recognition of detected objects.

● Download address: https://opendatalab.org.cn/OpenLane-V2


Origin blog.csdn.net/weixin_38346042/article/details/132262368