Summary of Keypoint Detection Open Source Datasets

Hand Pose Keypoint Detection Dataset

Dataset download link: http://u3v.cn/6d3lZV

The dataset is organized into sequences, each composed of frames. A frame consists of 4 color images, 4 sets of 2D joints projected into each image plane, 4 bounding boxes, 1 set of 3D points provided by the Leap Motion Controller, and 4 sets of 3D points reprojected into each camera coordinate frame.
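The per-frame layout above can be sketched as a small container type. This is only an illustration of the described structure; the field names, shapes, and the actual on-disk format are assumptions, not the dataset's real schema.

```python
from dataclasses import dataclass

NUM_CAMERAS = 4  # the dataset provides 4 synchronized color views per frame


@dataclass
class HandFrame:
    """One frame of a sequence, mirroring the description in the text.

    All names here are illustrative; the actual release may differ.
    """
    images: list          # 4 color images, one per camera
    joints_2d: list       # 4 sets of 2D joints, projected into each image plane
    bboxes: list          # 4 bounding boxes, one per view
    joints_3d_leap: list  # 1 set of 3D points from the Leap Motion Controller
    joints_3d_cam: list   # 4 sets of 3D points in each camera coordinate frame

    def is_consistent(self) -> bool:
        """Check that every per-camera field has one entry per view."""
        per_cam = (self.images, self.joints_2d, self.bboxes, self.joints_3d_cam)
        return all(len(f) == NUM_CAMERAS for f in per_cam)
```

A loader for this dataset would fill one `HandFrame` per frame of a sequence and could use `is_consistent()` as a sanity check after parsing.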

Animal Pose Dataset

Dataset download link: http://u3v.cn/6kDLfr

The dataset provides animal pose annotations for five categories (dog, cat, cow, horse, and sheep), with a total of more than 6,000 instances across 4,000+ images. The dataset also contains bounding box annotations for 7 other animal categories; see the paper for details.

A total of 20 keypoints are annotated: two eyes, the throat, the nose, the withers, two ear bases, the tail base, four elbows, four knees, and four paws.
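The 20-keypoint scheme above can be written out as a name list, which is handy when mapping annotation indices to body parts. The identifier strings below are illustrative assumptions; the official annotation files may use different names.

```python
# Illustrative name list for the 20 Animal Pose keypoints described above.
# The left/right and front/back naming convention is an assumption.
ANIMAL_KEYPOINTS = (
    ["left_eye", "right_eye", "throat", "nose", "withers",
     "left_ear_base", "right_ear_base", "tail_base"]
    + [f"{side}_{limb}_elbow" for side in ("left", "right") for limb in ("front", "back")]
    + [f"{side}_{limb}_knee" for side in ("left", "right") for limb in ("front", "back")]
    + [f"{side}_{limb}_paw" for side in ("left", "right") for limb in ("front", "back")]
)

# 8 head/torso points + 4 elbows + 4 knees + 4 paws = 20, matching the text.
assert len(ANIMAL_KEYPOINTS) == 20
```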

Movie Character Joint Keypoint Dataset

Dataset download link: http://u3v.cn/5tW5zx

This dataset contains 5,003 images automatically collected from popular Hollywood movies. The images were obtained by running a state-of-the-art person detector on every tenth frame of 30 movies. Detections with high confidence (approximately 20,000 candidates) were then sent to the crowdsourcing marketplace Amazon Mechanical Turk, where each image was annotated with 10 upper-body joints by five workers, paid $0.01 per annotation. The median of the five annotations is taken for each joint to be robust to outlier labels.

MPIIGaze Dataset

Dataset download link: http://u3v.cn/5BsiEe

The MPIIGaze dataset contains 213,659 images collected from 15 participants over three months of daily laptop use. In terms of appearance and lighting, the dataset is more variable than existing datasets.

Human Foot Keypoint Dataset

Dataset download link: http://u3v.cn/5IYvIV

Existing human pose datasets cover a limited set of body parts. The MPII dataset annotates the ankles, knees, hips, shoulders, elbows, wrists, neck, torso, and top of the head, while COCO also includes some facial keypoints. In both datasets, foot annotations are limited to the ankle locations. However, graphics applications such as avatar retargeting and 3D human shape reconstruction require foot keypoints such as the big toe and heel; without foot information, these methods suffer from artifacts such as candy-wrapper effects, floor penetration, and foot skating. To address this, a small subset of foot instances in the COCO dataset was labeled using the Clickworker platform: 14K annotations from the COCO training set and 545 from the validation set, with a total of 6 foot keypoints per person. The annotations target the 3D position of each keypoint rather than its surface location. For example, for exact toe locations the labelers mark the junction between the nail and the skin, and account for depth by labeling the center of the toe rather than its surface.
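The 6 foot keypoints extend COCO's standard 17 body keypoints to a 23-point skeleton. The foot names below follow the common big toe / small toe / heel per-foot convention; treating that as this release's exact naming is an assumption.

```python
# COCO's standard 17 body keypoints, in the usual annotation order.
COCO_BODY = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# The 6 foot keypoints described above (names assumed, not official).
FOOT = [
    "left_big_toe", "left_small_toe", "left_heel",
    "right_big_toe", "right_small_toe", "right_heel",
]

# Combined body+foot skeleton: 17 + 6 = 23 keypoints per person.
BODY_FOOT = COCO_BODY + FOOT
assert len(FOOT) == 6 and len(BODY_FOOT) == 23
```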

Crowd Pose Dataset

Dataset download link: http://u3v.cn/65x8MQ

Multi-person pose estimation is the basis of many computer vision tasks and has achieved significant progress in recent years. However, few methods have studied pose estimation in crowded scenes, which remains a challenging and unavoidable problem in many settings. Furthermore, current benchmarks cannot properly assess such cases. The accompanying paper proposes a novel, effective method for pose estimation in crowds, together with a new dataset designed to better evaluate such algorithms.

Origin blog.csdn.net/Extremevision/article/details/127206006