[Datasets Commonly Used in Image Processing Algorithms] Part 2

Compared with the datasets listed in the previous part of this series ([Datasets Commonly Used in Image Processing Algorithms] Part 1), the datasets collected here are more focused on training, and they are generally larger.

Table of contents

BSDS500

REDS

DIV2K

SIDD

Urban 100

Kodak

SUN

COCO


BSDS500

Dataset introduction:

The BSDS500 (Berkeley Segmentation Data Set 500) was released by the Berkeley Computer Vision Group. It contains 500 natural images of varied content, including cities, buildings, landscapes, and nature, each accompanied by multiple human-drawn segmentations; it was originally built as a benchmark for contour detection and image segmentation. Because the images are real-world photographs with high complexity and authenticity, the dataset is also widely used in academia and industry to train and evaluate image restoration algorithms such as denoising, deblurring, and super-resolution, typically by applying synthetic degradations (noise, blur, downscaling) to the clean images.
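When BSDS500 is used for denoising, training pairs are usually made by corrupting the clean photographs with synthetic noise. A minimal NumPy sketch, using a synthetic gradient as a stand-in for a real BSDS image (the function name and sigma value are illustrative, not part of the dataset release):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_noisy_pair(clean, sigma=25.0, rng=rng):
    """Return (noisy, clean): noisy = clean + N(0, sigma^2), clipped to [0, 255]."""
    noise = rng.normal(0.0, sigma, size=clean.shape)
    noisy = np.clip(clean.astype(np.float64) + noise, 0.0, 255.0)
    return noisy, clean

# Stand-in for a BSDS500 photograph: a smooth 64x64 8-bit-range gradient.
clean = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))
noisy, target = make_noisy_pair(clean, sigma=25.0)
```

A denoiser is then trained to map `noisy` back to `target`; the same recipe with a blur kernel or downscaling yields deblurring or super-resolution pairs.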

Dataset source

Link:

UC Berkeley Computer Vision Group - Contour Detection and Image Segmentation - Resources

Producer: Berkeley Computer Vision Group, UC Berkeley

Original article: "A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics" by David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik, Proc. ICCV, 2001

REDS

Dataset introduction:

The REDS (REalistic and Dynamic Scenes) dataset was built for the NTIRE 2019 challenges on video deblurring and super-resolution. It contains 300 video sequences of 100 frames each, for 30,000 frames in total (240 sequences for training, 30 for validation, and 30 for testing). The footage covers varied real-world scenes such as streets, buildings, and crowds, recorded with a hand-held camera, so the frames have high complexity and authenticity.

Each video sequence in REDS is provided in several versions: the sharp high-resolution ground-truth frames plus degraded counterparts (blurred frames and bicubically downscaled low-resolution frames). This organization supports training and evaluating both deblurring and super-resolution algorithms under matched conditions.
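A training loader typically pairs each degraded frame with the ground-truth frame of the same sequence and index. The folder names below (`train_sharp`, `train_blur`) follow a layout commonly seen in distributed copies of REDS but are assumptions here; adjust them to match your download (the sketch creates empty placeholder files so it is self-contained):

```python
import tempfile
from pathlib import Path

# Assumed layout:
#   root/train_sharp/<seq>/<frame>.png   ground-truth frames
#   root/train_blur/<seq>/<frame>.png    degraded counterparts
root = Path(tempfile.mkdtemp())
for seq in ("000", "001"):
    for frame in ("00000000.png", "00000001.png"):
        for split in ("train_sharp", "train_blur"):
            p = root / split / seq / frame
            p.parent.mkdir(parents=True, exist_ok=True)
            p.touch()  # placeholder standing in for a real PNG

def paired_frames(root):
    """Return (degraded, ground_truth) path pairs matched by sequence and frame name."""
    pairs = []
    for blur in sorted((root / "train_blur").rglob("*.png")):
        sharp = root / "train_sharp" / blur.relative_to(root / "train_blur")
        if sharp.exists():
            pairs.append((blur, sharp))
    return pairs

pairs = paired_frames(root)
```

The same matching logic applies to the low-resolution split: only the folder name changes.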

Dataset source

Link:

NTIRE2019: New Trends in Image Restoration and Enhancement workshop and challenges on image and video restoration and enhancement

Producer: organizers of the NTIRE 2019 workshop and challenges (held at CVPR 2019)

Original article: "NTIRE 2019 Challenge on Video Deblurring and Super-Resolution: Dataset and Study" by Seungjun Nah, Radu Timofte, Kyoung Mu Lee, et al., CVPR Workshops, 2019

DIV2K

This dataset is used for training and evaluating super-resolution algorithms, and contains 1,000 high-resolution images.

Dataset introduction:

The DIV2K dataset is commonly used for training and evaluating super-resolution algorithms. It contains 1,000 high-resolution (2K) images of diverse scenes, including cities, buildings, landscapes, and nature. The images come from the real world, with high complexity and authenticity. Of these, 800 images are used for training, 100 for validation, and 100 for testing.

The DIV2K release also provides matching low-resolution images generated with known degradations (e.g., bicubic downscaling at ×2, ×3, and ×4), which makes it easy to train and evaluate super-resolution algorithms at different scale factors.
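The official DIV2K low-resolution images are produced by bicubic downscaling. As a dependency-free stand-in, the sketch below uses ×2 average pooling in NumPy, which conveys the HR-to-LR idea even though it is not the exact bicubic kernel (a real pipeline would use a bicubic resize, e.g. from Pillow or OpenCV):

```python
import numpy as np

def downsample_x2(img):
    """Naive x2 downsample by 2x2 average pooling (stand-in for bicubic).

    img: H x W array with even H and W.
    """
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

hr = np.arange(16.0).reshape(4, 4)  # toy "high-resolution" image
lr = downsample_x2(hr)              # each LR pixel averages a 2x2 HR block
```

Chaining the function twice gives a ×4 input, mirroring how multi-scale DIV2K tracks are organized.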

Dataset source

Link: DIV2K Dataset

Production institution: Computer Vision Lab, ETH Zurich, in connection with the NTIRE workshop series

Original article: "NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study" by Eirikur Agustsson and Radu Timofte, CVPR Workshops, 2017

SIDD

Dataset introduction:

The SIDD (Smartphone Image Denoising Dataset) is commonly used for training and evaluating image denoising algorithms. It contains roughly 30,000 noisy images of 10 scenes, captured under varying lighting conditions with five representative smartphone cameras, together with carefully estimated ground-truth clean images. Because the noise comes from real sensors rather than a synthetic model, the dataset has high authenticity.

Because each scene is captured at several ISO settings and illumination levels, the dataset covers a range of real noise strengths, making it convenient to evaluate how denoising algorithms behave as noise conditions change (in contrast to benchmarks built on synthetic Gaussian or salt-and-pepper noise).

Dataset source:

Link:

SIDD: Smartphone Image Denoising Dataset

Production institution: York University, in collaboration with Microsoft Research

Original article: "A High-Quality Denoising Dataset for Smartphone Cameras" by Abdelrahman Abdelhamed, Stephen Lin, and Michael S. Brown, CVPR 2018


Urban 100

Dataset introduction:

The Urban100 dataset is an image super-resolution benchmark focused on urban environments. It contains 100 high-resolution images of buildings, streets, and other man-made structures taken from the real world; the abundance of repeated edges and self-similar patterns makes it a demanding test of how well super-resolution algorithms recover fine structure. Low-resolution counterparts are generated by bicubic downscaling (commonly ×2 and ×4), and the set is normally used for evaluation rather than training.
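Results on Urban100 are usually reported as PSNR between the reconstruction and the ground-truth high-resolution image. A self-contained NumPy implementation of the metric (the constant 8×8 arrays are synthetic stand-ins for real reconstruction output):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB, the standard benchmark metric."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.full((8, 8), 128.0)
rec = ref + 1.0            # uniform error of 1 gray level -> MSE = 1
value = psnr(ref, rec)
```

Averaging `psnr` over all 100 images gives the single number typically quoted for an algorithm on this benchmark.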

Dataset source:

Link: Urban100 dataset (distributed with the code release accompanying the paper below)

Production institution: University of Illinois at Urbana-Champaign

Original article: "Single Image Super-Resolution from Transformed Self-Exemplars" by Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja, CVPR 2015 (the paper that introduced Urban100)

Kodak

Dataset introduction:

The dataset contains 24 lossless true-color images (768×512), originally released as a Kodak PhotoCD sampler. For super-resolution experiments, low-resolution inputs (e.g., ×2 or ×3 downscales) are generated from these images; an algorithm trained elsewhere takes the low-resolution image as input and outputs a high-resolution reconstruction, which is then compared against the Kodak original as ground truth.

In practical applications, since the Kodak dataset contains only 24 images, it is used to evaluate image restoration algorithms rather than to train them, and it is an especially common benchmark in lossy image compression.
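Codec benchmarking on Kodak usually sweeps a quality setting and records distortion at each point. The standard library cannot encode JPEG, so this sketch substitutes uniform quantization for a real codec; the shape of the experiment (sweep a fidelity knob, measure MSE) is the same:

```python
import numpy as np

def quantize(img, levels):
    """Uniformly quantize [0, 255] values to `levels` levels (toy stand-in for a codec)."""
    step = 255.0 / (levels - 1)
    return np.round(img / step) * step

# Toy stand-in for a Kodak image: a smooth 64x64 gradient.
img = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))

curve = []  # (quality knob, distortion) points of the rate-distortion sweep
for levels in (4, 16, 64):
    rec = quantize(img, levels)
    mse = float(np.mean((img - rec) ** 2))
    curve.append((levels, mse))
```

With a real codec, each point would also record the compressed size, and PSNR computed from the MSE would be plotted against bits per pixel.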

Dataset source

Product: Eastman Kodak Company | Kodak

Production Agency: Kodak Company

Original release: the Kodak PhotoCD photo sampler, later distributed as the "Kodak Lossless True Color Image Suite"

SUN

Dataset introduction:

The SUN (Scene UNderstanding) database is a large-scale collection of real-world scene photographs spanning hundreds of scene categories, such as architecture, landscapes, and indoor environments. The images are diverse, complex, and authentic. For super-resolution work, subsets of SUN images are downscaled (e.g., ×2 or ×3) to create paired low-resolution inputs and high-resolution targets for training and evaluation.

Models trained on SUN-derived pairs benefit from the variety of the scenes: super-resolution algorithms trained this way generally recover more detail and produce less noise on new low-resolution inputs, and SUN-based splits are often compared against other datasets when evaluating such models.

Dataset source

Link: http://groups.csail.mit.edu/vision/SUN/

Producer: MIT CSAIL, with collaborators at Brown and Princeton

Original article: "SUN Database: Exploring a Large Collection of Scene Categories" by Jianxiong Xiao, Krista A. Ehinger, James Hays, Antonio Torralba, and Aude Oliva, IJCV, 2016

COCO

Dataset introduction:

The COCO (Common Objects in Context) dataset is used for training and evaluating computer vision algorithms, mainly for tasks such as object detection, semantic segmentation, and keypoint detection. It contains more than 330,000 images spanning 80 object categories, including people, cars, chairs, dogs, etc. Each labeled image in COCO has corresponding annotations, including the object category, bounding box, and segmentation outline; these labels give algorithms rich supervision during training.

For image restoration work, COCO is sometimes used simply as a large source of natural images: clean crops can be degraded synthetically (for example, with added noise) to build training pairs for denoising. The object annotations are not needed for that use, but they make the same images reusable for detection and segmentation experiments.
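COCO annotations ship as a single JSON file. The snippet below builds a minimal hand-made sample following the COCO schema (`images`, `annotations` with `[x, y, width, height]` boxes, `categories`) and extracts per-object labels using only the standard library; the field values are invented for illustration:

```python
import json

# Minimal hand-made annotation file in the COCO JSON schema.
sample = {
    "images": [{"id": 1, "file_name": "000000000001.jpg",
                "width": 640, "height": 480}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1,
         "bbox": [100.0, 50.0, 80.0, 120.0],  # [x, y, width, height]
         "area": 9600.0, "iscrowd": 0}
    ],
    "categories": [{"id": 1, "name": "person", "supercategory": "person"}],
}
coco = json.loads(json.dumps(sample))  # round-trip, as if read from disk

# Resolve each annotation's category id to its name.
cat_name = {c["id"]: c["name"] for c in coco["categories"]}
labels = [(a["image_id"], cat_name[a["category_id"]], a["bbox"])
          for a in coco["annotations"]]
```

Real workflows usually wrap this file with the `pycocotools` library, but the underlying structure is exactly this JSON.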

Dataset source

Link: COCO - Common Objects in Context

Producer: Microsoft Research

Original article: "Microsoft COCO: Common Objects in Context" by Tsung-Yi Lin et al., ECCV 2014


Source: blog.csdn.net/qq_45790998/article/details/128681365