Latest research from Huawei Noah's Ark Lab | An ImageNet for the AIGC era: a million generated images to support the development of AI-generated image detectors

Title: GenImage: A Million-Scale Benchmark for Detecting AI-Generated Image
Paper: https://arxiv.org/abs/2306.08571
Code: https://github.com/GenImage-Dataset/GenImage

Introduction

In the current AIGC boom, anyone can use AI algorithms to generate high-quality text, images, and audio. Images produced by generators such as Midjourney and Stable Diffusion are strikingly realistic, and the human eye already struggles to tell real from fake. This raises a concern: large numbers of fake images may spread widely across the Internet and cause a range of social security problems. For example, fake news can disrupt social order and mislead the public, and malicious fake face images can enable financial fraud and trigger a crisis of trust.

For example, the image below shows Trump being arrested and was generated by Midjourney. Such images circulate widely on social media and have a negative impact on the political sphere. Effective oversight of AI-generated images is therefore necessary.

Figure 2 AI-generated picture of Trump being arrested

Because it is difficult for the human eye to tell real images from fake ones, we urgently need AI-generated image detectors. However, the lack of large-scale datasets currently hinders the development of such detectors. We therefore propose GenImage, a million-scale dataset intended to serve as the ImageNet of the AIGC era.

Dataset introduction

Table 1 Overview of fake image detection datasets

Several datasets have been released in the past. They share three main characteristics: they are small, they are based on GANs, and they are limited to face data. Over time, dataset sizes have grown slowly, generators have moved from the GAN era to the diffusion era, and the range of image content has widened. However, a large-scale dataset that is based on diffusion models and covers general images of all kinds is still missing.

To fill this gap, we propose the GenImage dataset as a counterpart to ImageNet. The real images come from ImageNet, and the fake images are generated from ImageNet labels. We use eight advanced generators: Midjourney, Stable Diffusion V1.4, Stable Diffusion V1.5, ADM, GLIDE, Wukong, VQDM and BigGAN. The total number of generated images is roughly equal to the number of real images. Each generator produces roughly the same number of images, and each category contains roughly the same number of generated images.
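The balancing described above can be illustrated with a small allocation routine. This is a minimal sketch, not the authors' generation pipeline: the helper names `split_evenly` and `balanced_quota`, the total of 1,000 images and the two class names are hypothetical and chosen only for illustration. The generator names are the eight listed above.

```python
def split_evenly(total, parts):
    """Split `total` into `parts` integers that differ by at most one."""
    base, extra = divmod(total, parts)
    # The first `extra` parts each receive one additional item.
    return [base + (1 if i < extra else 0) for i in range(parts)]


def balanced_quota(total, generators, classes):
    """Allocate `total` images evenly over generators, then over classes."""
    quota = {}
    per_generator = split_evenly(total, len(generators))
    for gen, gen_total in zip(generators, per_generator):
        per_class = split_evenly(gen_total, len(classes))
        for cls, count in zip(classes, per_class):
            quota[(gen, cls)] = count
    return quota


# Generator names come from the article; the total and the class names are
# hypothetical placeholders, not the real dataset sizes.
GENERATORS = ["Midjourney", "Stable Diffusion V1.4", "Stable Diffusion V1.5",
              "ADM", "GLIDE", "Wukong", "VQDM", "BigGAN"]
CLASSES = ["goldfish", "tabby cat"]

quota = balanced_quota(1000, GENERATORS, CLASSES)
print(quota[("Midjourney", "goldfish")])
```

Splitting by generator first keeps the per-generator totals within one image of each other, and the second split does the same for the classes inside each generator.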

This dataset has the following advantages:

  1. Huge amount of data: over a million image pairs.
  2. Rich picture content: built on ImageNet, with rich labels.
  3. Advanced generators: covers Midjourney, Stable Diffusion and other diffusion-based generators.

Detectors deployed in the real world face various difficulties. Our experiments show that detector performance degrades significantly in two situations. The first is when the images come from a generator that did not appear in the training set. The second is when the images are degraded. For example, CNNSpot trained on Stable Diffusion V1.4 reaches an accuracy of only 52.8% on Midjourney. Even when the training and test generators are both Stable Diffusion V1.4, CNNSpot's accuracy on blurred images is only 77.9%. We therefore propose two challenges for detectors on this dataset:

  1. Cross-generator recognition: the detector is trained on data from one generator and evaluated on data from other generators. This task examines how well the detector generalizes across generators.
  2. Degraded image recognition: the detector must recognize low-resolution, blurred and compressed images. This task examines how well the detector generalizes to the low-quality images found in real conditions (for example, images that have been passed around the Internet).
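The first challenge amounts to filling a train-generator by test-generator accuracy matrix. The sketch below shows that loop under stated assumptions: `cross_generator_matrix`, `train_fn` and `eval_fn` are hypothetical names, and the toy stand-ins return placeholder scores so the example runs without any data or model. It is not the evaluation code from the GenImage repository.

```python
def cross_generator_matrix(generators, train_fn, eval_fn):
    """Return acc[train_gen][test_gen] for every pair of generators.

    train_fn(train_gen) -> detector trained on that generator's data
    eval_fn(detector, test_gen) -> accuracy on that generator's test data
    """
    results = {}
    for train_gen in generators:
        detector = train_fn(train_gen)  # train once per source generator
        results[train_gen] = {
            test_gen: eval_fn(detector, test_gen) for test_gen in generators
        }
    return results


# Toy stand-ins so the sketch runs without any data or model.
def toy_train(train_gen):
    return {"trained_on": train_gen}


def toy_eval(detector, test_gen):
    # Placeholder scores, NOT measured results: perfect on the seen
    # generator, chance level on unseen ones.
    return 1.0 if detector["trained_on"] == test_gen else 0.5


matrix = cross_generator_matrix(["gen_a", "gen_b"], toy_train, toy_eval)
print(matrix["gen_a"]["gen_b"])
```

In a real run, `train_fn` would fit a detector on one generator's training split and `eval_fn` would score it on another generator's test split. The off-diagonal entries are the ones that measure generalization.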

We believe that the proposed dataset will greatly help people develop AI-generated image detectors.

Experiments

We ran experiments to examine this dataset. A ResNet-50 model trained on one generator achieves significantly lower accuracy when tested on other generators. In practice, however, we rarely know which generator produced a given image. The detector's ability to generalize to images from different generators is therefore very important.

Table 2 Cross-generator evaluation using ResNet-50

We compared existing methods trained on Stable Diffusion V1.4 and tested on each generator; see Table 3. We also evaluated training on each generator and testing on each generator; see Table 4. In Table 4, each value in a Testing Subset column is the average over the eight detectors, one trained on each generator, when tested on that one generator. Averaging these values across the test sets gives the overall average in the rightmost column.

Table 3 Trained on Stable Diffusion V1.4, tested on different test sets

Table 4 Training on different generators and testing on different test sets
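The two-level averaging behind Table 4 can be written out as follows. This is a minimal sketch: the function name `summarize`, the generator names `gen_a` and `gen_b`, and the accuracy values are hypothetical placeholders, not results from the paper.

```python
from statistics import mean


def summarize(acc):
    """acc[train_gen][test_gen] holds accuracy in percent.

    Returns the per-test-set average over all training generators,
    and the average of those per-test-set values.
    """
    train_gens = list(acc)
    test_gens = list(acc[train_gens[0]])
    per_test = {
        test: mean(acc[train][test] for train in train_gens)
        for test in test_gens
    }
    overall = mean(per_test.values())
    return per_test, overall


# Placeholder accuracies for two generators, for illustration only.
toy_acc = {
    "gen_a": {"gen_a": 90.0, "gen_b": 60.0},
    "gen_b": {"gen_a": 70.0, "gen_b": 96.0},
}
per_test, overall = summarize(toy_acc)
print(per_test, overall)
```

With the placeholder values, the per-test averages are 80.0 and 78.0, and the overall average is 79.0.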

We degrade the test set using low resolution, JPEG compression and Gaussian blur under different parameters. The evaluation results are as follows.

Table 5 Verification results on different degraded images
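The three degradations applied to the test set can be sketched with Pillow. This is an illustration under stated assumptions, not the authors' preprocessing code: the function names are hypothetical, and the parameter values (64 pixels, quality 30, sigma 3) are chosen only for illustration, since the article does not list the settings used.

```python
import io

from PIL import Image, ImageFilter


def low_resolution(img, size):
    """Downsample to size x size pixels using Pillow's default filter."""
    return img.resize((size, size))


def jpeg_compress(img, quality):
    """Round-trip the image through an in-memory JPEG at the given quality."""
    buffer = io.BytesIO()
    img.convert("RGB").save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    compressed = Image.open(buffer)
    compressed.load()  # force decoding instead of relying on lazy loading
    return compressed


def gaussian_blur(img, sigma):
    """Blur with a Gaussian kernel; Pillow's `radius` is its standard deviation."""
    return img.filter(ImageFilter.GaussianBlur(radius=sigma))


# Synthetic solid-color image and hypothetical parameter values.
image = Image.new("RGB", (256, 256), (120, 30, 200))
degraded = {
    "low_res": low_resolution(image, 64),
    "jpeg": jpeg_compress(image, 30),
    "blur": gaussian_blur(image, 3),
}
print({name: im.size for name, im in degraded.items()})
```

Applying each function to every test image before running the detector gives one degraded copy of the test set per setting.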

Is collecting this much data worthwhile? Our experiments show that performance improves as the number of image classes and the number of images per class increase.

Table 6 Results of increasing the number of images

As for how well detectors trained on GenImage generalize to other image content, we found that they also achieve good results on face and art images.

Table 7 The results of generalization to art and face images

Figure 3 The art and face pictures used in the test

Summary

As AI image generation continues to improve, the need for effective detection of AI-generated images will become increasingly urgent. This dataset aims to provide effective training data for generated-image detection in real environments. We trained ResNet-50 on this dataset and then ran it on images from real tweets. As shown in Figure 4 below, ResNet-50 can effectively distinguish real from fake images. This result demonstrates that GenImage can be used to train models that detect disinformation in the real world. We believe the future direction of this field is to keep improving detector accuracy on the GenImage dataset, and thereby improve detectors' ability to handle false information in the real world.

Figure 4 Display of real tweets

Origin blog.csdn.net/CVHub/article/details/131625499