A general approach to detecting tampered and GAN-generated images

Abstract

    Existing digital image forensic algorithms rely on prior knowledge of the image, detecting either conventional tampering or images from a specific, known GAN architecture.

    This paper proposes a general method that detects tampered images and GAN-generated images simultaneously. SUMMARY:
        1. Extract the image's edge information with the Scharr operator.
        2. Convert the edge-information matrix into a gray-level co-occurrence matrix (GLCM), which scales images to a uniform size without losing edge information.
        3. Feed the GLCM into a deep neural network based on depthwise separable convolution for training.

    Results: compared with other methods, the model achieves an F1 score of 0.98, and it generalizes well to other GANs.

    Keywords: digital image forensics, generative adversarial networks, deep learning, convolutional neural networks


1. INTRODUCTION

    Digital image forensics is divided into active and passive forensics.

    In most passive forensic methods, image features are extracted during preprocessing and then classified with an SVM [5]-[7], [27]. With the development of CNNs, CNN-based methods have also been adopted [8], [15].

    Splicing, removal, and copy-move are the three major tampering operations to detect, and the three discussed herein.

    GAN networks are covered in [2]-[4], [16], [17], [26], [28].

    The defect of conventional algorithms is that they cannot detect both GAN-generated images and conventional image tampering, and they have poor generalization ability.

    The training set of generated images is produced with BigGAN.

    Paper Code : https://github.com/yuleung/image_forensics

    The main contributions of the article:

        - The first model to detect GAN-generated images and tampered images simultaneously.
        - A general method for detecting images generated by a GAN and its network variants.
        - A deep network structure based on depthwise separable convolution, which has fewer parameters than conventional convolution.


2. RELATED WORK

    This section reviews image tampering detection and GAN network structures.

A. DIGITAL IMAGE TAMPERING DETECTION

  • Reference [5] observes that splicing traces are more easily detected in the YCrCb color space.

  • Reference [7] uses the steerable pyramid transform (SPT) and local binary patterns (LBP) for feature extraction.

  • Reference [36] proposes a detection method based on the wavelet transform and texture descriptors.

  • Reference [6] models the edge image as a finite-state Markov chain and extracts a low-dimensional feature vector from its stationary distribution for tampering detection.

  • Reference [8] proposes a new convolutional layer, the constrained convolutional layer, which suppresses image content and adaptively learns tamper-detection features.

  • Reference [15] proposes a CNN model whose first convolutional layer is initialized with the basic high-pass filters of the spatial rich model (SRM); the CNN then performs feature extraction.

  • Reference [10] proposes a two-stream Faster R-CNN for the detection task. The model consists of two subnetworks, an SRM noise stream and an RGB stream, trained to extract different features; these features are then combined to detect tampering.

  • Reference [29] uses a dilated residual network variant (DRN-C-26) to detect warping applied to face images.


B. GANS IN IMAGE GENERATION

    GAN image-generation models include PGGAN, SNGAN, BigGAN, StyleGAN, and StackGAN.

    Some papers on detecting GAN-generated images: [11]-[13], [31], [32], [35].


3. APPROACH

    The network structure is shown below. [Figure: overall network structure (QBwEr9.png)]

    It is divided into three parts: feature extraction, image scaling, and classification.

  1. Convert the image from the RGB color space to the YCrCb color space, then extract edge information from the Cr and Cb components with the Scharr operator.
  2. Convert the edge-information matrices into gray-level co-occurrence matrices (GLCMs), so that matrices of different sizes are unified to the same size.
  3. Feed the GLCMs into our deep neural network based on depthwise separable convolution to obtain the classification result.

A. FEATURE EXTRACTION

  • YCrCb is a color space similar to RGB, where Y is the luminance component and Cr and Cb are the chrominance components.
  • In the first row of Figure 1, the tampering is almost invisible in the RGB image.
  • In the first row of Figure 4, the tampering is more evident (the edge of the spliced bird is smoother than other portions).
  • In Figure 3, edge information appears between the foreground and background of GAN-generated images.

[Figure: QBwM8O.png]

[Figure: QBwQ2D.png]

  • Convert the image to YCrCb and extract the Cr and Cb chrominance components. The Y component, shown in Figure 4, mainly contains content detail; those details would mask the tampered edge information, so we do not use the Y component here. Cr and Cb pay less attention to image detail and more to edge information, so they capture the key tampering traces. A 3×3 edge-detection operator is then used to obtain the edge information.
  • Scharr is a variant of Sobel that puts more weight on edge information. The kernels are as follows:

[Figure: Scharr operator kernels (QBwdG8.png)]
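As an illustrative sketch (not the authors' code), the edge extraction on a chrominance component can be written in NumPy; the 3×3 kernel values below are the standard Scharr coefficients, and `scharr_edges` is a hypothetical helper name:

```python
import numpy as np

# Scharr kernels: a Sobel variant with stronger weights toward the center row/column.
SCHARR_X = np.array([[ 3, 0,  -3],
                     [10, 0, -10],
                     [ 3, 0,  -3]])
SCHARR_Y = SCHARR_X.T

def scharr_edges(chroma):
    """|Gx| + |Gy| edge magnitude of a 2-D chrominance matrix (valid region only)."""
    h, w = chroma.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            patch = chroma[i:i + h - 2, j:j + w - 2]
            gx += SCHARR_X[i, j] * patch
            gy += SCHARR_Y[i, j] * patch
    return np.abs(gx) + np.abs(gy)

# A vertical step edge in an 8-bit image yields the maximum response 255*(3+10+3) = 4080,
# matching the maximum edge value mentioned later in the text.
img = np.zeros((5, 5))
img[:, 3:] = 255.0
edges = scharr_edges(img)
```

The 4080 upper bound of this response is exactly the maximum edge value that the truncation step below is designed to clip.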

    The edge-information results are shown in Figure 4. The edges of the tampered region are smoother and brighter than those of authentic regions in the Cr- and Cb-component edge matrices.

[Figure: QBwwRS.png]

[Figure: QBwaPf.png]

B. IMAGE SCALING

    Image texture is formed by gray levels that recur at particular spatial positions; the gray-level co-occurrence matrix (GLCM) extracts texture by capturing the spatial correlation of gray levels. We convert the image edge-information matrix into a GLCM for two reasons:

  1. In practical image forensics, image sizes are not fixed, while a CNN-based classifier usually requires input of a specific size, and the image's fine details are particularly important for detecting tampering. The size of a GLCM depends on the maximum gray value in the image, so the GLCM can resize the edge-information matrix to a uniform size without losing image detail.
  2. By the nature of the GLCM, the smooth edges corresponding to tampered regions and the rough edges corresponding to untampered regions have different representations in the GLCM.

[Figure: QBwDMQ.png]

    The maximum value of the edge-information matrix obtained after Scharr filtering can reach 4080, but we found that most of its values lie in a relatively small range, as shown in Figure 6. We can therefore choose a suitable threshold to truncate the larger values with little effect on model performance, which reduces the size of the converted GLCM and thus minimizes model complexity. Truncation follows the rule:

    E'(i, j) = min(E(i, j), T),  where T is the truncation threshold

    Next, we compute GLCMs of the Cr and Cb component matrices in four directions (0°, 45°, 90°, 135°) with an offset distance of 1. Finally, the results are concatenated to obtain a matrix of size T×T×8 as the input of the deep neural network.

    The specific algorithm for converting to a GLCM is as follows:

[Figure: GLCM conversion algorithm (QBwzsH.png)]
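A minimal NumPy reconstruction of the conversion described above (a sketch under assumptions, not the authors' code): truncate the edge-information matrix, count gray-level co-occurrences at offset distance 1 in the four directions, and stack the Cr and Cb results into a T×T×8 tensor. We clip values into [0, T-1] so the GLCM is exactly T×T; the paper's exact indexing convention may differ.

```python
import numpy as np

# Row/column steps for the four directions (0°, 45°, 90°, 135°), offset distance 1.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]

def glcm(edge, T):
    """T x T x 4 co-occurrence counts of an integer-valued edge matrix."""
    g = np.minimum(edge.astype(int), T - 1)   # truncation: clip values above the threshold
    h, w = g.shape
    out = np.zeros((T, T, 4), dtype=np.int64)
    for k, (di, dj) in enumerate(OFFSETS):
        for i in range(h):
            for j in range(w):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    out[g[i, j], g[ni, nj], k] += 1
    return out

def network_input(edge_cr, edge_cb, T=192):
    """Concatenate the Cr and Cb GLCMs into the T x T x 8 network input."""
    return np.concatenate([glcm(edge_cr, T), glcm(edge_cb, T)], axis=-1)

demo = np.array([[0, 1],
                 [2, 3]])
x = network_input(demo, demo, T=4)   # tiny toy example with T = 4
```

With T = 192, as selected in the experiments below, the network input is a fixed 192×192×8 tensor regardless of the original image size.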

C. CLASSIFICATION BASED ON DEPTHWISE SEPARABLE CONVOLUTION

    We design a deep neural network based on depthwise separable convolution, tailored to detecting tampered images and GAN-generated images. The network structure is shown in the figure below.

[Figure: network structure (QBwXRO.png)]

    The reasons for this network design are:

  1. The GLCM is derived from the edge-information matrix, so it contains a large number of zeros; we therefore set the first-layer convolution kernel to 5×5 with a stride of 4.
  2. After the edge-information matrix is converted into a GLCM, the edge features are scattered throughout the GLCM. Therefore, in the deeper part of the network architecture, after obtaining smaller feature maps, we repeat the convolution operation several times to fully extract the image features.
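The parameter saving claimed in the contributions can be made concrete with a count: a standard K×K convolution has K·K·C_in·C_out weights, while a depthwise separable convolution (a K×K depthwise filter per input channel followed by a 1×1 pointwise convolution, as in Xception/MobileNet) has K·K·C_in + C_in·C_out. A small sketch with illustrative layer sizes (not the paper's exact layers), bias terms omitted:

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise (k x k per channel) + pointwise (1 x 1) convolution."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 128, 256                  # illustrative layer sizes
std = standard_conv_params(k, c_in, c_out)    # 294912 weights
sep = separable_conv_params(k, c_in, c_out)   # 33920 weights, roughly 8.7x fewer
```

For 3×3 kernels the reduction approaches a factor of about 1/C_out + 1/9, which is where the "small number of parameters" claim comes from.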

4. EVALUATION

A. DATASET

  • CASIA 2.0
  • GPIR dataset
  • COVERAGE dataset
  • BigGANs dataset
  • LSUN Bedroom dataset (256×256)
  • PGGAN dataset
  • SNGAN dataset
  • StyleGAN dataset

B. EXPERIMENTAL DETAILS

  • Framework: TensorFlow
  • Network parameter details: the ADAM optimizer is used to minimize the cross-entropy loss with an initial learning rate of 0.0005, a learning-rate decay of 0.85 every 600 steps, a minibatch size of 56, a batch-normalization decay parameter of 0.95, and a weight-decay (L2 regularization) parameter of 0.0001.
  • RGB to YCrCb conversion rule:

[Equation: RGB to YCrCb conversion (QBB1HA.png)]
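A sketch of the standard JPEG (ITU-R BT.601 full-range) conversion, which we assume is the rule shown in the figure above; `rgb_to_ycrcb` is a hypothetical helper name:

```python
def rgb_to_ycrcb(r, g, b):
    """Per-pixel RGB -> (Y, Cr, Cb), JPEG full-range convention (assumed here)."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    return y, cr, cb

# A neutral gray pixel carries no chrominance: Cr = Cb = 128.
y, cr, cb = rgb_to_ycrcb(100, 100, 100)
```

Only the Cr and Cb outputs feed the edge-extraction step; Y is discarded as described in Section 3.A.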

  • For the multi-class problem of detecting GAN-generated images and tampered images simultaneously, we use the macro-F1 score to evaluate our model:

[Equation: macro-F1 score (QBB8AI.png)]
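Macro-F1 is the unweighted mean of the per-class F1 scores, F1_c = 2·P_c·R_c / (P_c + R_c), over the three classes. A sketch with illustrative labels (0 = authentic, 1 = tampered, 2 = GAN-generated):

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy example: class 1 loses recall, class 2 loses precision, class 0 is perfect.
score = macro_f1([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2], classes=[0, 1, 2])
```

Because each class contributes equally, macro-F1 penalizes a model that does well on authentic images but poorly on the rarer tampered or generated classes.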

  • Accuracy is measured as:

[Equation: accuracy (QBBlBd.png)]


C. DETECTION PERFORMANCE

    Experiments were conducted on the CASIA 2.0 and BigGANs datasets, testing different truncation thresholds for the edge-information matrix, different color spaces, the classification performance of different deep neural networks, and the performance difference between the Sobel and Scharr operators.

    We randomly selected 5123 images from the 7491 authentic images and the 5123 tampered images of CASIA 2.0, and 5123 images from the 16,000 images of the BigGANs dataset. Then we randomly selected 4123 images from each class, 12,369 images in total, as the training set, and used the remaining 1000 images of each class, 3000 in total, as the test set. The experimental results are shown in Table 1.

[Table 1: QBBt9f.png]

    We found that a truncation value T of 192 yields the best results; values below 192 reduce performance, while values above 192 bring no further gains. In addition, for both tampered and GAN-generated images, the Scharr operator used in our method outperforms the Sobel operator on the edge-detection task in fake-image detection. Compared with conventional convolutional models and other classic network architectures, our deep neural network based on depthwise separable convolution achieves better classification performance. Compared with components of other color spaces, the Cr and Cb components of the YCrCb color space perform feature extraction better in the fake-image detection task. Our method also outperforms the approaches proposed in [11], [31], and XceptionNet.

    Meanwhile, under certain conditions our model achieves very high precision and recall (even reaching 100%), and with different parameters it detects BigGAN-generated images well. From these results we can safely conclude that our model extracts the features of GAN-generated images very well.

    Compared with other methods specifically designed to detect tampering, our general model also performs well at detecting tampered images. The experimental results are shown in Table 2.

[Table 2: QBBJ4P.png]

    For the best combination in Table 1, if accuracy is computed according to rule (9), considering only authentic and tampered images and ignoring GAN-generated images, our model, trained on the BigGANs and CASIA 2.0 datasets, achieves a detection accuracy of 97.95%. If the model is trained only on the CASIA 2.0 dataset, as in [15], [27], and [6], the detection accuracy is 99.25%, while the accuracies reported in [15], [27], and [6] are 97.83%, 97.50%, and 95.60%, respectively. Furthermore, if our model is trained and tested on the CASIA 1.0 dataset, which can be regarded as a simplified version of CASIA 2.0, the detection accuracy reaches 100%, versus 98.04%, 97.00%, and 96.81% in [15], [27], and [36], respectively. Therefore, compared with previous work, our method detects tampered images very well.


D. GENERALIZABILITY

1) Generalizability on Tampered Images

    We tested the generalizability of the model on the COVERAGE and GPIR datasets. Specifically, we first evaluated our method on each dataset separately, then performed cross-evaluation between the two (training on one and testing on the other). Because the COVERAGE and GPIR datasets have limited samples, they are too small to retrain our deep neural network; instead, we used a randomly selected 50% of the COVERAGE or GPIR tampered images to fine-tune the originally trained model, and used the remaining 50% plus the other dataset for testing.

    We performed only 150 steps of parameter updates, with a learning rate of 0.0001 and a minibatch size of 8. The experimental results are shown in Table 3.

[Table 3: QBBN38.png]

    The experiments show that our model can be transferred quickly and easily to other small, novel tampering datasets.

2) Generalizability on GAN-Generated Images

    Our method generalizes to the detection of various GAN models. We chose GAN models with good generated-image quality to test the performance of our model, computing accuracy with rule (9). Note that the model used to evaluate generalization is the one from the experiment of Table 1: Cr and Cb components, the Scharr operator, a truncation threshold T of 192, and the depthwise-separable-convolution model trained on the BigGANs and CASIA 2.0 datasets. The results are shown in Table 4.

[Table 4: QBBUgS.png]

    The experiments show that our model has strong generalization ability and can detect images generated by various GANs. This is a striking finding: it means that images generated by different GANs share inherent common traits, and our model captures these traits well.


5. CONCLUSION

    In this paper, we propose a general model that can detect both tampered images and GAN-generated images. First, we convert the RGB image to be detected into the YCrCb color space and extract the edges of the Cr and Cb components. Then we convert the image edges into GLCMs, so that the image is rescaled without losing the relevant information. Finally, we feed the GLCMs into a deep neural network based on depthwise separable convolution for training and detection. The edge feature extraction method and deep neural network we designed can recognize tampered images and GAN-generated images, with a macro-average F1 score as high as 0.9865. In addition, compared with previous work, our model also performs well on tamper detection alone. Moreover, our model can simultaneously detect, with high precision, images generated from scratch by different GAN models; we believe the reason is that GAN-generated images leave marks on the edges of objects, and our model understands these marks well.


Origin www.cnblogs.com/wenshinlee/p/12090214.html