WAIC 2023 | Is AIGC doing good or evil?

With advances in digital image processing and the explosion of AIGC products such as ChatGPT, Midjourney, and Stable Diffusion, image editing software and generative models are becoming ever more powerful, and ordinary users can easily draw, edit, and tamper with images. These tools bring convenience, but they also make it easier to forge or alter digital images without leaving visible traces. The risks of false content, data leakage, and infringement that come with AI-generated content are amplified as a result.


Tampering with or forging an image is easy, and the result is hard to spot with the naked eye. People with ulterior motives already forge and alter digital images maliciously and use them to falsify administrative documents and contracts, transfer records, transaction records, chat records, identity documents, medical materials, and news, disrupting economic and social order.
The security and credibility of image content has become a focus of public attention, yet "trusted AI" in the image field is only just getting started.

1. Common image tampering techniques

Image tampering techniques can be used to deceive or mislead, or to destroy the authenticity and credibility of an image. The piracy and information security risks they bring can easily grow into social problems. To collect evidence that a digital image has been tampered with, we must first understand how digital images are tampered with:

  1. Synthesis: selects one image, or parts of several images, and stitches them into another image to cover information in the target image or to add information to it. This is one of the most commonly used forgery techniques;
  2. Image retouching: uses image editing tools such as Photoshop or Meitu to beautify, stretch, or polish the content of an image, thereby hiding important details or repairing damaged areas;
  3. Generative model: represented by Midjourney and Stable Diffusion. By learning from a large amount of image data with deep learning, such a model can generate realistic, high-quality images that the human eye can hardly distinguish from real ones;
  4. Image translation: takes one image as the source and another as the target, and gradually transforms the source into the target;
  5. Image enhancement: adds information to the original image or transforms its data in order to highlight features of interest or suppress unwanted features, so that the overall or local characteristics of the image are emphasized on purpose;
  6. Manual redrawing: manual drawing with drawing software (such as Photoshop, CAD, etc.) or other drawing tools.
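
At the pixel level, the first and fourth techniques in the list above reduce to simple array operations. The following is a minimal sketch with NumPy, assuming tiny synthetic grayscale arrays; the function names `splice` and `cross_dissolve` are made up for illustration, and real tools add blending, resampling, and compression on top.

```python
import numpy as np

def splice(target, source, top, left):
    """Paste `source` into a copy of `target` with its top-left corner at (top, left)."""
    out = target.copy()
    h, w = source.shape[:2]
    out[top:top + h, left:left + w] = source
    return out

def cross_dissolve(src, dst, steps):
    """Return `steps` frames that fade linearly from `src` to `dst`."""
    src = src.astype(np.float64)
    dst = dst.astype(np.float64)
    alphas = np.linspace(0.0, 1.0, steps)
    return [((1 - a) * src + a * dst).round().astype(np.uint8) for a in alphas]

# Hypothetical 8x8 grayscale images, used only for illustration.
background = np.zeros((8, 8), dtype=np.uint8)
patch = np.full((3, 3), 255, dtype=np.uint8)

forged = splice(background, patch, top=2, left=4)     # synthesis (splicing)
frames = cross_dissolve(background, forged, steps=5)  # gradual change from one image to another
```

Because the pasted patch comes from a different image, its lighting and noise statistics usually differ from its surroundings, which is what the detection methods in the next section look for.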

2. Traditional tampered image detection methods

2.1. Splicing image tampering detection method based on light source and noise

The splicing detection method based on light source color and noise exploits the fact that the light source color and noise characteristics of a tampered image are inconsistent across regions. It extracts a mixed feature from the two and uses it to detect and localize the spliced area.

First, the color image under test is divided into non-overlapping superpixel blocks using the simple linear iterative clustering algorithm, SLIC (Simple Linear Iterative Clustering). Second, each block is converted to the YCbCr color space to extract a light source color feature; at the same time, the block is represented as a quaternion and PCA is used to extract a noise feature. The two features are then combined into the final feature vector, K-means clustering divides the feature vectors into two classes, and the class with fewer members is marked as the tampered area, which completes splicing detection for color images.
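
The color half of this pipeline can be sketched in a few lines of NumPy. This is only an illustration under several assumptions: a regular grid of 8x8 blocks stands in for SLIC superpixels (in practice the label map could come from a library such as scikit-image's `slic`), the mean chroma (Cb, Cr) of a block stands in for the light source color feature, the quaternion PCA noise feature is omitted, and `two_means` is a plain hand-written 2-cluster k-means rather than the authors' implementation.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Full-range BT.601 (JPEG-style) RGB -> YCbCr for float arrays in [0, 255]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def block_features(image, labels):
    """Mean (Cb, Cr) per block; `labels` gives every pixel a block id in 0..K-1."""
    ycbcr = rgb_to_ycbcr(image.astype(np.float64))
    k = labels.max() + 1
    return np.array([ycbcr[labels == i][:, 1:].mean(axis=0) for i in range(k)])

def two_means(features, iters=20):
    """Plain 2-cluster k-means, seeded with two far-apart feature vectors."""
    spread = np.linalg.norm(features - features.mean(axis=0), axis=1)
    first = features[np.argmax(spread)]
    second = features[np.argmax(np.linalg.norm(features - first, axis=1))]
    centers = np.stack([first, second])
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for c in range(2):
            if np.any(assign == c):
                centers[c] = features[assign == c].mean(axis=0)
    return assign

def flag_minority(assign):
    """Mark the blocks that belong to the smaller of the two clusters."""
    counts = np.bincount(assign, minlength=2)
    return assign == counts.argmin()

# Synthetic 32x32 image: bluish lighting everywhere, reddish lighting in two blocks.
h = w = 32
image = np.full((h, w, 3), (120, 120, 160), dtype=np.uint8)
image[8:16, 16:32] = (160, 120, 110)          # covers grid blocks 6 and 7
rows, cols = np.mgrid[0:h, 0:w]
labels = (rows // 8) * 4 + (cols // 8)        # 4x4 grid of blocks as a stand-in for SLIC

suspicious = flag_minority(two_means(block_features(image, labels)))
print(np.flatnonzero(suspicious))
```

Marking the smaller cluster only makes sense under the assumption stated in the text, namely that the tampered area covers less of the image than the original area.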


In most spliced images the tampered area is smaller than the original area, so the K-means algorithm can divide the mixed features SF into 2 clusters, count the superpixel blocks in each cluster, and mark the smaller cluster as the suspicious region. Superpixel segmentation can produce very small blocks, however, so superpixels that belong to the spliced region may be left unlabeled, and superpixels in the original region may be mislabeled as tampered. To improve detection accuracy, the initial labels from clustering are post-processed at the superpixel block level with isolated block filtering and hole filling.

  • Isolated block filtering: traverse all superpixel blocks k (k=1,2,…,K); if all blocks adjacent to k are marked as original, mark k as original.
  • Hole filling: traverse all superpixel blocks k; if all blocks adjacent to k are marked as tampered, mark k as tampered.

The initial classification results are shown in Figure a, and the results after isolated block filtering and hole filling are shown in Figure b. Gray represents the original region and white represents the tampered region.

[Figure: (a) initial classification results; (b) results after isolated block filtering and hole filling]
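
The two post-processing rules can be sketched on a block adjacency graph. The version below is a minimal illustration in plain Python: a 6x6 grid with 4-connected neighbors stands in for the real superpixel adjacency, and the helper names `grid_neighbors` and `postprocess` are hypothetical.

```python
def grid_neighbors(rows, cols):
    """4-connected adjacency for a rows x cols grid of blocks (stand-in for superpixel adjacency)."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs[(r, c)] = [(i, j) for i, j in cand if 0 <= i < rows and 0 <= j < cols]
    return nbrs

def postprocess(tampered, neighbors):
    """tampered: block id -> bool; neighbors: block id -> list of adjacent block ids."""
    # Isolated block filtering: a block whose neighbors are all original becomes original.
    filtered = dict(tampered)
    for k, adj in neighbors.items():
        if adj and not any(tampered[n] for n in adj):
            filtered[k] = False
    # Hole filling: a block whose neighbors are all tampered becomes tampered.
    filled = dict(filtered)
    for k, adj in neighbors.items():
        if adj and all(filtered[n] for n in adj):
            filled[k] = True
    return filled

neighbors = grid_neighbors(6, 6)
initial = {k: False for k in neighbors}
for r in range(1, 4):
    for c in range(1, 4):
        initial[(r, c)] = True
initial[(2, 2)] = False   # hole inside the tampered region
initial[(5, 5)] = True    # isolated false alarm

final = postprocess(initial, neighbors)
```

Each rule reads the labels from the previous pass rather than the labels it is currently writing, so the result does not depend on the order in which blocks are visited.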

2.2. Detection method based on Markov features

Markov features capture the relationship between each pixel and its neighbors. The unnatural boundaries introduced by tampering, and post-processing such as blurring and interpolation, disturb the distribution of adjacent pixels found in natural images. Markov features are extracted as follows: first, residuals of the image under test are computed along the horizontal, vertical, main diagonal, and anti-diagonal directions; next, the residual images are truncated to reduce the feature dimension; finally, the transition probabilities between the truncated residual values of adjacent pixels are computed. These transition probabilities are the Markov features.
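
The three steps can be sketched for a single direction with NumPy. This is an illustration under assumptions: only the horizontal direction is shown (the other three are analogous), the truncation threshold `T=3` is an arbitrary choice, and the function name `markov_features` is made up here.

```python
import numpy as np

def markov_features(gray, T=3):
    """Transition-probability matrix of truncated horizontal residuals, shape (2T+1, 2T+1)."""
    x = gray.astype(np.int64)
    resid = x[:, :-1] - x[:, 1:]            # step 1: horizontal residual
    resid = np.clip(resid, -T, T)           # step 2: truncation leaves 2T+1 possible states
    cur = resid[:, :-1].ravel() + T         # state at (i, j), shifted to 0..2T
    nxt = resid[:, 1:].ravel() + T          # state at (i, j+1)
    counts = np.zeros((2 * T + 1, 2 * T + 1))
    np.add.at(counts, (cur, nxt), 1)        # count every observed transition
    row_sums = counts.sum(axis=1, keepdims=True)
    # step 3: normalise each row; rows that were never observed stay all zero
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# A horizontal ramp has residual -1 everywhere, so the only transition is -1 -> -1.
ramp = np.tile(np.arange(16), (8, 1))
P = markov_features(ramp, T=3)
print(P[2, 2])   # row and column index 2 correspond to residual value -1
```

With T=3 each direction contributes a 7x7 matrix, i.e. 49 features, which is why truncation matters for keeping the feature dimension small.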


Detection based on Markov features is a commonly used approach to image tampering detection. It builds on the Markov random field model: local features are extracted by analyzing the pixels of the image and are then used to decide whether the image has been tampered with.

First, the image is divided into small blocks and features are extracted from each block, including the grayscale histogram, gradient histogram, and color histogram. These features are combined into a feature vector that represents the block. A Markov random field model is then fitted to the features to describe the relationships between pixels, which exposes the tampered area. The method is sensitive to image noise and compression, however, so it is prone to false positives and misses, and it performs poorly on complex manipulations such as image synthesis and image fusion.

3. Image tampering detection method based on deep learning

3.1. Method based on Fisher coding and SVM model

Detection based on Fisher encoding and an SVM model is another commonly used approach. It extracts color channel features from a data set of real images and a data set of forged images and encodes each with Fisher encoding. In this method, Fisher encoding serves as a feature extraction step based on local binary patterns (LBP) that captures the texture of an image, and the encoded color channel features are used to train the SVM model. The method detects image tampering with high accuracy and robustness, but feature extraction and feature selection add considerable computation and time cost. The steps are as follows:

  • For the real data set, select 5 classes of images and extract the Fisher-encoded a and b color channel features from 100 images of each class;
  • For the fake data set, select the same 5 classes as for the real images and extract the Fisher-encoded a and b color channel features from 100 images of each class;
  • Randomly split the 5 classes of the real and fake data sets into training and test sets, and extract the a and b color channel features from the images in each split;
  • Encode the extracted color channel features with Fisher encoding, and use the Fisher-encoded features to build the SVM model;
  • Train the SVM model on the selected features, using tampered and untampered images as positive and negative samples, then use the trained SVM to classify an image under test as tampered or not;
  • Use images from the test data set to measure the accuracy of the fitted SVM model.
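
The training and testing steps at the end of this list can be sketched as follows. Everything here is an assumption made for illustration: the 2-D feature vectors are synthetic stand-ins for Fisher-encoded color channel features, and the classifier is a small linear SVM trained by subgradient descent on the hinge loss, swapped in so the example needs only NumPy. In practice a library implementation such as scikit-learn's `SVC` would normally be used.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by full-batch subgradient descent; y in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                     # samples inside or beyond the margin
        grad_w = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / len(X)
        grad_b = -y[active].sum() / len(X)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Return +1 (tampered) or -1 (untampered) for each row of X."""
    return np.where(X @ w + b >= 0, 1, -1)

# Synthetic stand-ins for encoded features: untampered = -1, tampered = +1.
rng = np.random.default_rng(0)
real = rng.normal(loc=-2.0, scale=0.5, size=(100, 2))
fake = rng.normal(loc=2.0, scale=0.5, size=(100, 2))
X = np.vstack([real, fake])
y = np.concatenate([-np.ones(100), np.ones(100)])

# Random split into training and test sets, then fit and measure accuracy.
order = rng.permutation(len(X))
train_idx, test_idx = order[:150], order[150:]
w, b = train_linear_svm(X[train_idx], y[train_idx])
accuracy = float((predict(X[test_idx], w, b) == y[test_idx]).mean())
print(f"test accuracy: {accuracy:.2f}")
```

The synthetic classes are deliberately well separated, so the accuracy printed here says nothing about real images; the point is only the train, classify, and evaluate flow.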

3.2. ManTra-Net method based on local anomaly feature detection

ManTra-Net consists of two sub-networks: an image manipulation trace feature extractor that creates a unified feature representation, and a local anomaly detection network (LADN) that directly localizes forged regions by learning a decision function that maps the difference between local features and their reference to a forgery label. The two sub-networks are as follows:

  1. Image manipulation trace feature extractor: a feature extraction network trained on an image manipulation classification task. It is sensitive to different manipulation types and encodes the manipulation in a patch as a fixed-dimensional feature vector.
  2. Local anomaly detection network: compares each local feature with the dominant feature averaged over the local area. Its activation depends on how far the local feature deviates from this reference feature, not on the absolute value of the local feature.

ManTra-Net is an end-to-end solution for image forgery detection and localization. It detects forged pixels by identifying local anomalous features, so it is not limited to a specific type of forgery or tampering. It is simple, fast, and highly robust; its limitation is that it cannot accurately detect images containing multiple tampered objects.
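
The idea that the response depends on deviation from a local reference, not on absolute feature values, can be illustrated without any neural network. The sketch below is not the ManTra-Net architecture: it assumes a synthetic feature map, uses a box-filter mean as the reference, and scales the difference by the global standard deviation of each channel. The names `box_mean` and `anomaly_map` are hypothetical.

```python
import numpy as np

def box_mean(feat, radius):
    """Mean of `feat` (H, W, C) over a (2*radius+1)^2 window, with reflection padding at the edges."""
    k = 2 * radius + 1
    padded = np.pad(feat, ((radius, radius), (radius, radius), (0, 0)), mode="reflect")
    # Integral image with a leading zero row and column: window sums are four-corner differences.
    integral = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0), (0, 0)))
    h, w = feat.shape[:2]
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)

def anomaly_map(feat, radius=3, eps=1e-6):
    """Per-pixel norm of (local feature - local reference), scaled by each channel's global spread."""
    reference = box_mean(feat, radius)
    sigma = feat.std(axis=(0, 1), keepdims=True) + eps
    return np.linalg.norm((feat - reference) / sigma, axis=-1)

# Synthetic 32x32 map of 4-D features with a small patch whose features differ.
rng = np.random.default_rng(0)
features = np.tile(np.array([1.0, 0.0, 0.5, 0.2]), (32, 32, 1))
features = features + 0.01 * rng.normal(size=(32, 32, 4))
features[10:14, 20:24] += np.array([0.0, 1.0, -0.5, 0.3])   # hypothetical manipulated patch

score = anomaly_map(features, radius=3)
peak = np.unravel_index(score.argmax(), score.shape)
print(peak)
```

Adding the same constant to every feature vector leaves the score unchanged, because both the difference from the local mean and the standard deviation ignore a global offset.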

3.3. Image authenticity identification model based on HRNet encoder-decoder structure

At this year's World Artificial Intelligence Conference (WAIC 2023), technical staff from Hehe Information presented an image authenticity identification model based on an HRNet encoder-decoder structure.

This network structure is well suited to image authenticity identification because it captures fine detail in the image. In the HRNet-based encoder-decoder structure, the encoder converts the input image into high-dimensional feature vectors and extracts deep feature information, including but not limited to noise, lighting, and spectrum. The decoder converts these feature vectors into a mask for analysis, so that fine-grained visual differences are captured and high-precision identification is achieved.

4. First experience with an image tampering detection method

The following is image tampering detection code implemented in PyTorch. The basic idea is to use a convolutional neural network (CNN) to learn the features of an image, then feed the extracted features into a classifier that decides whether the image has been tampered with.

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from torch.utils.data import DataLoader

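class ImageForgeryDetector(nn.Module):
    """Small CNN that classifies an image as one of two classes (e.g. untampered vs. tampered)."""

    def __init__(self):
        super().__init__()
        # Three 3x3 convolutions; padding=1 keeps the spatial size unchanged.
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        # Each 2x2 max-pool halves the height and width.
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        # 128 * 8 * 8 assumes 8x8 feature maps after three poolings, i.e. a 64x64 input.
        self.fc1 = nn.Linear(128 * 8 * 8, 512)
        self.fc2 = nn.Linear(512, 2)
        self.relu = nn.ReLU()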

    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.pool(x)
        x = self.relu(self.conv2(x))
        x = self.pool(x)
        x = self.relu(self.conv3(x))
        x = self.pool(x)
        x = torch.flatten(x, 1)  # keep the batch dimension, flatten channels and space
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x


def train(model, train_loader, optimizer, criterion, device):
    model.train()
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(train_loader):
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    return running_loss / len(train_loader)

def test(model, test_loader, criterion, device):
    model.eval()
    correct = 0
    total = 0
    running_loss = 0.0
    with torch.no_grad():
        for i, (inputs, labels) in enumerate(test_loader):
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            running_loss += loss.item()
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    accuracy = 100 * correct / total
    return running_loss / len(test_loader), accuracy
    
def main():
    # Set hyperparameters
    batch_size = 32
    learning_rate = 0.001
    num_epochs = 10

    # Load the datasets (ImageFolder expects one sub-directory per class)
    transform = transforms.Compose([
        transforms.Resize((64, 64)),  # 64x64 input gives 8x8 maps after three poolings, matching fc1
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])
    train_dataset = datasets.ImageFolder(root='train', transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    test_dataset = datasets.ImageFolder(root='test', transform=transform)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

    # Initialize the model, optimizer, and loss function
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = ImageForgeryDetector().to(device)
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    criterion = nn.CrossEntropyLoss()

    # Train the model and evaluate it after every epoch
    for epoch in range(num_epochs):
        train_loss = train(model, train_loader, optimizer, criterion, device)
        test_loss, test_accuracy = test(model, test_loader, criterion, device)
        print('Epoch [{}/{}], Train Loss: {:.4f}, Test Loss: {:.4f}, Test Accuracy: {:.2f}%'.format(epoch+1, num_epochs, train_loss, test_loss, test_accuracy))

    # Save the model weights
    torch.save(model.state_dict(), 'model.pth')


if __name__ == '__main__':
    main()
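
The first fully connected layer above expects 128 * 8 * 8 inputs, which ties the network to one input resolution. A quick way to check which resolution that is uses the standard output-size formula for convolution and pooling layers. The helper names below are made up for this sketch, and the layer parameters are copied from the model definition.

```python
def out_size(size, kernel, stride, padding):
    """Spatial size after a convolution or pooling layer (no dilation, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1

def flattened_features(input_size, channels=128, stages=3):
    """Number of values reaching fc1 for a square input of side `input_size`."""
    size = input_size
    for _ in range(stages):
        size = out_size(size, kernel=3, stride=1, padding=1)   # 3x3 conv, padding 1: size unchanged
        size = out_size(size, kernel=2, stride=2, padding=0)   # 2x2 max-pool: size halved
    return channels * size * size

print(flattened_features(64))   # 8192 = 128 * 8 * 8
print(flattened_features(32))   # 2048 = 128 * 4 * 4
```

So a 64x64 input matches `fc1`, while a 32x32 input would produce 128 * 4 * 4 values per image and would not.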

However, the model above is relatively simple. In practice its accuracy is very low, it is prone to false positives and misses, and it cannot localize the tampered area. The accuracy of the algorithm needs further improvement, as does its robustness to the traces left by different types of tampering.


Hehe Information provides a mature PS tampering detection system. The system uses a neural network model to capture the subtle traces left in an image by tampering. Trained on millions of samples, it learns how the statistical characteristics of an image change after tampering and judges whether the image has been altered. It also supports detection of copy-paste, splicing, erasure, and other forms of tampering, including mixed tampering; supports focused detection of key areas in documents such as names, dates of birth, addresses, and photos; marks PS traces; and displays the tampering confidence of image regions as a heat map. Its detection accuracy far exceeds that of traditional detection methods and human judgment.

This year the company upgraded its AI image tampering detection "black technology" and extended it to screenshot tampering detection. It can check many kinds of screenshots, including transfer records, transaction records, and chat records. Whether key elements were cut out of the original image and pasted into another image, or the image was altered by erasing or reprinting, the technology can spot the forgery.

The main difficulty is that, compared with certificate tampering detection, a screenshot has no background texture or background color and no lighting differences across the image, so the model has to mine the fine-grained differences between original and doctored images.


5. Image security technology helps AI do good

In addition to the image tampering detection technology described above, Hehe Information has developed AI-generated image identification technology, which helps individuals and organizations judge whether a picture was generated by AI and prevents "virtual person" fraud. The technology models the relationship between the spatial domain and the frequency domain, and uses multi-dimensional features to distinguish subtle differences between real and generated pictures without having to enumerate generated pictures exhaustively.
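
To make "frequency domain" concrete, the sketch below computes one simple hand-crafted frequency feature with NumPy: the share of spectral energy above a chosen radial frequency. This is not Hehe Information's model, which is not described in enough detail here to reproduce; the function name `high_frequency_ratio`, the `cutoff` value, and the synthetic test images are all assumptions for illustration.

```python
import numpy as np

def high_frequency_ratio(gray, cutoff=0.25):
    """Share of spectral energy at radial frequency above `cutoff` (0 = DC, 0.5 = Nyquist per axis)."""
    spectrum = np.abs(np.fft.fft2(gray.astype(np.float64))) ** 2
    fy = np.fft.fftfreq(gray.shape[0])[:, None]
    fx = np.fft.fftfreq(gray.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    spectrum[0, 0] = 0.0                      # drop the DC term so brightness does not dominate
    total = spectrum.sum()
    return float(spectrum[radius > cutoff].sum() / total) if total > 0 else 0.0

# Synthetic examples: a smooth pattern, and the same pattern with a faint pixel-level checkerboard.
rows, cols = np.mgrid[0:64, 0:64]
smooth = 128 + 50 * np.sin(2 * np.pi * cols / 32)
checker = smooth + 5 * (-1.0) ** (rows + cols)

ratio_smooth = high_frequency_ratio(smooth)
ratio_checker = high_frequency_ratio(checker)
print(ratio_smooth < ratio_checker)
```

The checkerboard is barely visible in the spatial domain at this amplitude, yet it shows up clearly as energy near the highest frequencies, which is the kind of cue a spatial-only view can miss.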

The model uses multiple spatial attention heads, which help it attend to details and spatial features in the image. A texture enhancement module then amplifies subtle artifacts in the shallow features, improving the model's ability to tell real faces from fake ones.


Hehe Information has also explored a new direction and developed OCR anti-attack technology that "encrypts" document images. By perturbing scene text or document text, the technology "attacks" key content such as Chinese characters, English letters, and digits, which prevents third parties from reading and saving all the text in an image through an OCR system, while leaving the image unchanged to the naked eye. It therefore meets the need to share information in everyday life and work while reducing the risk of data leakage. By addressing the ethical issues that generative AI faces, these technologies support its healthy development.


Whether AI is used for good or for evil, lawbreakers and guardians fight countless invisible battles in unseen corners every day, and the technical strength of each side decides which of them gains the upper hand. Image security grows more important by the day, and the introduction of standards is imminent. Together with authoritative institutions such as the China Academy of Information and Communications Technology, Hehe Information is working with top domestic universities, research institutions, and enterprises to explore how AI technology can be deployed credibly in the image field, and to help technology develop for good as it advances.


Reprinted from: blog.csdn.net/air__Heaven/article/details/131731688