Hello everyone, I am Weixue AI. Today I will introduce Application of Computer Vision 5: Picasso-style image transfer using the VGG model. This article uses the VGG model to implement Picasso-style image transfer. We first briefly explain the principle of image style transfer, then implement the Picasso-style transfer algorithm step by step with the PyTorch framework, and finally present experimental results to verify the effectiveness of the algorithm.
Table of contents
1. Introduction
2. The principle of image style transfer
2.1. VGG network
2.2. Content loss
2.3. Style loss
2.4. Total loss
3. Algorithm implementation
4. Summary
1. Introduction
Image style transfer is a technique that applies the artistic style of one image to another, producing images with different artistic styles. Among such methods, CNN-based style transfer is the most common. Its basic idea is to separate the content information and the style information of the input images by adding style-related loss functions to a convolutional neural network, and then recombine the two to generate a new image.
The implementation process of the Picasso style transfer algorithm can be divided into the following steps:
1. Use the convolutional neural network VGG to preprocess the input image and Picasso's artistic-style image. The purpose of this step is to extract the features of both images as the basis for the subsequent transfer.
2. Define two loss functions: content loss and style loss. The content loss preserves the content information of the input image, while the style loss captures the texture, color, and detail information of the Picasso style.
3. Take the weighted sum of the two loss functions as the total loss, and minimize it with an optimization algorithm such as gradient descent, so that Picasso's artistic style is applied to the input image.
4. For a new input image, run the same optimization procedure to perform style transfer.
Note: The loss functions in the Picasso style transfer algorithm rely on a pre-trained VGG network. In addition, the choice of hyperparameters affects the results, such as the weights of the content and style losses, the learning rate, and the number of iterations. Some parameter tuning is therefore usually needed to obtain a good transfer result.
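As a concrete reference, the hyperparameters involved can be collected in one place. These values match the defaults used in the implementation below and are reasonable starting points to tune, not definitive settings:

```python
# Hyperparameters for a style transfer run; values are illustrative
# starting points that match the defaults in the code below.
hyperparams = {
    "content_weight": 1,    # weight of the content loss
    "style_weight": 1e6,    # weight of the style loss (style terms are tiny in scale)
    "learning_rate": 0.01,  # Adam step size for the output image
    "iterations": 600,      # number of optimization steps
    "max_size": 400,        # longest side of the content image, in pixels
}
```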
(Figure: an example of image style transfer.)
2. The principle of image style transfer
2.1. VGG network
We use a pretrained VGG-19 network as a feature extractor, which can capture both the content and the style features of images. The structure of VGG-19 is relatively simple; its name combines the network family with the number of weight layers. It has 19 weight layers in total: 16 convolutional layers and 3 fully connected layers, with 5 max-pooling layers interleaved between the convolutional blocks. For style transfer we only need the convolutional part of the network; the fully connected classifier at the end is not used.
2.2. Content loss
Content loss measures the difference between the feature representations of the output image and of the content image at a chosen layer. We usually use a higher layer, whose features preserve the overall content of the image:

$$L_{content}(\vec{p}, \vec{x}, l) = \frac{1}{2} \sum_{i,j} \left( F^{l}_{ij} - P^{l}_{ij} \right)^{2}$$

where $\vec{p}$ is the content image, $\vec{x}$ is the output image, and $P^{l}$ and $F^{l}$ are their feature representations at layer $l$.
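In code, the content loss is just a mean squared error between two feature maps. A minimal sketch, using random tensors as stand-ins for real VGG features:

```python
import torch

# Stand-in feature maps of the content image (P) and the generated image (F)
# at one VGG layer: batch of 1, 64 channels, 32x32 spatial resolution.
P = torch.randn(1, 64, 32, 32)
F = torch.randn(1, 64, 32, 32)

# Content loss: mean squared difference between the feature representations.
content_loss = torch.mean((F - P) ** 2)
```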
2.3. Style loss
The style loss measures the difference between the feature representations of the output image and of the style image across several layers. We use the Gram matrix to capture style features:

$$G^{l}_{ij} = \sum_{k} F^{l}_{ik} F^{l}_{jk}$$

The style loss at layer $l$, and the total style loss over the chosen layers, are then

$$E_{l} = \frac{1}{4 N_{l}^{2} M_{l}^{2}} \sum_{i,j} \left( G^{l}_{ij} - A^{l}_{ij} \right)^{2}, \qquad L_{style}(\vec{a}, \vec{x}) = \sum_{l} w_{l} E_{l}$$

where $\vec{a}$ is the style image, $A^{l}$ and $G^{l}$ are the Gram matrices of the style image and the output image at layer $l$, $N_{l}$ is the number of channels of layer $l$, and $M_{l}$ is the size (height times width) of its feature maps.
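The Gram matrix computation is the same as the gram_matrix helper defined in the implementation below; a minimal sketch with random stand-in features:

```python
import torch

def gram_matrix(tensor):
    # Flatten the spatial dims; each Gram entry is the inner product
    # between the activations of two channels.
    _, c, h, w = tensor.size()
    flat = tensor.view(c, h * w)
    return torch.mm(flat, flat.t())  # shape (c, c)

style_f = torch.randn(1, 64, 32, 32)   # stand-in style-image features
output_f = torch.randn(1, 64, 32, 32)  # stand-in generated-image features

A = gram_matrix(style_f)
G = gram_matrix(output_f)
# Per-layer style loss, normalized by the squared number of Gram entries.
layer_style_loss = torch.mean((G - A) ** 2) / (G.numel() ** 2)
```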
2.4. Total loss
Our goal is to minimize the weighted sum of the content loss and the style loss:

$$L_{total} = \alpha L_{content} + \beta L_{style}$$

where $\alpha$ and $\beta$ are the weights of the content loss and the style loss, respectively.
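Because the normalized style loss is tiny compared to the content loss, $\beta$ is typically several orders of magnitude larger than $\alpha$. A small numeric illustration with made-up loss values:

```python
import torch

content_loss = torch.tensor(2.0)   # stand-in value for illustration
style_loss = torch.tensor(3e-6)    # style terms are tiny after normalization

content_weight = 1      # alpha
style_weight = 1e6      # beta: large so the style term still matters

total_loss = content_weight * content_loss + style_weight * style_loss
print(total_loss.item())  # 2.0 + 3.0 = 5.0
```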
3. Algorithm implementation
import torch
import torchvision.transforms as transforms
from PIL import Image
def load_image(image_path, max_size=None, shape=None):
    image = Image.open(image_path).convert('RGB')
    if max_size:
        # Scale the longer side down to max_size, keeping the aspect ratio.
        scale = max_size / max(image.size)
        size = tuple(int(dim * scale) for dim in image.size)
        image = image.resize(size, Image.LANCZOS)
    if shape:
        # shape is (height, width) taken from a tensor; PIL expects (width, height).
        image = image.resize((shape[1], shape[0]), Image.LANCZOS)
    transform = transforms.Compose([
        transforms.ToTensor(),
        # Normalize with the ImageNet statistics VGG was trained on;
        # deprocess() below undoes exactly this normalization.
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
    ])
    image = transform(image).unsqueeze(0)
    return image
def deprocess(tensor):
    # Undo the ImageNet normalization applied in load_image, then convert to a PIL image.
    denormalize = transforms.Normalize((-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225),
                                       (1 / 0.229, 1 / 0.224, 1 / 0.225))
    if tensor.dim() == 4:
        # Batch of images: take the first one.
        tensor = tensor[0]
    elif tensor.dim() != 3:
        raise ValueError("Expected input tensor to be 3D or 4D")
    image = denormalize(tensor.clone().detach().cpu())
    # Clamp to [0, 1] so out-of-range pixels do not wrap around during byte conversion.
    image = image.clamp(0, 1)
    return transforms.ToPILImage()(image)
import torch.nn as nn
import torchvision.models as models
class StyleTransferModel(nn.Module):
    def __init__(self, content_layers, style_layers):
        super(StyleTransferModel, self).__init__()
        # Only the convolutional part of VGG-19 is needed for feature extraction.
        self.vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.eval()
        # The VGG weights stay fixed; only the output image is optimized.
        for param in self.vgg.parameters():
            param.requires_grad_(False)
        self.content_layers = content_layers
        self.style_layers = style_layers

    def forward(self, x):
        content_features = []
        style_features = []
        # Walk through VGG layer by layer and collect activations at the
        # requested layers (named by their index in the Sequential container).
        for name, layer in self.vgg.named_children():
            x = layer(x)
            if name in self.content_layers:
                content_features.append(x)
            if name in self.style_layers:
                style_features.append(x)
        return content_features, style_features
def gram_matrix(tensor):
    # Flatten the spatial dimensions; each Gram entry is the inner
    # product between the activations of two channels.
    _, c, h, w = tensor.size()
    tensor = tensor.view(c, h * w)
    gram = torch.mm(tensor, tensor.t())
    return gram
import torch.optim as optim
def style_transfer(content_image_path, style_image_path, output_image_path,
                   max_size=400, content_weight=1, style_weight=1e6, iterations=600):
    content_image = load_image(content_image_path, max_size=max_size)
    style_image = load_image(style_image_path, shape=content_image.shape[-2:])
    # Start the optimization from a copy of the content image.
    output_image = content_image.clone().requires_grad_(True)
    model = StyleTransferModel(content_layers=['10'], style_layers=['0', '2', '5', '7', '12'])
    # Precompute the fixed targets once; detach them so no stale graph is kept.
    content_features = [f.detach() for f in model(content_image)[0]]
    style_features = model(style_image)[1]
    style_grams = [gram_matrix(feature).detach() for feature in style_features]
    optimizer = optim.Adam([output_image], lr=0.01)
    for i in range(iterations):
        content_output_features, style_output_features = model(output_image)
        content_loss = 0.0
        style_loss = 0.0
        for target_feature, output_feature in zip(content_features, content_output_features):
            content_loss += torch.mean((output_feature - target_feature) ** 2)
        for target_gram, output_feature in zip(style_grams, style_output_features):
            output_gram = gram_matrix(output_feature)
            style_loss += torch.mean((output_gram - target_gram) ** 2) / (output_gram.numel() ** 2)
        total_loss = content_weight * content_loss + style_weight * style_loss
        optimizer.zero_grad()
        total_loss.backward()
        optimizer.step()
        if (i + 1) % 5 == 0:
            print(f"Iteration {i + 1}/{iterations}: Loss = {total_loss.item()}")
    output_image = deprocess(output_image)
    output_image.save(output_image_path)
content_image_path = "123.png"
style_image_path = "style.png"
output_image_path = "out.png"
style_transfer(content_image_path, style_image_path, output_image_path)
We only need to supply 123.png as the image to be stylized and style.png as the style image, and the stylized result out.png is generated.
4. Summary
This article has introduced in detail the principle and implementation of CNN-based Picasso-style image transfer, and implemented a simple and effective algorithm with the PyTorch framework. The experimental results show that the method can apply the Picasso style to arbitrary images and generate high-quality artistic works.
Past works:
Deep Learning Practical Projects
1. Deep Learning Practice 1 - (Keras framework) Enterprise data analysis and prediction
2. Deep Learning Practice 2 - (Keras framework) Enterprise credit rating and prediction
3. Deep Learning Practice 3 - Text Convolutional Neural Network (TextCNN) news text classification
5. Deep Learning Practice 5 - Convolutional Neural Network (CNN) Chinese OCR recognition project
7. Deep Learning Practice 7 - Sentiment analysis of e-commerce product reviews
8. Deep Learning Practice 8 - Converting life photos into comic-style photos
9. Deep Learning Practice 9 - Text-to-image generation: running text2img on a local computer
12. Deep Learning Practice 12 (Advanced Edition) - Using Dewarp to correct text distortion
17. Deep Learning Practice 17 (Advanced Edition) - Construction and development case of an intelligent assistant editing platform system
18. Deep Learning Practice 18 (Advanced Edition) - An NLP fusion system of 15 tasks, covering the common NLP tasks on the market
28. Deep Learning Practice 28 - AIGC project: using ChatGPT to generate customized PPT files
(to be continued)