Article directory
- 1. Experiment introduction
- 2. Experimental environment
  - 1. Configure the virtual environment
  - 2. Library version introduction
- 3. Experimental content
  - 0. Import necessary libraries
  - 1. Basic operations of PIL
  - 2~4. Random occlusion, random erasure, linear blending
  - 5. Image synthesis
    - 5.1 Principle
    - 5.2 Implementation
    - 5.3 Effect display
  - 6. Image fusion
    - 6.1 Principles
    - 6.2 Implementation
    - 6.3 Effect display
1. Experiment introduction
In deep learning tasks, data augmentation is one of the key steps for improving a model's generalization ability. Transforming and expanding the training set effectively increases the amount of data and introduces variation between samples, which helps the model adapt to different inputs.
This experiment continues implementing custom image data augmentation operations: image synthesis (pasting a region of one image into another) and image fusion (blending two images with a Gaussian mask).
2. Experimental environment
1. Configure the virtual environment
conda create -n Image python=3.9
conda activate Image
conda install pillow numpy
2. Library version introduction
Software package | Version used in this experiment |
---|---|
numpy | 1.21.5 |
python | 3.9.13 |
pillow | 9.2.0 |
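
To confirm that the active environment matches the versions in the table, the installed versions can be queried at runtime. The following is a minimal sketch that uses only the standard library; the helper name `installed_versions` is hypothetical and not part of the original experiment.

```python
import sys
from importlib import metadata


def installed_versions(packages=("numpy", "pillow")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None instead of raising.
    (Hypothetical helper, not part of the original experiment.)
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(name, version if version is not None else "not installed")
```

If the printed versions differ from the table, the code in this article may still run, but calls such as `Image.Resampling.BICUBIC` require a reasonably recent Pillow release.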
3. Experimental content
0. Import necessary libraries
import numpy as np
from PIL import Image
1. Basic operations of PIL
2~4. Random occlusion, random erasure, linear blending
5. Image synthesis
5.1 Principle
- Input images:
  - Image 1
  - Image 2
- Masking and selection:
  - Occluded region $x$ in image 1: randomly select the region $x$ of image 1 to be occluded (this introduces variability into the training data).
  - Corresponding region $y$ in image 2: select the region $y$ of image 2 that corresponds to the occluded region $x$ of image 1.
- Paste:
  - Paste $y$ onto $x$ in image 1: paste the region $y$ selected from image 2 at the position of the occluded region $x$ in image 1 (simulating an image blending effect).
- Output:
  - Return the augmented image 1, which now contains the pasted region $y$.
5.2 Implementation
class Combine(object):
    def __init__(self, x_start, y_start, x_end, y_end):
        # Pixel coordinates of the rectangular region (left, top, right, bottom)
        self.x_start = x_start
        self.y_start = y_start
        self.x_end = x_end
        self.y_end = y_end

    def __call__(self, img1, img2):
        # Mask out region x of image 1 (NumPy arrays are indexed as [row, column])
        img1_array = np.array(img1)
        img1_array[self.y_start:self.y_end, self.x_start:self.x_end] = 0
        img1_masked = Image.fromarray(img1_array.astype('uint8')).convert('RGB')
        # Select region y of image 2 at the same coordinates as x
        region_y = img2.crop((self.x_start, self.y_start, self.x_end, self.y_end))
        # Paste region y at the location of x in image 1
        img1_masked.paste(region_y, (self.x_start, self.y_start))
        return img1_masked
5.3 Effect display
img1 = Image.open('3.png').convert('RGB')
img2 = Image.open('2.png').convert('RGB')
combine = Combine(628, 128, 1012, 512)
img = combine(img1,img2)
img.save('./combine_image.png')
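The principle in 5.1 calls for a randomly selected region, whereas the example above passes fixed coordinates to `Combine`. One way to obtain random coordinates is sketched below. `random_box` and its parameters are hypothetical names introduced for illustration, and the sketch assumes that both images are at least as large as the requested box.

```python
import random


def random_box(img_width, img_height, box_width, box_height, rng=random):
    """Pick a random box (x_start, y_start, x_end, y_end) inside an image.

    Hypothetical helper: the returned tuple uses the same argument order
    as Combine(x_start, y_start, x_end, y_end).
    """
    if box_width > img_width or box_height > img_height:
        raise ValueError("box must fit inside the image")
    # randint is inclusive on both ends, so the box never leaves the image
    x_start = rng.randint(0, img_width - box_width)
    y_start = rng.randint(0, img_height - box_height)
    return x_start, y_start, x_start + box_width, y_start + box_height
```

Since `img1.size` of a PIL image is `(width, height)`, a call such as `Combine(*random_box(*img1.size, 384, 384))` would keep the 384 by 384 region size used above while placing it at a random location.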
6. Image fusion
6.1 Principles
Create a mask from a Gaussian kernel function and use it to fuse two images.
- Resize sample $x_j$ (2.jpg) to match sample $x_i$ (1.jpg);
- Choose a random position $C$ within $x_i$ (or $x_j$);
- Create a mask $G$ using a 2D standard Gaussian kernel function, with its center aligned to the position $C$ and its size matching that of $x_i$;
- Use $G$ to modify $x_i$ and $1-G$ to modify $x_j$;
- Combine the resulting modifications to obtain $\hat x$;
- Return $\hat x$.
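Written out, the steps above correspond to the following formulas. Here $p$ denotes a pixel position, $\sigma$ is the standard deviation of the Gaussian (the `sigma` parameter of the implementation), and $\odot$ denotes element-wise multiplication; these three symbols are notation introduced here for illustration. The mask is left unnormalized, so $G(C) = 1$ and the pixel at $C$ is taken entirely from $x_i$.

```latex
G(p) = \exp\!\left(-\frac{\lVert p - C \rVert^2}{2\sigma^2}\right),
\qquad
\hat x = G \odot x_i + (1 - G) \odot x_j
```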
6.2 Implementation
class Gaussian(object):
    def __init__(self, sigma):
        # Blending parameter: standard deviation of the Gaussian mask, in pixels
        self.sigma = sigma

    def __call__(self, img1, img2):
        # Choose a random position, labeled as $C$, within $x_i$ (or $x_j$)
        # img1 and img2 are NumPy arrays of shape (height, width, 3)
        self.size = img1.shape[1], img1.shape[0]
        print(self.size)
        x = np.random.randint(0, img1.shape[1])
        y = np.random.randint(0, img1.shape[0])
        position_c = (x, y)
        print(position_c)
        # Create mask $G$ using a 2D standard Gaussian kernel function,
        # ensuring its center aligns with position $C$, and the size of $G$ matches that of $x_i$
        mask_g = self.gaussian_mask(position_c)
        # print(mask_g.shape)
        # Repeat the mask along a new channel axis so it matches the RGB images
        mask_g = np.expand_dims(mask_g, axis=2)
        mask_g = np.repeat(mask_g, 3, axis=2)
        # print(mask_g.shape)
        # Use $G$ to modify $x_i$ and use $1-G$ to modify $x_j$
        # Combine the resulting modifications together as $\hat x$
        hat_x = img1 * mask_g + img2 * (1 - mask_g)
        return hat_x

    def gaussian_mask(self, center):
        # self.size is (width, height); meshgrid returns arrays of shape (height, width)
        x, y = np.meshgrid(np.arange(0, self.size[0]), np.arange(0, self.size[1]))
        d = np.sqrt((x - center[0]) ** 2 + (y - center[1]) ** 2)
        gaussian_mask = np.exp(-(d ** 2 / (2.0 * self.sigma ** 2)))
        return gaussian_mask
6.3 Effect display
# Load the two input images: img1 plays the role of $x_i$, img2 the role of $x_j$
img1 = Image.open('2.png').convert('RGB')
img2 = Image.open('3.png').convert('RGB')
# Resize $x_j$ (img2) to match the size of $x_i$ (img1)
img2 = img2.resize(img1.size, Image.Resampling.BICUBIC)
img1 = np.array(img1)
img2 = np.array(img2)
gaussian = Gaussian(300)
img = gaussian(img1,img2)
img = Image.fromarray(img.astype('uint8')).convert('RGB')
img.save('./gaussian_image.png')
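The behaviour of the mask can be checked on small synthetic arrays without loading any files. The sketch below re-implements the blend as a standalone function with a fixed center instead of a random one. `gaussian_blend` and its arguments are hypothetical names, and the arrays are made-up test data rather than results from the experiment.

```python
import numpy as np


def gaussian_blend(img1, img2, center, sigma):
    """Blend two H x W x 3 arrays with a Gaussian mask centered at (x, y).

    Hypothetical standalone version of the fusion step, with the center
    passed in explicitly so the result is deterministic.
    """
    height, width = img1.shape[:2]
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    dist_sq = (xs - center[0]) ** 2 + (ys - center[1]) ** 2
    mask = np.exp(-dist_sq / (2.0 * sigma ** 2))[..., np.newaxis]
    # The mask broadcasts over the channel axis; the result is float64
    return img1 * mask + img2 * (1.0 - mask)


# Made-up inputs: a white image and a black image, 5 rows by 7 columns
white = np.full((5, 7, 3), 255, dtype=np.uint8)
black = np.zeros((5, 7, 3), dtype=np.uint8)
blended = gaussian_blend(white, black, center=(3, 2), sigma=1.5)

# At the center the mask is exactly 1, so the output equals img1 there
print(blended[2, 3])
# Away from the center the mask decays and img2 contributes more
print(blended[0, 0])
```

Because the blend is computed in floating point, the result has to be cast back to `uint8` before it is converted to a PIL image, as the example above does with `astype('uint8')`.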