[Deep Learning Experiment] Image Processing (4): PIL - Custom image data augmentation operations (image synthesis; image fusion (Gaussian mask))

Article directory

  • 1. Experiment introduction
  • 2. Experimental environment
    • 1. Configure the virtual environment
    • 2. Library version introduction
  • 3. Experimental content
    • 0. Import necessary libraries
    • 1. Basic operations of PIL
    • 2~4. Random occlusion, random erasure, linear blending
    • 5. Image synthesis
      • 5.1 Principle
      • 5.2 Implementation
      • 5.3 Effect display
    • 6. Image fusion
      • 6.1 Principles
      • 6.2 Implementation
      • 6.3 Effect display

1. Experiment introduction

  In deep learning tasks, data augmentation is one of the key steps for improving a model's generalization ability. Transforming and expanding the training set increases the effective amount of data and introduces variation between samples, which helps the model adapt to different inputs.
  This experiment continues implementing custom image data augmentation operations: image synthesis (paste combination) and image fusion (creating a Gaussian mask to fuse two images).

2. Experimental environment

1. Configure the virtual environment

conda create -n Image python=3.9 
conda activate Image
conda install pillow numpy

2. Library version introduction

Package    Version used in this experiment
numpy      1.21.5
python     3.9.13
pillow     9.2.0
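
To confirm that the active environment matches the versions listed above, a small check like the following can help. This is only a sketch using the standard-library `importlib.metadata` module (Python 3.8+); the helper name `installed_version` is illustrative and not part of the original experiment.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Compare the printed values against the versions used in this experiment
for name in ("numpy", "pillow"):
    print(name, installed_version(name))
```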

3. Experimental content

0. Import necessary libraries

import numpy as np
from PIL import Image

1. Basic operations of PIL

[Deep Learning Experiment] Image Processing (1): Python Imaging Library (PIL) library: image reading, writing, copying, pasting, geometric transformation, image enhancement, image filtering

[Deep Learning Experiment] Image Processing (2): Image processing and random image enhancement in PIL and PyTorch (transforms)

2~4. Random occlusion, random erasure, linear blending

[Deep Learning Experiment] Image Processing (3): PIL - Custom image data enhancement operations (random occlusion, erasure, linear mixing)

5. Image synthesis

5.1 Principle

  1. Input images:

    • Image 1
    • Image 2
  2. Masking and selection:

    • Occlude a region $x$ in image 1
      • Randomly select the region $x$ of image 1 to be occluded (this introduces variability into the training data)
    • Select the corresponding region $y$ from image 2
      • Select the region $y$ of image 2 that corresponds to the occluded region $x$ of image 1
  3. Paste:

    • Paste $y$ onto the location of $x$ in image 1:
      • Paste the region $y$ selected from image 2 onto the occluded region $x$ of image 1 (simulating an image blending effect)
  4. Output:

    • Return the augmented image 1, which now contains the pasted region $y$

5.2 Implementation

class Combine(object):
    def __init__(self, x_start, y_start, x_end, y_end):
        # Pixel coordinates of region x: left, top, right, bottom (end-exclusive)
        self.x_start = x_start
        self.y_start = y_start
        self.x_end = x_end
        self.y_end = y_end

    def __call__(self, img1, img2):
        # Mask out region x of image 1 (array rows are y, columns are x)
        img1_array = np.array(img1)
        img1_array[self.y_start:self.y_end, self.x_start:self.x_end] = 0
        img1_masked = Image.fromarray(img1_array.astype('uint8')).convert('RGB')

        # Select region y of image 2, at the same location and size as x
        region_y = img2.crop((self.x_start, self.y_start, self.x_end, self.y_end))

        # Paste region y onto the location of x in image 1
        img1_masked.paste(region_y, (self.x_start, self.y_start))

        return img1_masked
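
Step 2 of the principle picks the occluded region at random, whereas `Combine` takes fixed coordinates. Below is a minimal sketch of one way to draw such a region, assuming a fixed box size. The helper name `random_box`, its parameters, and the example sizes are hypothetical and not from the original code.

```python
import random

def random_box(img_w, img_h, box_w, box_h, rng=random):
    """Return (x_start, y_start, x_end, y_end) for a box lying fully inside the image."""
    if box_w > img_w or box_h > img_h:
        raise ValueError("box must fit inside the image")
    # randint is inclusive on both ends, so the box may touch the right/bottom edge
    x_start = rng.randint(0, img_w - box_w)
    y_start = rng.randint(0, img_h - box_h)
    return x_start, y_start, x_start + box_w, y_start + box_h

# Example: a 384x384 box inside a 1280x720 image (sizes chosen for illustration)
box = random_box(1280, 720, 384, 384, rng=random.Random(0))
print(box)
```

The returned tuple can be unpacked directly, e.g. `Combine(*box)`, provided both images are large enough to contain the box.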

5.3 Effect display

img1 = Image.open('3.png').convert('RGB')
img2 = Image.open('2.png').convert('RGB')
combine = Combine(628, 128, 1012, 512)
img = combine(img1,img2)
img.save('./combine_image.png')


6. Image fusion

6.1 Principles

  Create a mask with a Gaussian kernel function and use it to fuse two images.

  1. Resize sample $x_j$ (2.jpg) to match sample $x_i$ (1.jpg);
  2. Choose a random position $C$ within $x_i$ (or $x_j$);
  3. Create a mask $G$ using a 2D standard Gaussian kernel function, with its center aligned to position $C$ and its size matching that of $x_i$;
  4. Use $G$ to modify $x_i$ and $1-G$ to modify $x_j$;
  5. Combine the resulting modifications to get $\hat{x}$;
  6. Return $\hat{x}$.
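
Written out, steps 3 to 5 can be read as follows. The symbols $C = (c_x, c_y)$ for the center, $(u, v)$ for pixel coordinates, $\sigma$ for the spread of the Gaussian, and $\odot$ for element-wise multiplication are notation chosen here for illustration; the mask is unnormalized, matching the `gaussian_mask` method in 6.2.

```latex
G(u, v) = \exp\!\left(-\frac{(u - c_x)^2 + (v - c_y)^2}{2\sigma^2}\right),
\qquad
\hat{x} = G \odot x_i + (1 - G) \odot x_j
```

Since $G(c_x, c_y) = 1$ and $G$ decays toward 0 away from $C$, $\hat{x}$ equals $x_i$ at $C$ and approaches $x_j$ far from it.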

6.2 Implementation

class Gaussian(object):
    def __init__(self, sigma):
        # Mixing parameter: standard deviation of the Gaussian mask
        self.sigma = sigma

    def __call__(self, img1, img2):
        # Choose a random position, labeled as $C$, within $x_i$ (or $x_j$)
        self.size = img1.shape[1], img1.shape[0]
        print(self.size)
        x = np.random.randint(0, img1.shape[1])
        y = np.random.randint(0, img1.shape[0])
        position_c = (x, y)
        print(position_c)

        # Create mask $G$ using a 2D standard Gaussian kernel function,
        # ensuring its center aligns with position $C$, and the size of $G$ matches that of $x_i$

        mask_g = self.gaussian_mask(position_c)
        # print(mask_g.shape)
        mask_g = np.expand_dims(mask_g, axis=2)
        mask_g = np.repeat(mask_g, 3, axis=2)
        # print(mask_g.shape)

        # Use $G$ to modify $x_i$ and use $1-G$ to modify $x_j$
        # Combine the resulting modifications together as $\hat x$
        hat_x = img1 * mask_g + img2 * (1 - mask_g)
        return hat_x

    def gaussian_mask(self, center):
        x, y = np.meshgrid(np.arange(0, self.size[0]), np.arange(0, self.size[1]))
        d = np.sqrt((x - center[0]) ** 2 + (y - center[1]) ** 2)
        gaussian_mask = np.exp(-(d ** 2 / (2.0 * self.sigma ** 2)))
        return gaussian_mask
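
As a quick sanity check of the blending rule, here is a minimal sketch on synthetic arrays, using a fixed center instead of a random one. It relies on NumPy broadcasting rather than `np.repeat`. The function name `gaussian_blend` and the test arrays are illustrative assumptions, not part of the original class.

```python
import numpy as np

def gaussian_blend(img1, img2, center, sigma):
    """Blend two HxWx3 arrays: img1 dominates near `center` (x, y), img2 far from it."""
    h, w = img1.shape[:2]
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))  # both have shape (h, w)
    d2 = (xs - center[0]) ** 2 + (ys - center[1]) ** 2
    # Shape (h, w, 1) so the mask broadcasts over the channel axis
    mask = np.exp(-d2 / (2.0 * sigma ** 2))[..., np.newaxis]
    return img1 * mask + img2 * (1.0 - mask)

white = np.full((64, 64, 3), 255, dtype=np.uint8)
black = np.zeros((64, 64, 3), dtype=np.uint8)
out = gaussian_blend(white, black, center=(10, 20), sigma=5.0)
print(out[20, 10])              # the pixel at the center comes entirely from img1
print(out[63, 63].max() < 1.0)  # far from the center the result is close to img2
```

Because the result is a convex combination of the two inputs at every pixel, its values stay within the range of the inputs, so casting back to `uint8` does not overflow.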

6.3 Effect display

# Input two images: image 1 (sample $x_i$) and image 2 (sample $x_j$)
img1 = Image.open('2.png').convert('RGB')
img2 = Image.open('3.png').convert('RGB')
# Resize sample $x_j$ (image 2) to match sample $x_i$ (image 1)
img2 = img2.resize(img1.size, Image.Resampling.BICUBIC)
img1 = np.array(img1)
img2 = np.array(img2)
gaussian = Gaussian(300)
img = gaussian(img1,img2)
img = Image.fromarray(img.astype('uint8')).convert('RGB')
img.save('./gaussian_image.png')


Origin blog.csdn.net/m0_63834988/article/details/134717903