[Deep Learning Experiment] Image Processing (3): PIL - Custom image data augmentation operations (random occlusion, erasure, linear mixing)

1. Experiment introduction

  In deep learning tasks, data augmentation is one of the key steps for improving a model's ability to generalize. Transforming and expanding the training set increases the effective amount of data and introduces variation between samples, so the model adapts better to different inputs.
  This experiment implements three custom image data augmentation operations: Cutout (occlusion), Random Erasing (random erasure), and Mixup (linear mixing).

2. Experimental environment

1. Configure the virtual environment

conda create -n Image python=3.9 
conda activate Image
conda install pillow numpy

2. Library versions

Package    Version used in this experiment
numpy      1.21.5
python     3.9.13
pillow     9.2.0

3. Experimental content

0. Import necessary libraries

import numpy as np
from PIL import Image
import random

1. Basic operations of PIL

  The basic PIL operations are covered in the earlier articles of this series:
[Deep Learning Experiment] Image Processing (1): Python Imaging Library (PIL) library: image reading, writing, copying, pasting, geometric transformation, image enhancement, image filtering
[Deep Learning Experiment] Image Processing (2): Image processing and random image enhancement in PIL and PyTorch (transforms)

2. Cutout

2.1 Principle

  Cutout randomly selects one or more square regions of the image and sets their pixel values to zero, which occludes those regions. Training on such images makes the model more robust to missing regions and encourages it to use the rest of the image.
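
  The masking step can be illustrated on its own. The following is a minimal sketch, assuming a fixed hole centre instead of a random one; square_hole_mask is a hypothetical helper name that does not appear in the original code. It shows that a hole near the border is clipped to the image bounds and therefore covers fewer pixels.

```python
import numpy as np

def square_hole_mask(h, w, cy, cx, length):
    """Return an (h, w) float32 mask: 0 inside one square hole, 1 elsewhere.

    Hypothetical helper for illustration; (cy, cx) is the hole centre.
    """
    mask = np.ones((h, w), np.float32)
    # Clip the hole to the image bounds, so holes near an edge come out smaller.
    y1, y2 = np.clip([cy - length // 2, cy + length // 2], 0, h)
    x1, x2 = np.clip([cx - length // 2, cx + length // 2], 0, w)
    mask[y1:y2, x1:x2] = 0.0
    return mask

center = square_hole_mask(8, 8, cy=4, cx=4, length=4)  # hole fully inside the image
corner = square_hole_mask(8, 8, cy=0, cx=0, length=4)  # hole clipped at the top-left corner
print(int((center == 0).sum()), int((corner == 0).sum()))  # prints "16 4"
```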

2.2 Implementation

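class Cutout(object):
    def __init__(self, n_holes, length):
        self.n_holes = n_holes  # number of square holes per image
        self.length = length    # side length of each hole, in pixels

    def __call__(self, img):
        h, w, c = img.shape
        # The mask is 1 where pixels are kept and 0 inside the holes.
        mask = np.ones((h, w), np.float32)

        for _ in range(self.n_holes):
            # Random centre of the hole.
            y = np.random.randint(h)
            x = np.random.randint(w)

            # Clip the hole to the image bounds.
            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)

            mask[y1: y2, x1: x2] = 0.

        # Repeat the (h, w) mask across the c channels.
        mask = np.expand_dims(mask, axis=2)
        mask = np.repeat(mask, c, axis=2)
        img = img * mask  # the result is a float array, not uint8

        return img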
  • Initialization parameters:
    • n_holes (int): number of regions to occlude in each image.
    • length (int): side length of each square region, in pixels.
  • __call__
    • Parameter:
      • img: image array of shape (h, w, c).
    • Returns:
      • The image with n_holes square regions of side length `length` set to zero.
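
  Multiplying by a float32 mask turns a uint8 image into a float array, which is why the result is cast back with astype('uint8') before saving. As an alternative, here is a minimal sketch that zeroes the holes directly on a copy and so keeps the input dtype. The function name cutout_keep_dtype and the rng argument are assumptions made for this example, not part of the original class.

```python
import numpy as np

def cutout_keep_dtype(img, n_holes, length, rng=None):
    """Zero out n_holes squares on a copy of img, keeping its dtype.

    Hypothetical helper; rng is an optional np.random.Generator for reproducibility.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    out = img.copy()  # leave the caller's array untouched
    for _ in range(n_holes):
        y, x = int(rng.integers(h)), int(rng.integers(w))  # random hole centre
        y1, y2 = max(y - length // 2, 0), min(y + length // 2, h)
        x1, x2 = max(x - length // 2, 0), min(x + length // 2, w)
        out[y1:y2, x1:x2] = 0
    return out

img = np.full((32, 32, 3), 200, dtype=np.uint8)
out = cutout_keep_dtype(img, n_holes=2, length=8, rng=np.random.default_rng(0))
print(out.dtype, int((out == 0).sum()) > 0)
```

  Because the output is already uint8, it can be passed to Image.fromarray without a cast.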

2.3 Effect display

img = Image.open('example.jpg').convert('RGB')

# Convert to a NumPy array
img = np.array(img)

# Create a Cutout instance
cutout = Cutout(3, 64)

# Apply the Cutout operation
img_cutout = cutout(img)

# Convert the NumPy array back to a PIL image
img_result = Image.fromarray(img_cutout.astype('uint8')).convert('RGB')

# Save the image
img_result.save('./cutout_image.jpg')


3. Random Erasing

3.1 Principle

  Random Erasing randomly selects a rectangular region of the image and replaces the pixel values in that region with random values. This simulates images that are partially occluded or damaged in real scenes, which improves the model's ability to handle incomplete images.

3.2 Implementation

class RandomErasing(object):
    def __init__(self, region_w, region_h):
        self.region_w = region_w  # width of the erased region, in pixels
        self.region_h = region_h  # height of the erased region, in pixels

    def __call__(self, img):
        # Erase only if the region is strictly smaller than the image.
        # Note that img is modified in place.
        if self.region_w < img.shape[1] and self.region_h < img.shape[0]:
            # Top-left corner of the region (random.randint includes both endpoints).
            x1 = random.randint(0, img.shape[1] - self.region_w)
            y1 = random.randint(0, img.shape[0] - self.region_h)

            # The upper bound of np.random.randint is exclusive, so 256 covers 0..255.
            img[y1:y1+self.region_h, x1:x1+self.region_w, 0] = np.random.randint(0, 256, size=(self.region_h, self.region_w))
            img[y1:y1+self.region_h, x1:x1+self.region_w, 1] = np.random.randint(0, 256, size=(self.region_h, self.region_w))
            img[y1:y1+self.region_h, x1:x1+self.region_w, 2] = np.random.randint(0, 256, size=(self.region_h, self.region_w))

        return img
  • Initialization parameters:
    • region_w: width of the erased region
    • region_h: height of the erased region
  • __call__
    • Parameter:
      • img: image array of shape (h, w, c)
    • Check that the width and height of the erased region are smaller than the width and height of the image
      • Randomly select the top-left corner (x1, y1) of the erased region
      • Fill the erased region of the image with random pixel values
    • Returns:
      • The image after random erasing
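
  The steps above can also be written as a single function that fills all channels at once and works on a copy. This is only a sketch under stated assumptions: the input is a uint8 array, and the name random_erase and the rng argument are hypothetical additions for this example.

```python
import numpy as np

def random_erase(img, region_w, region_h, rng=None):
    """Return a copy of a uint8 image with one random-noise rectangle.

    Hypothetical helper; works for (h, w) and (h, w, c) arrays.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    out = img.copy()
    # Same size check as the class above: skip regions that do not fit.
    if region_w >= w or region_h >= h:
        return out
    # Generator.integers excludes the upper bound, hence the "+ 1".
    x1 = int(rng.integers(0, w - region_w + 1))
    y1 = int(rng.integers(0, h - region_h + 1))
    # One noise patch covering every channel (if there is a channel axis).
    patch_shape = (region_h, region_w) + img.shape[2:]
    out[y1:y1 + region_h, x1:x1 + region_w] = rng.integers(
        0, 256, size=patch_shape, dtype=np.uint8)
    return out

img = np.zeros((20, 30, 3), dtype=np.uint8)
erased = random_erase(img, region_w=10, region_h=5, rng=np.random.default_rng(0))
gray = random_erase(np.zeros((20, 30), dtype=np.uint8), 10, 5, rng=np.random.default_rng(1))
print(erased.shape, gray.shape)
```

  Returning a copy keeps the original array available for other augmentations, at the cost of one extra allocation per call.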

3.3 Effect display

img = Image.open('example.jpg').convert('RGB')
img = np.array(img)

# Create a Random Erasing instance
random_erasing = RandomErasing(region_w=150, region_h=200)

# Apply the Random Erasing operation
img_erasing = random_erasing(img)

img_result = Image.fromarray(img_erasing.astype('uint8')).convert('RGB')
img_result.save('./erasing_image.jpg')


4. Mixup

4.1 Principle

  Mixup takes two images and blends them linearly with a mixing ratio to produce a new image. Mixing samples in this way increases the diversity of the training set and helps the model adapt to different inputs.
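
  The blend can be written as a formula. The symbols below are chosen for this note and follow the implementation in this section: x_1 and x_2 are the two images, \lambda is the mixing ratio, and \alpha is the parameter of the Beta distribution from which \lambda is drawn.

```latex
\tilde{x} = \lambda\, x_1 + (1 - \lambda)\, x_2,
\qquad \lambda \sim \mathrm{Beta}(\alpha, \alpha),
\qquad \lambda \in [0, 1]
```

  For example, with \lambda = 0.25, pixel values 100 and 200 blend to 0.25 * 100 + 0.75 * 200 = 175.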

4.2 Implementation

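class Mixup(object):
    def __init__(self, alpha):
        self.alpha = alpha
        # lam is drawn once here, so every call on this instance reuses the same ratio.
        self.lam = np.random.beta(self.alpha, self.alpha)

    def __call__(self, img1, img2):
        # img1 and img2 must have the same shape; the result is a float array.
        img = self.lam * img1 + (1 - self.lam) * img2
        return img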
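  • Initialization parameters:
    • alpha: parameter of the Beta(alpha, alpha) distribution that controls the blending ratio
    • lam: blending ratio, drawn once from the Beta distribution when the instance is created
  • __call__
    • Parameters:
      • img1, img2: image arrays of shape (h, w, c); both must have the same shape.
    • Linearly blends the two images using the ratio lam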
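
  In the class above the ratio is fixed when the instance is created. If a different ratio is wanted for every pair of images, one possible variant draws lam inside __call__ and returns it with the blended image. This is a sketch, not the article's implementation; the class name MixupPerCall and the rng argument are assumptions made for this example.

```python
import numpy as np

class MixupPerCall:
    """Mixup variant that draws a new blending ratio on every call (hypothetical name)."""

    def __init__(self, alpha, rng=None):
        self.alpha = alpha
        self.rng = np.random.default_rng() if rng is None else rng

    def __call__(self, img1, img2):
        if img1.shape != img2.shape:
            raise ValueError("img1 and img2 must have the same shape")
        # Draw a fresh ratio from Beta(alpha, alpha) for this pair.
        lam = float(self.rng.beta(self.alpha, self.alpha))
        mixed = lam * img1.astype(np.float32) + (1.0 - lam) * img2.astype(np.float32)
        return mixed, lam

a = np.zeros((4, 4, 3), dtype=np.uint8)
b = np.full((4, 4, 3), 100, dtype=np.uint8)
mix = MixupPerCall(alpha=0.6, rng=np.random.default_rng(0))
mixed1, lam1 = mix(a, b)
mixed2, lam2 = mix(a, b)
print(lam1 != lam2)  # the two calls use different ratios
```

  Returning lam lets the caller reuse the same ratio elsewhere, for example when the corresponding labels are blended in the same proportion during training.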

4.3 Effect display

  The Mixup operation is applied to two input images, example2.jpg and example3.jpg:

# Read the two images
img1 = Image.open('example2.jpg').convert('RGB')
img2 = Image.open('example3.jpg').convert('RGB')

# Resize both images to the same size
img1 = img1.resize((1920, 1080), Image.Resampling.BICUBIC)
img2 = img2.resize((1920, 1080), Image.Resampling.BICUBIC)

# Convert to NumPy arrays
img1 = np.array(img1)
img2 = np.array(img2)

# Create a Mixup instance
mixup = Mixup(0.6)

# Apply the Mixup operation
img_mixup = mixup(img1, img2)

# Convert the NumPy array back to a PIL image
img_result = Image.fromarray(img_mixup.astype('uint8')).convert('RGB')

# Save the image
img_result.save('./mixup_image.jpg')
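
  The blended array holds floating-point values, and astype('uint8') truncates them toward zero. If rounding to the nearest integer is preferred, a small conversion helper can be used before Image.fromarray. The following is a sketch; to_uint8 is a hypothetical name, not something defined in the original code.

```python
import numpy as np

def to_uint8(arr):
    """Round to the nearest integer, clip to 0..255, then cast (hypothetical helper)."""
    return np.clip(np.rint(arr), 0, 255).astype(np.uint8)

blend = np.array([[127.6, 254.9, 0.4]])
print(blend.astype('uint8').tolist())  # [[127, 254, 0]]  (truncated)
print(to_uint8(blend).tolist())        # [[128, 255, 0]]  (rounded)
```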


Origin blog.csdn.net/m0_63834988/article/details/134710600