1. Experiment introduction
In deep learning, data augmentation is a key step for improving a model's generalization ability. Transforming and expanding the training set effectively increases the amount of data and introduces variation between samples, which helps the model adapt to a wider range of inputs.
This experiment implements custom image data augmentation operations: Cutout (occlusion), Random Erasing, and Mixup (blending).
2. Experimental environment
1. Configure the virtual environment
conda create -n Image python=3.9
conda activate Image
conda install pillow numpy
2. Library versions
Package | Version used in this experiment |
---|---|
numpy | 1.21.5 |
python | 3.9.13 |
pillow | 9.2.0 |
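To compare a local environment against the table above, the installed versions can be queried from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is introduced here for illustration and is not part of the original experiment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names as used in the conda command above.
    for name in ("numpy", "pillow"):
        print(name, installed_version(name))
```

Other versions may well work; the only version-sensitive call in this article is `Image.Resampling`, which is available in the Pillow version listed above.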
3. Experimental content
0. Import necessary libraries
import numpy as np
from PIL import Image
import random
1. Basic operations of PIL
[Deep Learning Experiment] Image Processing (1): Python Imaging Library (PIL) library: image reading, writing, copying, pasting, geometric transformation, image enhancement, image filtering
[Deep Learning Experiment] Image Processing (2): Image processing and random image enhancement in PIL and PyTorch (transforms)
2. Cutout
2.1 Principle
Cutout randomly selects one or more square regions of the image and sets their pixel values to zero, which simulates occlusion. This makes the model more robust to missing regions and encourages it to rely on other parts of the image.
2.2 Implementation
class Cutout(object):
    def __init__(self, n_holes, length):
        self.n_holes = n_holes
        self.length = length

    def __call__(self, img):
        h, w, c = img.shape
        mask = np.ones((h, w), np.float32)
        for _ in range(self.n_holes):
            # Random centre of the square
            y = np.random.randint(h)
            x = np.random.randint(w)
            # Clip the square to the image bounds
            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)
            mask[y1: y2, x1: x2] = 0.
        # Repeat the mask across all channels
        mask = np.expand_dims(mask, axis=2)
        mask = np.repeat(mask, c, axis=2)
        img = img * mask
        return img
- Initialization parameters:
  - n_holes (int): The number of regions to occlude in each image.
  - length (int): The side length of each square region, in pixels.
- __call__:
  - Parameter:
    - img: Image array of size (h, w, c).
  - Returns:
    - The image with n_holes square regions of side length `length` cut out.
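Because the mask in the class above is `float32`, the product `img * mask` is also `float32`, which is why the demo later casts back with `astype('uint8')`. The following is a minimal sketch of a dtype-preserving variant, not the author's implementation: the function name `cutout_uint8` and the use of a seeded `numpy.random.Generator` are assumptions made here so that the example is reproducible and runs on a synthetic image.

```python
import numpy as np


def cutout_uint8(img, n_holes, length, rng):
    """Zero out n_holes squares of side `length`, keeping the input dtype.

    `rng` is a numpy.random.Generator so results are reproducible.
    """
    h, w = img.shape[:2]
    out = img.copy()  # leave the caller's array untouched
    for _ in range(n_holes):
        # Pick the centre of the square anywhere in the image.
        y = int(rng.integers(h))
        x = int(rng.integers(w))
        # Clip the square to the image bounds; a square centred near an
        # edge therefore covers fewer than length * length pixels.
        y1, y2 = max(y - length // 2, 0), min(y + length // 2, h)
        x1, x2 = max(x - length // 2, 0), min(x + length // 2, w)
        out[y1:y2, x1:x2] = 0
    return out


rng = np.random.default_rng(0)
img = np.full((32, 32, 3), 255, dtype=np.uint8)  # synthetic white image
result = cutout_uint8(img, n_holes=2, length=8, rng=rng)
zeroed = int((result[..., 0] == 0).sum())
print(result.dtype, zeroed)
```

Assigning zeros through slicing avoids building a full-size mask and keeps the array in `uint8`, at the cost of copying the input once.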
2.3 Effect display
img = Image.open('example.jpg').convert('RGB')
# Convert to a NumPy array
img = np.array(img)
# Create a Cutout instance: 3 holes, each 64 pixels wide
cutout = Cutout(3, 64)
# Apply Cutout
img_cut = cutout(img)
# Convert the NumPy array back to a PIL image
img_result = Image.fromarray(img_cut.astype('uint8')).convert('RGB')
# Save the image
img_result.save('./cutout_image.jpg')
3. Random Erasing
3.1 Principle
The Random Erasing operation randomly selects a rectangular area in the image and erases the pixel values in that area, replacing them with random values. This operation simulates the situation where the image may be partially occluded or damaged in real scenes, thereby improving the model's adaptability to incomplete images.
3.2 Implementation
class RandomErasing(object):
    def __init__(self, region_w, region_h):
        self.region_w = region_w
        self.region_h = region_h

    def __call__(self, img):
        # Erase only if the region fits inside the image
        if self.region_w < img.shape[1] and self.region_h < img.shape[0]:
            # Top-left corner of the erased region
            x1 = random.randint(0, img.shape[1] - self.region_w)
            y1 = random.randint(0, img.shape[0] - self.region_h)
            # np.random.randint excludes the upper bound, so 256 makes 255 reachable
            img[y1:y1+self.region_h, x1:x1+self.region_w, 0] = np.random.randint(0, 256, size=(self.region_h, self.region_w))
            img[y1:y1+self.region_h, x1:x1+self.region_w, 1] = np.random.randint(0, 256, size=(self.region_h, self.region_w))
            img[y1:y1+self.region_h, x1:x1+self.region_w, 2] = np.random.randint(0, 256, size=(self.region_h, self.region_w))
        # Note: img is modified in place
        return img
- Initialization parameters:
  - region_w: Width of the erased region.
  - region_h: Height of the erased region.
- __call__:
  - Parameter:
    - img: Image array of size (h, w, c).
  - Steps:
    - Check that the width and height of the erased region are less than the width and height of the image.
    - Randomly select the coordinates $(x_1, y_1)$ of the upper-left corner of the erased region.
    - Generate random pixel values and write them into the erased region of the image.
  - Returns:
    - The image after random erasing.
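The class above writes into the array it receives and fills one channel at a time. As a hedged alternative, the steps listed above can be sketched as a function that works on a copy and fills all channels in one call. The name `random_erase` and the `rng` argument (a `numpy.random.Generator`) are assumptions introduced here, and the input is a synthetic image rather than `example.jpg`.

```python
import numpy as np


def random_erase(img, region_w, region_h, rng):
    """Return a copy of `img` with one region_h x region_w patch of uniform noise."""
    h, w, c = img.shape
    if region_w >= w or region_h >= h:
        return img.copy()  # region does not fit: nothing is erased
    # Top-left corner, chosen so the whole region stays inside the image.
    x1 = int(rng.integers(0, w - region_w + 1))
    y1 = int(rng.integers(0, h - region_h + 1))
    out = img.copy()
    # integers() excludes the upper bound, so 256 makes 255 reachable.
    out[y1:y1 + region_h, x1:x1 + region_w] = rng.integers(
        0, 256, size=(region_h, region_w, c), dtype=np.uint8
    )
    return out


rng = np.random.default_rng(0)
img = np.zeros((20, 30, 3), dtype=np.uint8)  # synthetic black image
out = random_erase(img, region_w=10, region_h=5, rng=rng)
changed = np.any(out != img, axis=2)  # pixels that differ from the input
ys, xs = np.nonzero(changed)
print(out.dtype, int(changed.sum()))
```

Returning a copy keeps the original image available, which matters if the same array is reused for several augmentations.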
3.3 Effect display
img = Image.open('example.jpg').convert('RGB')
img = np.array(img)
# Create a Random Erasing instance
random_erasing = RandomErasing(region_w=150, region_h=200)
# Apply Random Erasing
img_erasing = random_erasing(img)
img_result = Image.fromarray(img_erasing.astype('uint8')).convert('RGB')
img_result.save('./erasing_image.jpg')
4. Mixup
4.1 Principle
Mixup takes two images and blends them linearly with a mixing ratio to produce a new image. Mixing samples in this way increases the diversity of the training set and helps the model adapt to a wider range of inputs.
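Written as a formula, the blend looks as follows. The symbols $\tilde{x}$, $x_1$ and $x_2$ are notation introduced here for the mixed image and the two inputs; $\lambda$ and $\alpha$ correspond to `lam` and `alpha` in the implementation below.

```latex
\tilde{x} = \lambda x_1 + (1 - \lambda)\, x_2,
\qquad \lambda \sim \mathrm{Beta}(\alpha, \alpha),
\quad \lambda \in [0, 1]
```

With $\lambda = 1$ the result is the first image unchanged, and with $\lambda = 0$ it is the second.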
4.2 Implementation
class Mixup(object):
    def __init__(self, alpha):
        self.alpha = alpha
        # lam is drawn once here, so every call on this instance uses the same ratio
        self.lam = np.random.beta(self.alpha, self.alpha)

    def __call__(self, img1, img2):
        img = self.lam * img1 + (1 - self.lam) * img2
        return img
- Initialization parameters:
  - alpha: Parameter of the Beta distribution that controls the blending ratio.
  - lam: The blending ratio, a random value drawn from Beta(alpha, alpha).
- __call__:
  - Parameters:
    - img1, img2: Image arrays of the same size (h, w, c).
  - Linearly blends the two images using the blending ratio lam.
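In the class above, `lam` is sampled in `__init__`, so one instance always blends with the same ratio. If a fresh ratio per pair is wanted, one possible variant is sketched below. The function name `mixup_images`, the `rng` argument and the returned `lam` are assumptions introduced here, not part of the original code.

```python
import numpy as np


def mixup_images(img1, img2, alpha, rng):
    """Blend two same-shaped images with a ratio drawn from Beta(alpha, alpha).

    Returns the blended float32 image and the ratio that was used.
    """
    if img1.shape != img2.shape:
        raise ValueError("img1 and img2 must have the same shape")
    lam = float(rng.beta(alpha, alpha))  # fresh ratio on every call
    mixed = lam * img1.astype(np.float32) + (1.0 - lam) * img2.astype(np.float32)
    return mixed, lam


rng = np.random.default_rng(0)
a = np.zeros((4, 4, 3), dtype=np.uint8)      # synthetic black image
b = np.full((4, 4, 3), 200, dtype=np.uint8)  # synthetic grey image
mixed, lam = mixup_images(a, b, alpha=0.6, rng=rng)
print(lam, mixed[0, 0, 0])
```

The ratio is returned alongside the image because, when Mixup is used for training, the labels of the two samples are normally blended with the same ratio.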
4.3 Effect display
Apply the Mixup operation to two images:
# Read the two images
img1 = Image.open('example2.jpg').convert('RGB')
img2 = Image.open('example3.jpg').convert('RGB')
# Resize both images to the same size so they can be blended
img1 = img1.resize((1920, 1080), Image.Resampling.BICUBIC)
img2 = img2.resize((1920, 1080), Image.Resampling.BICUBIC)
# Convert to NumPy arrays
img1 = np.array(img1)
img2 = np.array(img2)
# Create a Mixup instance
mixup = Mixup(0.6)
# Apply Mixup
img_mixup = mixup(img1, img2)
# Convert the NumPy array back to a PIL image
img_result = Image.fromarray(img_mixup.astype('uint8')).convert('RGB')
# Save the image
img_result.save('./mixup_image.jpg')