[Semantic Segmentation] Data augmentation method (augmenting the original image and its label together)

1. Why use data augmentation

   Avoid overfitting

   Improve the robustness of the model

   Improve the generalization ability of the model

   Mitigate the problem of unbalanced samples

2. Types of data augmentation

Data augmentation can be divided into two categories: online and offline. Offline augmentation processes the dataset before training and usually multiplies the size of the dataset. Online augmentation transforms the data as it is loaded during training, without changing the amount of stored training data.

Offline augmentation is generally used for small datasets, when training data is insufficient; online augmentation is generally used for large datasets.
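The difference between the two modes can be sketched in a few lines. This is only a toy illustration: images are plain nested lists, the only transform is a left-right flip, and the names hflip, offline_augment and online_epoch are made up for this example.

```python
import random

def hflip(image):
    """Flip a 2D image (a list of rows) left to right."""
    return [row[::-1] for row in image]

def offline_augment(dataset):
    """Offline: build an enlarged dataset once, before training starts."""
    return dataset + [hflip(img) for img in dataset]

def online_epoch(dataset, rng):
    """Online: transform each sample at random as it is loaded; the size is unchanged."""
    for img in dataset:
        yield hflip(img) if rng.random() < 0.5 else img

dataset = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
enlarged = offline_augment(dataset)
one_epoch = list(online_epoch(dataset, random.Random(0)))
print(len(dataset), len(enlarged), len(one_epoch))  # 2 4 2
```

The offline version stores twice as many samples, while the online version yields the same number of samples in every epoch but with different random variations each time.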

3. Method

Commonly used geometric transformations include flipping, rotation, cropping, scaling, and translation.

Commonly used pixel transformations include color jitter, contrast adjustment, salt-and-pepper noise, Gaussian noise, Gaussian blur, HSV adjustment, brightness and saturation adjustment, histogram equalization, and white balance adjustment.

Augmentation with the Augmentor module
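For segmentation the two families behave differently: a geometric transformation moves pixels, so it has to be applied identically to the image and to its mask, whereas a pixel transformation only changes values and is normally applied to the image alone. The sketch below illustrates this with NumPy on synthetic data; flip_pair and add_salt_and_pepper are illustrative helpers written for this post, not part of any library.

```python
import numpy as np

def flip_pair(image, mask):
    """Geometric transform: apply the same left-right flip to image and mask."""
    return image[:, ::-1].copy(), mask[:, ::-1].copy()

def add_salt_and_pepper(image, amount, rng):
    """Pixel transform: set a random fraction of image pixels to white or black."""
    noisy = image.copy()
    hits = rng.random(image.shape[:2]) < amount   # which pixels get corrupted
    salt = rng.random(image.shape[:2]) < 0.5      # white (salt) or black (pepper)
    noisy[hits & salt] = 255
    noisy[hits & ~salt] = 0
    return noisy

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
mask = np.zeros((4, 6), dtype=np.uint8)
mask[:, :2] = 1  # the object occupies the two left columns

flipped_image, flipped_mask = flip_pair(image, mask)
# Noise is added to the image only; the mask keeps its clean class labels.
noisy_image = add_salt_and_pepper(flipped_image, amount=0.1, rng=rng)
```

After the flip the object sits in the two right columns of the mask, matching its new position in the image.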

Note:
The file names of the original image and the label image, including the suffix, must be consistent; otherwise only the original image is augmented and the label image is not.

My images were annotated with labelme and converted to VOC format. After conversion the original images are jpg and the label images are png, so the suffixes need to be unified. The method is to modify the image suffix in batches.
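One possible way to change the suffix in batches is a short script using only the standard library. This is a sketch under assumptions: unify_suffix is a hypothetical helper, and the folder name is taken from the code later in this post. It only renames files and does not re-encode them; if a true format conversion is needed, re-save the images with an image library instead.

```python
from pathlib import Path

def unify_suffix(folder, old_suffix=".jpg", new_suffix=".png"):
    """Rename every file ending in old_suffix so that it ends in new_suffix."""
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() == old_suffix:
            target = path.with_suffix(new_suffix)
            if target.exists():
                # Refuse to overwrite an existing file with the same stem.
                raise FileExistsError(f"{target} already exists")
            path.rename(target)
            renamed.append(target.name)
    return renamed

if __name__ == "__main__":
    # "originalImages" is the image folder used in the examples below.
    folder = Path("originalImages")
    if folder.is_dir():
        print(unify_suffix(folder))
```

Only the original images are renamed here, so the label images keep their lossless png encoding and their class values are not altered.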

1. Installation:
Create an environment, then run one of the following installation commands:

pip install Augmentor

conda install Augmentor

If the installation succeeds, you can continue.
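A quick way to confirm that the package is visible to the current Python environment is to look it up without importing it. The helper name is_installed is made up for this sketch.

```python
import importlib.util

def is_installed(module_name):
    """Return True if the named top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None

print("Augmentor installed:", is_installed("Augmentor"))
```

If this prints False, check that the command was run inside the same environment that was used for the installation.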

2. Usage:

Semantic segmentation requires augmenting the original image and its mask at the same time, with identical transformations. The image augmentation tools built into many existing deep learning frameworks therefore cannot be used directly, but Augmentor makes this easy. In the following example, the original images and their corresponding masks are placed in the originalImages and Segmentationimages folders respectively, then augmented with the code below.

[Figure: the original image]

[Figure: the label map]

# Import the data augmentation tool
import Augmentor

# Set the storage paths of the original images and the label images, and create the Pipeline instance p
p = Augmentor.Pipeline("originalImages")
p.ground_truth("Segmentationimages")
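Since the note above says that image and label file names must agree, it can help to check the two folders before running the pipeline. The following is a minimal sketch with the standard library; unmatched_images is a hypothetical helper, and the folder names are the ones used in this post.

```python
from pathlib import Path

def unmatched_images(image_dir, mask_dir):
    """Return image file names that have no identically named file in mask_dir."""
    mask_names = {p.name for p in Path(mask_dir).iterdir() if p.is_file()}
    return sorted(
        p.name
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.name not in mask_names
    )

if __name__ == "__main__":
    images, masks = Path("originalImages"), Path("Segmentationimages")
    if images.is_dir() and masks.is_dir():
        missing = unmatched_images(images, masks)
        print("images without a matching label:", missing)
```

An empty list means every original image has a label image with the same name and suffix.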

(1) Rotate

probability specifies the probability that the operation is applied. max_left_rotation and max_right_rotation specify the maximum rotation angle to the left and to the right; the largest allowed value is 25. sample specifies how many augmented images to generate in total, and any number can be given.

By default, rotate crops the image after rotating it and outputs an augmented image of the same size as the original.

p.rotate(probability=1, max_left_rotation=25, max_right_rotation=25)
p.sample(1)

(2) Zoom (scale); only proportional enlargement is supported

scale_factor is the zoom ratio. It must be greater than 1, and width and height are enlarged by the same ratio.

p.scale(probability=1, scale_factor=1.3)

(3) Flip (flip)

Flip left-right, flip top-bottom, or flip in a random direction

p.flip_random(probability=1)   # random flip
p.flip_left_right(probability=0.5)   # left-right flip
p.flip_top_bottom(probability=0.5)    # top-bottom flip

(4) Random brightness increase/decrease (random_brightness)

min_factor and max_factor bound the change factor, which determines how strongly the brightness changes; choose them according to the visual effect. random_color and random_contrast take the same parameters.

p.random_brightness(probability=1, min_factor=0.7, max_factor=1.2)   # random brightness
p.random_color(probability=1, min_factor=0.0, max_factor=1)   # random color
p.random_contrast(probability=1, min_factor=0.7, max_factor=1.2)   # random contrast

(5) Random perspective deformation (skew)

magnitude indicates the degree of deformation. There is also a hidden parameter, skew_type, whose value is ``TILT``, ``TILT_TOP_BOTTOM``, ``TILT_LEFT_RIGHT``, or ``CORNER``; it is visible only in the source code. p.skew does not take it as an argument, and the source code picks one of the four values at random.

``TILT_TOP_BOTTOM`` means that the perspective deformation is performed only in the top and bottom directions.

``TILT_LEFT_RIGHT`` means that the perspective deformation is performed only in the left and right directions.

``CORNER`` means that a single corner of the image is skewed, along either the x-axis or the y-axis, which gives eight possible directions.

``TILT`` covers both tilt types above, that is, a random tilt in one of the four directions up, down, left, or right.

p.skew(probability=1, magnitude=0.8)

(6) Random shear (shear)

max_shear_left and max_shear_right are the maximum shear angles to the left and to the right.

p.shear(probability=1, max_shear_left=15, max_shear_right=15)

(7) Random cropping (crop_random)

percentage_area indicates the ratio of the cropped area to the original image area, centre specifies whether to crop from the center of the image, and randomise_percentage_area specifies whether the cropping ratio itself is chosen at random.

p.crop_random(probability=1, percentage_area=0.8, centre=False, randomise_percentage_area=True)

(8) Random erasing/occlusion (random_erasing)

rectangle_area specifies the fraction of the image that the randomly erased rectangle may cover; it is an upper limit on the erased area.

p.random_erasing(probability=1, rectangle_area=0.5)

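(9) Small-block distortion (random_distortion)

grid_width and grid_height set the number of grid cells across and down the image, and magnitude sets the strength of the distortion.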

p.random_distortion(probability=0.8,grid_width=10,grid_height=10, magnitude=20)

Full code:

import Augmentor


# Set the storage paths of the original images and the masks; use "/" rather than "\" in the paths
p = Augmentor.Pipeline("originalImages")
p.ground_truth("Segmentationimages")

# Rotation: applied with probability 0.8, angle between 0 and 25 degrees
p.rotate(probability=0.8, max_left_rotation=25, max_right_rotation=25)

# Left-right and top-bottom flips: each applied with probability 0.5
p.flip_left_right(probability=0.5)
p.flip_top_bottom(probability=0.5)

# Random zoom: applied with probability 0.3, on an area 0.85 times the original image
p.zoom_random(probability=0.3, percentage_area=0.85)

# scale_factor is the zoom ratio; it must be greater than 1, and the enlargement is proportional
p.scale(probability=1, scale_factor=1.3)

# Small-block distortion
p.random_distortion(probability=0.8,grid_width=10,grid_height=10, magnitude=20)

# Random brightness increase/decrease; min_factor and max_factor bound the change factor and can be tuned by eye
p.random_brightness(probability=1, min_factor=0.7, max_factor=1.2)

# Random color/contrast increase/decrease
#p.random_color(probability=1, min_factor=0.0, max_factor=1)
p.random_contrast(probability=1, min_factor=0.7, max_factor=1.2)

# Random shear (shear); max_shear_left and max_shear_right are the shear angles, range 0-25
p.shear(probability=1, max_shear_left=10, max_shear_right=10)

# Random cropping (crop_random)
p.crop_random(probability=1, percentage_area=0.8, randomise_percentage_area=True)

# Random flip (flip_random)
p.flip_random(probability=1)

# Number of augmented samples to generate; can be changed to 100, 1000, etc.
p.sample(10)

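An output folder is generated automatically inside the original image folder. It contains the augmented images and the augmented label images mixed together. The effect is as follows:

You can then separate them manually.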
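Sorting the files by hand becomes tedious for large sample counts, so the split can also be scripted. The sketch below assumes that the augmented label images can be recognized by a file name prefix; "_groundtruth_" is the prefix I would expect from Augmentor, but check the names in your own output folder and adjust mask_prefix if they differ. split_output and the target folder names are hypothetical.

```python
import shutil
from pathlib import Path

def split_output(output_dir, image_dir, mask_dir, mask_prefix="_groundtruth_"):
    """Move files from output_dir into image_dir or mask_dir based on a name prefix."""
    image_dir, mask_dir = Path(image_dir), Path(mask_dir)
    image_dir.mkdir(parents=True, exist_ok=True)
    mask_dir.mkdir(parents=True, exist_ok=True)
    counts = {"images": 0, "masks": 0}
    for path in sorted(Path(output_dir).iterdir()):
        if not path.is_file():
            continue  # skip sub-folders
        if path.name.startswith(mask_prefix):  # assumed naming of augmented labels
            shutil.move(str(path), str(mask_dir / path.name))
            counts["masks"] += 1
        else:
            shutil.move(str(path), str(image_dir / path.name))
            counts["images"] += 1
    return counts

if __name__ == "__main__":
    out = Path("originalImages/output")  # assumed default output location
    if out.is_dir():
        print(split_output(out, "aug_images", "aug_masks"))
```

The two counts should be equal after a run; if they are not, some label images were probably not matched to their originals.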

 

Origin blog.csdn.net/weixin_45912366/article/details/127855494