TensorFlow 2: common data augmentation methods and their implementation

When building models for computer vision, we often need to augment the input images. Augmentation reduces the model's overfitting to the training data and improves its performance. In real projects, such as industrial defect detection or medical imaging, the data we can obtain is inherently limited, so improving model performance through data augmentation is extremely useful; sometimes image augmentation is the only practical way to build a training set of sufficient size.


Table of Contents

1. Supervised data augmentation

1.1 Single-sample augmentation

1.1.1 Geometric transformations

1.1.2 Pixel/color transformations

1.2 Multi-sample synthesis

1.2.1 SMOTE

1.2.2 SamplePairing

1.2.3 mixup

1.3 In practice: implementing data augmentation in tf.data

1.3.1 Using the imgaug library with TensorFlow

1.3.2 Implementing mixup augmentation in TensorFlow

2. Unsupervised data augmentation

2.1 GAN

2.2 AutoAugment


First, a powerful augmentation tool that covers essentially every method you might think of: imgaug. imgaug is a library for image augmentation in machine learning experiments. It supports a wide range of augmentation techniques, lets you combine them easily and execute them in random order or on multiple CPU cores, and offers a simple yet powerful stochastic interface. It can augment not only images but also keypoints/landmarks, bounding boxes, heatmaps, and segmentation maps.

Install imgaug with:

pip install imgaug

Below, we first introduce the commonly used augmentation methods, then show how to apply them in TensorFlow 2.

Basic usage looks like this:

import cv2
import imgaug.augmenters as iaa

seq = iaa.Sequential([ 
    iaa.Fliplr(0.5), # horizontally flip 50% of the images
])

img = cv2.imread("D:\\PycharmProjects\\my_project\\GHIM-20\\GHIM-20\\19\\19_9597.jpg")
[img_aug] = seq(images=[img])  # apply the pipeline to a batch containing one image
cv2.imshow("test", img_aug)
cv2.waitKey(0)


1. Supervised data augmentation

1.1 Single-sample augmentation

1.1.1 Geometric transformations

Geometric transformations alter the spatial layout of the image: flipping, rotating, cropping, deforming, scaling, and so on.

  • Flip

Flipping includes horizontal and vertical flips.

iaa.Fliplr(0.5)  # flip 50% of the images horizontally
iaa.Flipud(1)    # flip all images vertically
  • Crop

Cropping extracts a region of interest (ROI) from the image; during training, random cropping is usually used.

Cropping and padding:

iaa.CropAndPad(px=None,
               percent=None,
               pad_mode='constant',
               pad_cval=0,
               keep_size=True,
               sample_independently=True,
               name=None,
               deterministic=False,
               random_state=None)

Parameters:

  • px: number of pixels to crop (negative values) or pad (positive values). Note that px and percent cannot be set at the same time. If None, no pixel-level crop/pad is applied. May be an int or a list of ints, or a 4-tuple whose elements represent (top, right, bottom, left), where each element may itself be an int, an int tuple, or an int list.
  • percent: crop or pad by a fraction of the image size; same semantics as px, and likewise the two cannot be set at the same time.
  • pad_mode: padding mode. May be ALL, a string, or a list of strings. Available modes: constant, edge, linear_ramp, maximum, median, minimum, reflect, symmetric, wrap. See the numpy.pad documentation for their exact meanings.
  • pad_cval: the padding value used when pad_mode=constant. May be a float, int, float tuple, int tuple, float list, or int list.
  • keep_size: bool. Cropping or padding changes the image size; if True, the result is rescaled back to the original size afterwards.
  • sample_independently: bool. If False, a single value sampled from px or percent is applied to all four sides.
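
For example (an illustrative call of my own, not from the original post), cropping or padding each side by up to 10% of the image size:

import imgaug.augmenters as iaa

# crop (negative values) or pad (positive values) each side by -10%..+10% of the
# image size, padding with black pixels and rescaling back to the original size
aug = iaa.CropAndPad(percent=(-0.1, 0.1), pad_mode="constant", pad_cval=0, keep_size=True)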

Crop:

iaa.Crop(px=(0, 0, 100, 100))  # crop (top, right, bottom, left) pixels from each side

  •  Affine

Affine transformations apply several operations to the image at once, such as translation, rotation, scaling, and shearing. For example, rotate each image by a random angle between -30 and 30 degrees:

iaa.Affine(rotate=(-30, 30))

  • Scaling and distortion

Randomly select a region of the image, or rescale the whole image, to a target size. For example, resize each image to height=100 and width=100:

iaa.Resize({"height": 100, "width": 100})

1.1.2 Pixel/color transformations

  • Noise

Noise augmentation randomly superimposes noise onto the original image.

# Gaussian noise (also called white noise).
# Add Gaussian noise to the image, sampled once per pixel from the normal
# distribution N(0, s), where s is sampled per image from the range 0 to 0.05*255:
iaa.AdditiveGaussianNoise(scale=(0, 0.05 * 255))

# Coarse (rectangular) dropout augmenter.
# Drops information in rectangular regions of selectable size at random positions.
# Dropping all channels produces black rectangles; dropping only some channels
# produces colored noise.
# Drop 2% of all pixels by converting them to black, but do so on a lower-resolution
# version of the image that has 50% of the original size, dropping 2x2 squares:
iaa.CoarseDropout(0.02, size_percent=0.5)
# As above, but also apply it per channel in 50% of the cases, so that only some
# channels are set to 0 while the others remain unchanged:
iaa.CoarseDropout(0.02, size_percent=0.5, per_channel=0.5)

  • Blur

Blurring reduces the differences between neighboring pixel values, smoothing the image.

# Augmenter that blurs images with a Gaussian kernel.
# Blur each image with a Gaussian kernel whose sigma is sampled from 0.0 to 3.0:
iaa.GaussianBlur(sigma=(0.0, 3.0))

# Pixel-displacement augmenter.
# Transforms images by moving pixels locally according to a displacement field.
# Distort images locally by moving individual pixels around, following a distortion
# field of strength sigma=0.25; the per-pixel displacement strength ranges from 0 to 5.0:
iaa.ElasticTransformation(alpha=(0, 5.0), sigma=0.25)

  • HSV contrast conversion

Convert the image to HSV space and add or subtract a value on one of the channels (here the H channel) for every pixel, modifying hue and saturation to achieve a contrast-style transformation:

iaa.WithColorspace(
        to_colorspace="HSV",
        from_colorspace="RGB",
        children=iaa.WithChannels(0, iaa.Add((10, 50))))  # add 10-50 to the H channel

  • RGB color perturbation

Convert the image from RGB to another color space, increase or decrease the color parameters there, and then convert back to RGB.

# Convert each image to a colorspace with a brightness-related channel, extract that
# channel, add a value between -30 and 30, and convert back to the original colorspace.
iaa.AddToBrightness((-30, 30))

# Sample a random value from the discrete uniform range [-50..50], convert it to an
# angular representation, and add it to the hue (the H channel of HSV colorspace).
iaa.AddToHue((-50, 50))

# Increase or decrease hue and saturation by random values.
# The augmenter first converts the image to HSV, adds random values to the H and S
# channels, and then converts back to RGB.
# Add random values between -100 and 100 to hue and saturation (sampled independently
# per channel; the same value is added to all pixels within a channel).
iaa.AddToHueAndSaturation((-100, 100), per_channel=True)

# Add a random value to the saturation of the image.
# The augmenter first converts the image to HSV, adds a random value to the S channel,
# and then converts back to RGB.
# To change hue and saturation together, prefer AddToHueAndSaturation; otherwise the
# image is converted to HSV and back to RGB twice.
# Sample a random value from the discrete uniform range [-50..50] and add it to the
# saturation (the S channel of HSV colorspace).
iaa.AddToSaturation((-50, 50))

  • Superpixels

Partition the image into superpixels at maximum resolution, resize back to the original size, and replace a certain proportion of the superpixel regions with their average superpixel color, leaving the other regions unchanged.

# Completely or partially transform the image into its superpixel representation.
# Generate about 64 superpixels per image; replace each one with its average pixel
# color with probability 50%.
iaa.Superpixels(p_replace=0.5, n_segments=64)

  • Sharpen and emboss

Sharpen or emboss the image to some degree and blend the result with the original image via an alpha channel.

# Sharpen the image, then overlay the result on the original with an alpha between 0.0 and 1.0:
iaa.Sharpen(alpha=(0.0, 1.0), lightness=(0.75, 2.0))

# Emboss the image, then overlay the result on the original with an alpha between 0.0 and 1.0:
iaa.Emboss(alpha=(0.0, 1.0), strength=(0.5, 1.5))

1.2 Multi-sample synthesis

1.2.1 SMOTE

SMOTE (Synthetic Minority Over-sampling Technique) addresses class imbalance by synthesizing new minority-class samples, improving classifier performance.

SMOTE principle:

  1. Identify the minority-class samples.

  2. For each minority-class sample (x, y), find its K nearest neighbors by Euclidean distance.

  3. Randomly pick one of the K nearest neighbors, say (x_n, y_n), and choose a random point on the segment connecting (x, y) and (x_n, y_n) as the new sample, which satisfies: (x_new, y_new) = (x, y) + rand(0, 1) * ((x_n - x), (y_n - y))

  4. Repeat until the majority and minority classes are balanced.

In Python, the SMOTE algorithm is available in the imbalanced-learn library. (A figure in the original post compares the feature space of the raw data, left, with the feature space after SMOTE processing, right.)
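
A minimal usage sketch (my illustration, not from the original post; the toy dataset and parameter values are assumptions):

from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# toy imbalanced dataset: roughly 90% majority class, 10% minority class
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
print(Counter(y))

# synthesize minority samples along segments to their 5 nearest neighbors
X_res, y_res = SMOTE(k_neighbors=5, random_state=0).fit_resample(X, y)
print(Counter(y_res))  # classes are now balanced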

Reference link:

SMOTE: principle diagram, algorithm implementation, and ready-made packages in R and Python

1.2.2 SamplePairing

The SamplePairing workflow (illustrated with a figure in the original post) is: randomly draw two images from the training set, process each with basic augmentations (such as random flips), then average them pixel-wise into a new sample whose label is taken from one of the two original samples. A minimal sketch of the pairing step follows the list below.

After SamplePairing, the training set can grow from N samples to N*N pairs, and the processing can be done entirely on the CPU. Training alternates between disabling and enabling SamplePairing:

  1. Train the network with traditional augmentation only, without SamplePairing;
  2. After one epoch on the ILSVRC dataset (or 100 epochs on other datasets), add SamplePairing to the training;
  3. Disable SamplePairing intermittently: on ILSVRC, enable it for 300,000 images, then disable it for the next 100,000; on other datasets, enable it for 8 epochs, then disable it for the next 2;
  4. Once the training loss and accuracy are stable, disable SamplePairing and fine-tune.
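
A minimal sketch of the pairing step itself (my illustration, assuming batched numpy arrays; SamplePairing keeps the label of the first image of each pair):

import numpy as np

def sample_pairing(images, labels):
    # images: float array of shape (batch, height, width, channels)
    # labels: array of shape (batch, ...), returned unchanged
    partner_idx = np.random.permutation(len(images))
    mixed = (images + images[partner_idx]) / 2.0  # pixel-wise average
    return mixed, labels  # the label comes from the first sample of each pair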

The experiments show that, because SamplePairing may combine training samples with different labels, the training error increases markedly on every dataset, while the validation error drops considerably. Although the idea behind SamplePairing is simple, the performance gain is substantial, in keeping with Occam's razor. Unfortunately its interpretability is weak and it lacks theoretical support; so far only image data has been tested, so further experiments and analysis are needed.

1.2.3 mixup

Mixup is a data augmentation method based on the principle of vicinal risk minimization (VRM), which constructs new samples by linear interpolation. Under VRM, using the prior knowledge that linear interpolation of feature vectors leads to linear interpolation of the associated targets, a simple and data-agnostic mixup formula can be derived:

(x_n, y_n) = (λ * x_i + (1 - λ) * x_j, λ * y_i + (1 - λ) * y_j)

where (x_n, y_n) is the new sample generated by interpolation, (x_i, y_i) and (x_j, y_j) are two samples randomly drawn from the training set, λ follows a Beta(α, α) distribution with values between 0 and 1, and the hyperparameter α controls the interpolation strength between feature-target pairs.

The mixup experiments are extensive. The results show that mixup reduces the generalization error of deep models on the ImageNet dataset, the CIFAR datasets, and speech and tabular datasets, reduces the model's memorization of corrupted labels, strengthens robustness to adversarial examples, and stabilizes the training of generative adversarial networks. Mixup blurs the class boundary, yielding smoother predictions and better predictive ability outside the range of the training data. As the hyperparameter α increases, the training error on the real data grows while the generalization error shrinks, which suggests that mixup implicitly controls model complexity; as model capacity increases, the training error decreases. Despite the considerable gains, there is still no good explanation of mixup in terms of the bias-variance trade-off, and mixup leaves much room for exploration in other kinds of supervised learning as well as unsupervised, semi-supervised, and reinforcement learning.

Summary: mixup, SMOTE, and SamplePairing share a common idea: they all try to turn discrete sample points into a continuum so as to better fit the true sample distribution. However, the points they add still lie within the region of feature space enclosed by the known samples, whereas the true distribution of the minority data may extend beyond that region. Appropriate interpolation outside the given range might therefore achieve even better augmentation.

Reference link:

Deep learning | network training tricks - mixup

1.3 In practice: implementing data augmentation in tf.data

1.3.1 Using the imgaug library with TensorFlow

import imgaug.augmenters as iaa
import tensorflow as tf

# add the augmentation operations you need
self.augmenter = iaa.Sequential([
        iaa.Fliplr(config.sometimes),
        iaa.Crop(percent=config.crop_percent),
        ...
        ], random_order=config.random_order)

# wrap the augmenter into a function usable with tf.data
def augment_fn(self):
    def augment(images, labels):
        img_dtype = images.dtype
        img_shape = tf.shape(images)
        # tf.numpy_function lets the numpy-based imgaug pipeline run inside the graph
        images = tf.numpy_function(self.augmenter.augment_images, [images], img_dtype)
        images = tf.reshape(images, shape=img_shape)
        return images, labels
    return augment
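
A hedged usage sketch (assuming the two snippets above live in a class, here hypothetically called AugmentPipeline, and that dataset yields image/label batches):

import tensorflow as tf

pipeline = AugmentPipeline(config)  # hypothetical wrapper class for the snippet above

dataset = (dataset
           .batch(32)
           .map(pipeline.augment_fn(), num_parallel_calls=tf.data.AUTOTUNE)
           .prefetch(tf.data.AUTOTUNE))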

1.3.2 Implementing mixup augmentation in TensorFlow

 Reference link: https://github.com/OFRIN/Tensorflow_MixUp/blob/master/Train_MixUp.py

import numpy as np

MIXUP_ALPHA = 0.2
BATCH_SIZE = 64

# MixUp function: blends each sample with a randomly chosen partner from the batch
def MixUp(images, labels):
    indices = np.random.permutation(BATCH_SIZE)
    # one mixing coefficient per sample, drawn from Beta(alpha, alpha)
    alpha = np.random.beta(MIXUP_ALPHA, MIXUP_ALPHA, BATCH_SIZE)

    image_alpha = alpha.reshape((BATCH_SIZE, 1, 1, 1))
    label_alpha = alpha.reshape((BATCH_SIZE, 1))

    x1, x2 = images, images[indices]
    y1, y2 = labels, labels[indices]

    # linear interpolation of both images and (one-hot) labels
    images = image_alpha * x1 + (1 - image_alpha) * x2
    labels = label_alpha * y1 + (1 - label_alpha) * y2

    return images, labels
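
An illustrative call (my sketch; labels must be one-hot for the label interpolation to make sense):

import numpy as np

rng = np.random.default_rng(0)
images = rng.random((BATCH_SIZE, 32, 32, 3)).astype("float32")  # dummy image batch
labels = np.eye(10)[rng.integers(0, 10, BATCH_SIZE)]            # one-hot labels

mixed_images, mixed_labels = MixUp(images, labels)
print(mixed_images.shape, mixed_labels.shape)  # (64, 32, 32, 3) (64, 10)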

2. Unsupervised data augmentation

Unsupervised data augmentation falls into two main categories:

  1. Learn the data distribution with a model and randomly generate images consistent with the distribution of the training set; the representative method is GAN.
  2. Learn, through a model, an augmentation policy suited to the current task; the representative method is AutoAugment.

2.1 GAN

GAN, the generative adversarial network, contains two networks: a generative network and a discriminative (adversarial) network. The basic principle is as follows (a minimal sketch of the two networks comes after the list):

  1. G is the network that generates images: it receives random noise z and generates an image from it, denoted G(z).
  2. D is the discriminative network: it judges whether an image is "real", i.e., whether it is a genuine image or one generated by G.
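
A minimal Keras sketch of the two networks (my illustration, sized for 28x28 grayscale images; the architecture details are assumptions, not from the original post):

import tensorflow as tf
from tensorflow.keras import layers

def build_generator(z_dim=100):
    # G: maps a noise vector z to a 28x28x1 image G(z)
    return tf.keras.Sequential([
        layers.Dense(7 * 7 * 128, activation="relu", input_shape=(z_dim,)),
        layers.Reshape((7, 7, 128)),
        layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu"),
        layers.Conv2DTranspose(1, 4, strides=2, padding="same", activation="tanh"),
    ])

def build_discriminator():
    # D: outputs a logit; high means "judged real", low means "judged generated"
    return tf.keras.Sequential([
        layers.Conv2D(64, 4, strides=2, padding="same", input_shape=(28, 28, 1)),
        layers.LeakyReLU(0.2),
        layers.Flatten(),
        layers.Dense(1),
    ])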

2.2 AutoAugment

AutoAugment started as a single paper but can be regarded as a research direction in its own right. Its basic idea: use reinforcement learning to search, from the data itself, for the best image transformation policy, learning different augmentation methods for different tasks.

Basic procedure (an illustrative sub-policy sketch follows the list):

  1. Prepare 16 candidate augmentation operations.
  2. Sample operations from the 16 (two per sub-policy in the paper), each with a randomly generated application probability and magnitude; such a combination is called a sub-policy, and 5 sub-policies are generated in total.
  3. Each image in a batch is processed with one of the 5 sub-policies, chosen at random.
  4. The generalization performance of a child model on the validation set provides the feedback signal for the reinforcement learning method.
  5. After roughly 80-100 epochs the search becomes effective and usable sub-policies are learned.
  6. Concatenate these 5 sub-policies and use them for the final training.
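
A hypothetical illustration of what one such sub-policy could look like, expressed with imgaug (the operations, probabilities, and magnitudes here are made up, not learned):

import imgaug.augmenters as iaa

# one AutoAugment-style sub-policy: each operation carries its own
# application probability and a fixed magnitude
sub_policy = iaa.Sequential([
    iaa.Sometimes(0.8, iaa.Affine(rotate=30)),     # (Rotate, p=0.8, magnitude=30 degrees)
    iaa.Sometimes(0.6, iaa.AddToBrightness(40)),   # (Brightness, p=0.6, magnitude=+40)
])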

Reference links:

[Technical review] What are the data augmentation methods in deep learning?

Python - imgaug image data augmentation

imgaug augmentation toolkit: a list of augmenters

Origin blog.csdn.net/wxplol/article/details/107937508