Image Augmentation Methods in Machine Learning

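Data is a crucial part of training a model. In an era where data is king, you cannot train a strong model without it. In most cases, however, data is not easy to obtain. Image augmentation helps here, and it can alleviate overfitting to some extent. (It only helps up to a point: if the dataset is really too small, the effect is limited.)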

Without further ado, here are the methods.

1. Randomly adjust the saturation, contrast, and hue of the image (keep the result realistic; the adjustment range should not be too large)

2. Randomly crop, flip, and rotate the image (again, keep the result realistic)

Method 2 is quite heavy on the graphics card, so it may not run on weaker hardware.

3. Randomly add Gaussian noise to the image (this method is commonly used even when there is plenty of data)

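4. PCA Jittering, first proposed by Alex Krizhevsky in the NIPS 2012 paper with which he won the ImageNet competition. We first compute the mean and standard deviation of each of the three RGB channels and normalize the network input. We then compute the covariance matrix over the whole training set and take its eigendecomposition; the resulting eigenvectors and eigenvalues are used for PCA Jittering.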

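5. Crop Sampling, i.e. how to scale and crop the original image to obtain the network input. Two methods are common. The first is Scale Jittering, which was used to train both VGG and ResNet. The second is scale and aspect ratio augmentation, first proposed by Google for training their Inception network.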

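6. Supervised Crop. Inspired by Bolei's CVPR paper this year, we proposed a supervised data augmentation method. We first train a model in the usual way, then use it to generate the Class Activation Map (or Heat Map) for the ground-truth label. This map indicates how likely the target object is to appear at each position. We randomly pick a position on the map according to this probability, map it back to the original image, and crop around that position in the original image.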
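
Items 4 and 6 above both describe concrete steps, so here is a minimal NumPy sketch of each. It is not the original authors' code. The function names (`fit_rgb_pca`, `pca_jitter`, `sample_crop_from_cam`), the `sigma=0.1` default, and the assumption that images are float arrays in [0, 1] are all my own choices for illustration. The sketch also skips the per-channel normalization step and works on raw pixel values.

```python
import numpy as np

def fit_rgb_pca(images):
    """Eigendecomposition of the RGB covariance over a whole training set.

    images: float array of shape (N, H, W, 3).
    Returns (eigvals, eigvecs); eigvecs holds one eigenvector per column.
    """
    pixels = images.reshape(-1, 3)
    cov = np.cov(pixels, rowvar=False)      # 3x3 covariance (np.cov centers the data)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: the covariance matrix is symmetric
    return eigvals, eigvecs

def pca_jitter(image, eigvals, eigvecs, rng, sigma=0.1):
    """Add one random RGB offset along the principal components to every pixel."""
    alpha = rng.normal(0.0, sigma, size=3)  # one random weight per component
    shift = eigvecs @ (alpha * eigvals)     # RGB offset, shape (3,)
    return np.clip(image + shift, 0.0, 1.0)

def sample_crop_from_cam(image, cam, crop_size, rng):
    """Crop around a position sampled from a non-negative heat map.

    image: (H, W, C) array; cam: (h, w) heat map, possibly lower resolution.
    Assumes crop_size <= min(H, W) and cam.sum() > 0.
    """
    H, W = image.shape[:2]
    h, w = cam.shape
    probs = cam.astype(np.float64).ravel()
    probs /= probs.sum()                    # turn the map into a distribution
    idx = rng.choice(h * w, p=probs)        # sample one heat-map cell
    cy, cx = divmod(int(idx), w)
    # Map the center of the sampled cell back to original image coordinates.
    y = int((cy + 0.5) * H / h)
    x = int((cx + 0.5) * W / w)
    # Center the crop there, shifted if needed so it stays inside the image.
    top = min(max(y - crop_size // 2, 0), H - crop_size)
    left = min(max(x - crop_size // 2, 0), W - crop_size)
    return image[top:top + crop_size, left:left + crop_size]
```

`fit_rgb_pca` only needs to run once per training set, while `pca_jitter` and `sample_crop_from_cam` would be called per sample with a shared `np.random.default_rng()` generator.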

The following image augmentation code comes from the book "Tensorflow 实战 Google 深度学习框架".

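import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf  # this code uses the TensorFlow 1.x API (tf.Session, tf.gfile)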
  
def distort_color(image, color_ordering=0):
    # Apply the same four color distortions in one of four orders.
    # The order matters, so varying it adds diversity.
    if color_ordering == 0:
        image = tf.image.random_brightness(image, max_delta=32. / 255.)  # brightness
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)  # saturation
        image = tf.image.random_hue(image, max_delta=0.2)  # hue
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)  # contrast
    elif color_ordering == 1:
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
    elif color_ordering == 2:
        image = tf.image.random_hue(image, max_delta=0.2)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
    elif color_ordering == 3:
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
    # The distortions can push values outside [0, 1], so clip the result.
    return tf.clip_by_value(image, 0.0, 1.0)
  
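def preprocess_for_train(image, height, width, bbox):
    # Randomly crop, resize, flip, and color-distort one image for training.
    # With no bounding box given, treat the whole image as the region of interest.
    if bbox is None:
        bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
    # Convert to float32 in [0, 1] so the color ops and the final clipping work as intended.
    if image.dtype != tf.float32:
        image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    # Sample a random crop window constrained by the given bounding boxes.
    bbox_begin, bbox_size, _ = tf.image.sample_distorted_bounding_box(tf.shape(image), bounding_boxes=bbox)
    distorted_image = tf.slice(image, bbox_begin, bbox_size)
    # Resize to the network input size; the interpolation method (0-3) is picked when the graph is built.
    distorted_image = tf.image.resize_images(distorted_image, [height, width], method=np.random.randint(4))
    # Random horizontal flip.
    distorted_image = tf.image.random_flip_left_right(distorted_image)
    # Apply the color distortions in a randomly chosen order.
    distorted_image = distort_color(distorted_image, np.random.randint(4))
    return distorted_image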
  
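# The image path is left empty in the original; fill in a JPEG file path before running.
image_raw_data = tf.gfile.FastGFile("", "rb").read()
with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    # Two example boxes in normalized [y_min, x_min, y_max, x_max] coordinates.
    boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    result = preprocess_for_train(img_data, 299, 299, boxes)
    # Evaluate the graph and display one augmented sample.
    plt.imshow(sess.run(result))
    plt.show()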
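
The book code above covers color distortion, cropping, and flipping, but not the Gaussian noise from item 3 or the rotation from item 2. A minimal NumPy sketch of those two could look like the following. The function names, the `sigma=0.05` default, and the restriction to 90-degree rotations are my own assumptions for illustration, and images are again assumed to be float arrays in [0, 1].

```python
import numpy as np

def add_gaussian_noise(image, rng, sigma=0.05):
    """Add zero-mean Gaussian noise to every pixel, then clip back to [0, 1]."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0)

def random_rot90(image, rng):
    """Rotate an (H, W, C) image by a random multiple of 90 degrees."""
    k = int(rng.integers(4))  # 0, 1, 2, or 3 quarter turns
    return np.rot90(image, k=k, axes=(0, 1))
```

Keeping `sigma` small follows the same advice as the other items: the augmented image should still look realistic. For non-square images, an odd number of quarter turns swaps height and width, so a resize may be needed afterwards.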
