TensorFlow image processing

When training a deep learning model on images, the number of pictures is sometimes insufficient, or you want the network to learn from more varied samples. In that case, you can process the existing picture data to generate new pictures and train on those as well, which can help improve the network's recognition accuracy.

1. Image decoding display

The matplotlib library can be used to draw and display pictures conveniently in Jupyter. First, read the picture file through tf.gfile, then use tf.image.decode_jpeg to decode the JPEG picture into a three-dimensional matrix, which can then be printed and displayed. The examples use the TensorFlow 1.x API (tf.Session, tf.gfile).

import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

# Read the raw bytes of the image file (the raw string keeps the backslashes literal)
image_raw=tf.gfile.GFile(r'D:\Temp\MachineLearning\data\cat.jpeg','rb').read()

with tf.Session() as sess:
    # Decode the JPEG image into a three-dimensional matrix
    image_data=tf.image.decode_jpeg(image_raw)
    print(image_data.eval())
    plt.imshow(image_data.eval())
    plt.show()

This prints the picture's three-dimensional matrix and displays the picture.
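
As an illustration of what "three-dimensional matrix" means here, the following NumPy sketch builds a tiny synthetic picture with the same layout that tf.image.decode_jpeg returns: a uint8 array of shape [height, width, channels]. The pixel values are made up for the example and do not come from cat.jpeg.

```python
import numpy as np

# A synthetic 2x3 RGB picture standing in for a decoded JPEG
# (assumption for illustration: uint8 values, [height, width, channels]).
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 0] = [255, 0, 0]   # top-left pixel is pure red
image[1, 2] = [0, 0, 255]   # bottom-right pixel is pure blue

height, width, channels = image.shape
print(height, width, channels)   # 2 3 3

# Each colour channel is a 2-D [height, width] slice of the matrix.
red_channel = image[:, :, 0]
print(red_channel.shape)         # (2, 3)
```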

2. Image zoom

TensorFlow also comes with many image processing functions, such as tf.image.resize_images, which scales a picture. The first parameter is the image data, the second is the target size as [height, width], and the third, method, selects the interpolation method: 0 (the default) is bilinear interpolation, 1 is nearest-neighbor interpolation, 2 is bicubic interpolation, and 3 is area interpolation.

    # Scale the picture to 500 x 500 using bilinear interpolation
    image_resize=tf.image.resize_images(image_data,[500,500],method=0)
    # The resized picture is float32; convert it to uint8 so that it displays correctly
    image_resize=np.asarray(image_resize.eval(),dtype='uint8')
    plt.imshow(image_resize)
    plt.show()
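
To make the method values more concrete, here is a minimal NumPy sketch of nearest-neighbor scaling (method=1) together with a float-to-uint8 conversion like the one in the code above. The helper names resize_nearest and to_uint8 are hypothetical and are not part of TensorFlow, whose own implementation may map pixel coordinates differently.

```python
import numpy as np

def resize_nearest(image, new_height, new_width):
    """Hypothetical helper: nearest-neighbor resize of an [H, W, C] array."""
    height, width = image.shape[:2]
    # Map every output row/column back to a source row/column index.
    rows = np.arange(new_height) * height // new_height
    cols = np.arange(new_width) * width // new_width
    return image[rows[:, None], cols]

def to_uint8(image):
    """Hypothetical helper: round, clip to [0, 255], and cast to uint8."""
    return np.clip(np.rint(image), 0, 255).astype(np.uint8)

# A tiny 2x2 RGB picture: red, green / blue, white.
small = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

big = resize_nearest(small, 4, 4)
print(big.shape)  # (4, 4, 3)

# Float results can fall outside 0..255; clipping avoids wrap-around on the cast.
converted = to_uint8(np.array([-3.2, 100.6, 300.0], dtype=np.float32))
print(converted.tolist())  # [0, 101, 255]
```

Each source pixel simply becomes a 2 × 2 block in the enlarged picture, which is why nearest-neighbor results look blocky compared with bilinear or bicubic interpolation.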

3. Image cropping

The function tf.image.resize_image_with_crop_or_pad crops or pads the picture to a target size without rescaling it, so the picture content keeps its original proportions.

The function tf.image.random_crop crops a region at a random position instead of from the center.

    # Crop or pad the picture to 500 x 500
    image_crop=tf.image.resize_image_with_crop_or_pad(image_data,500,500)
    plt.imshow(image_crop.eval())
    plt.show()
    # Crop a 300 x 300 region at a random position
    img_random=tf.image.random_crop(image_data,[300,300,3])
    plt.imshow(img_random.eval())
    plt.show()

The first parameter of resize_image_with_crop_or_pad is the image data, and the last two parameters are the target height and width. Where the original image is larger than the target, the excess is cut off evenly on both sides; where it is smaller, the image is padded with black (zeros). In the example above, the picture is cropped on the left and right and padded with black at the top and bottom.

The first parameter of random_crop is the image data, and the second is a three-element size [height, width, channels] that gives the target image size.
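
The crop-or-pad behaviour can be sketched in NumPy as follows. This is an illustrative approximation under the assumption of an [H, W, C] array, not TensorFlow's implementation; the helper names crop_or_pad_center and random_crop_sketch are hypothetical.

```python
import numpy as np

def crop_or_pad_center(image, target_height, target_width):
    """Hypothetical helper: center-crop or zero-pad an [H, W, C] array."""
    height, width, channels = image.shape
    # Crop step: keep the central part of any dimension that is too large.
    crop_h = min(height, target_height)
    crop_w = min(width, target_width)
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    cropped = image[top:top + crop_h, left:left + crop_w]
    # Pad step: place the result in the middle of a black (all-zero) canvas.
    canvas = np.zeros((target_height, target_width, channels), dtype=image.dtype)
    pad_top = (target_height - crop_h) // 2
    pad_left = (target_width - crop_w) // 2
    canvas[pad_top:pad_top + crop_h, pad_left:pad_left + crop_w] = cropped
    return canvas

def random_crop_sketch(image, crop_height, crop_width, rng):
    """Hypothetical helper: crop a window at a uniformly random position."""
    top = rng.integers(0, image.shape[0] - crop_height + 1)
    left = rng.integers(0, image.shape[1] - crop_width + 1)
    return image[top:top + crop_height, left:left + crop_width]

# A wide, all-white picture: 2 rows, 6 columns, 3 channels.
wide = np.full((2, 6, 3), 255, dtype=np.uint8)

result = crop_or_pad_center(wide, 4, 4)
print(result.shape)  # (4, 4, 3)
# Rows 0 and 3 are black padding; rows 1 and 2 hold the cropped picture.

rng = np.random.default_rng(0)
patch = random_crop_sketch(wide, 2, 3, rng)
print(patch.shape)   # (2, 3, 3)
```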

4. Image flip

The functions tf.image.flip_up_down and tf.image.flip_left_right flip the picture vertically and horizontally. During model training, the flipped copies of the original sample pictures can be fed in as additional training inputs.

    # Flip the picture upside down
    img_down=tf.image.flip_up_down(image_data)
    plt.imshow(img_down.eval())
    plt.show()

    # Flip the picture left to right
    img_left=tf.image.flip_left_right(image_data)
    plt.imshow(img_left.eval())
    plt.show()
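
On an [H, W, C] array, both flips amount to reversing one axis. The NumPy sketch below shows this on a tiny array whose pixel values make the order easy to follow; it only illustrates the idea and is not how TensorFlow implements the ops.

```python
import numpy as np

# A 2x3 single-channel picture with pixel values 0..5.
image = np.arange(6, dtype=np.uint8).reshape(2, 3, 1)

flipped_up_down = image[::-1, :, :]      # reverse the row (height) axis
flipped_left_right = image[:, ::-1, :]   # reverse the column (width) axis

print(image[:, :, 0].tolist())               # [[0, 1, 2], [3, 4, 5]]
print(flipped_up_down[:, :, 0].tolist())     # [[3, 4, 5], [0, 1, 2]]
print(flipped_left_right[:, :, 0].tolist())  # [[2, 1, 0], [5, 4, 3]]
```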

5. Adjust contrast, brightness, saturation

You can adjust the image contrast through tf.image.adjust_contrast. A factor greater than 1 increases the contrast, and a factor less than 1 reduces it.

tf.image.random_contrast adjusts the contrast by a random factor within the specified range.

Similarly, tf.image.adjust_brightness, tf.image.adjust_saturation, and tf.image.adjust_hue adjust brightness, saturation, and hue.

    # Increase the contrast
    img_deep=tf.image.adjust_contrast(image_data,2)
    plt.imshow(img_deep.eval())
    plt.show()
    # Reduce the contrast
    img_fade=tf.image.adjust_contrast(image_data,0.5)
    plt.imshow(img_fade.eval())
    plt.show()
    # Apply a random contrast factor between 0.5 and 2
    img_contrast=tf.image.random_contrast(image_data,0.5,2)
    plt.imshow(img_contrast.eval())
    plt.show()
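
The TensorFlow documentation describes contrast adjustment as moving each pixel away from or toward the per-channel mean: (x - mean) * factor + mean. The NumPy sketch below applies that formula; the helper name adjust_contrast_sketch is hypothetical, and the rounding and clipping back to uint8 are assumptions for this example rather than a copy of TensorFlow's conversion.

```python
import numpy as np

def adjust_contrast_sketch(image, factor):
    """Hypothetical helper: scale each pixel's distance from the channel mean."""
    pixels = image.astype(np.float32)
    # One mean per channel, computed over the height and width axes.
    mean = pixels.mean(axis=(0, 1), keepdims=True)
    adjusted = (pixels - mean) * factor + mean
    # Assumption: round and clip so the result fits back into uint8.
    return np.clip(np.rint(adjusted), 0, 255).astype(np.uint8)

# A 1x2 single-channel picture with pixel values 100 and 200 (mean 150).
image = np.array([[[100], [200]]], dtype=np.uint8)

stronger = adjust_contrast_sketch(image, 2.0)
weaker = adjust_contrast_sketch(image, 0.5)
print(stronger[:, :, 0].tolist())  # [[50, 250]]
print(weaker[:, :, 0].tolist())    # [[125, 175]]

# A random contrast picks the factor uniformly from a range, here [0.5, 2).
rng = np.random.default_rng(0)
random_factor = rng.uniform(0.5, 2.0)
randomized = adjust_contrast_sketch(image, random_factor)
```

With a factor of 2 the two pixels end up twice as far from the mean, and with 0.5 half as far, which matches the stronger and weaker contrast seen in the pictures.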

6. Process the input pictures of the VGG network

The picture parameter x_img passed in during VGG network training is four-dimensional data organized by batch_size. For example, when 20 three-channel 32 × 32 pictures are passed in, the data has shape [20,32,32,3]. The image processing functions used here, however, are applied to a single three-dimensional image. Therefore, the batch is first split into 20 single pictures of shape [1,32,32,3] with tf.split, and each one is converted into three-dimensional data [32,32,3] with tf.reshape. The image processing functions are then called on the picture, the processed picture is reshaped back to four dimensions and appended to the list res_arr, and finally the list is concatenated back into the original 20 × 32 × 32 × 3 layout.

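# Split the batch of batch_size pictures along the first dimension into single pictures
img_arr=tf.split(x_img,batch_size,axis=0)
res_arr=[]
# Process each picture in turn
for img in img_arr:
    # Reshape the single four-dimensional picture [1,32,32,3] to three dimensions [32,32,3]
    img=tf.reshape(img,[32,32,3])
    # Apply image augmentation to the single picture
    img_flip=tf.image.random_flip_left_right(img)     # randomly flip the picture left to right
    img_bright=tf.image.random_brightness(img_flip,max_delta=63)    # randomly adjust the brightness
    img_contrast=tf.image.random_contrast(img_bright,lower=0.2, upper=1.8)  # randomly adjust the contrast
    # Reshape the augmented picture back to the original four-dimensional format
    img=tf.reshape(img_contrast,[1,32,32,3])
    # Collect each processed picture in a list
    res_arr.append(img)
# Concatenate the processed pictures back into one batch
img_aug=tf.concat(res_arr,axis=0)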
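
The split, reshape, process, and concatenate round trip can be checked without a TensorFlow session using the NumPy sketch below. The batch is random data made up for the example, and only a random left-right flip is applied as the augmentation step; it mirrors the structure of the loop above, not the exact TensorFlow ops.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
batch_size = 20

# Made-up batch standing in for x_img: 20 pictures of 32 x 32 x 3.
x_img = rng.integers(0, 256, size=(batch_size, 32, 32, 3)).astype(np.float32)

# Split along the first axis into batch_size arrays of shape [1, 32, 32, 3].
img_arr = np.split(x_img, batch_size, axis=0)
res_arr = []
for img in img_arr:
    # Drop the batch dimension: [1, 32, 32, 3] -> [32, 32, 3].
    img = img.reshape(32, 32, 3)
    # Augmentation step: flip left to right with probability 0.5.
    if rng.random() < 0.5:
        img = img[:, ::-1, :]
    # Restore the batch dimension before collecting the picture.
    res_arr.append(img.reshape(1, 32, 32, 3))

# Concatenate the single pictures back into one batch.
img_aug = np.concatenate(res_arr, axis=0)
print(img_aug.shape)  # (20, 32, 32, 3)
```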

theVicTory
