12. TensorFlow image processing

1. Image encoding and decoding

When an image is stored, the file does not record the pixel values of its matrix directly; it records the result of compression encoding. To restore an image to a three-dimensional matrix (height, width, channels), a decoding step is therefore required. In OpenCV, imread and imwrite perform this decoding and encoding, respectively. TensorFlow provides corresponding encoding and decoding functions.
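
To make the difference between stored bytes and the pixel matrix concrete, here is a minimal sketch that uses only Python's standard library. It uses zlib (the DEFLATE compression that PNG builds on) as a stand-in for a real image codec. The 16 x 16 test image and the helper name to_matrix are assumptions made for illustration; this is not how TensorFlow or OpenCV decode files internally.

```python
import zlib

# Hypothetical 16 x 16 RGB image, flattened to raw bytes (one byte per value).
height, width, channels = 16, 16, 3
raw = bytes([200, 30, 30] * (height * width))

encoded = zlib.compress(raw)        # stand-in for what an image file stores
decoded = zlib.decompress(encoded)  # decoding recovers the raw pixel values


def to_matrix(buf, h, w, c):
    """Rebuild a [height][width][channels] nested list from flat bytes."""
    return [
        [list(buf[(y * w + x) * c:(y * w + x + 1) * c]) for x in range(w)]
        for y in range(h)
    ]


image = to_matrix(decoded, height, width, channels)
print(len(raw), len(encoded), image[0][0])
```

The encoded byte string is much shorter than the raw buffer, which is why a decoding step is needed before the values can be used as a matrix.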

# Image decoding function
tf.image.decode_image(
    contents,
    channels=None,
    name=None
)

# Parameters
contents: 0-D string. The encoded image bytes.
channels: An optional int. Defaults to 0. Number of color channels for the decoded image.

# Return value
Tensor with type uint8 with shape [height, width, num_channels] for BMP, JPEG, and PNG images and shape [num_frames, height, width, 3] for GIF images. 


# Image encoding functions
tf.image.encode_jpeg()
tf.image.encode_png()

2. Image resizing

# 1. Scaling
tf.image.resize_images(
    images,
    size,
    method=ResizeMethod.BILINEAR,
    align_corners=False
)

# Parameters
images: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].

size: A 1-D int32 Tensor of 2 elements: new_height, new_width. The new size for the images.

method can be one of:
ResizeMethod.BILINEAR: bilinear interpolation (default)
ResizeMethod.NEAREST_NEIGHBOR: nearest-neighbor interpolation
ResizeMethod.BICUBIC: bicubic interpolation
ResizeMethod.AREA: area interpolation

# Return value (float)
If images was 4-D, a 4-D float Tensor of shape [batch, new_height, new_width, channels]. If images was 3-D, a 3-D float Tensor of shape [new_height, new_width, channels].
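
As a rough illustration of what the simplest method, nearest neighbor, does, the sketch below resizes a single-channel image stored as a nested Python list. The function name resize_nearest and the floor(dst * in / out) index mapping are assumptions chosen for clarity; TensorFlow's kernels may round differently, especially when align_corners=True.

```python
def resize_nearest(image, new_height, new_width):
    """Nearest-neighbor resize of a 2-D list (rows of pixel values)."""
    height, width = len(image), len(image[0])
    resized = []
    for y in range(new_height):
        src_y = y * height // new_height      # source row for output row y
        row = []
        for x in range(new_width):
            src_x = x * width // new_width    # source column for output column x
            row.append(image[src_y][src_x])
        resized.append(row)
    return resized


small = [[1, 2],
         [3, 4]]
for row in resize_nearest(small, 4, 4):
    print(row)
```

Each source pixel is simply repeated, which is why nearest neighbor looks blocky when enlarging, while the interpolating methods blend neighboring pixels instead.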



# 2. Crop (centered) or zero-pad (evenly on all sides)
tf.image.resize_image_with_crop_or_pad(
    image,
    target_height,
    target_width
)

# Parameters
image: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].

# Return value
Cropped and/or padded image. If image was 4-D, a 4-D float Tensor of shape [batch, new_height, new_width, channels]. If image was 3-D, a 3-D float Tensor of shape [new_height, new_width, channels].
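
The centered crop and the even zero padding can be sketched in plain Python for a single-channel image stored as a nested list. The helper names below are hypothetical, and the choice to put the extra row or column at the bottom or right when the difference is odd is an assumption about rounding, not a statement of TensorFlow's exact behavior.

```python
def crop_or_pad_axis(items, target, fill):
    """Center-crop or evenly pad a list to exactly `target` entries."""
    size = len(items)
    if size >= target:
        start = (size - target) // 2
        return items[start:start + target]
    before = (target - size) // 2
    after = target - size - before
    return [fill] * before + items + [fill] * after


def resize_with_crop_or_pad(image, target_height, target_width):
    """Apply the rule to the rows (height), then inside each row (width)."""
    zero_row = [0] * len(image[0])
    rows = crop_or_pad_axis(image, target_height, zero_row)
    return [crop_or_pad_axis(list(row), target_width, 0) for row in rows]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(resize_with_crop_or_pad(image, 1, 5))  # crops the height, pads the width
```

Height and width are handled independently, so one call can crop along one axis and pad along the other.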



# 3. Central crop by a fraction of the image
tf.image.central_crop(
    image,
    central_fraction
)



# 4. Crop regions from the input image and resize them by interpolation
tf.image.crop_and_resize(
    image,
    boxes,
    box_ind,
    crop_size,
    method='bilinear',
    extrapolation_value=0,
    name=None
)



# 5. Crop along the given bounding-box coordinates
tf.image.crop_to_bounding_box(
    image,
    offset_height,
    offset_width,
    target_height,
    target_width
)

# Parameters
image: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].

offset_height, offset_width, target_height, target_width: together these define the bounding box. The top-left corner of the returned image is at (offset_height, offset_width) in image, and its lower-right corner is at (offset_height + target_height, offset_width + target_width).

# Return value
If image was 4-D, a 4-D float Tensor of shape [batch, target_height, target_width, channels]. If image was 3-D, a 3-D float Tensor of shape [target_height, target_width, channels].
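
The offset and target arguments map directly onto list slicing. The sketch below shows this for a single-channel image stored as a nested Python list; the function name crop_box is hypothetical and the code only mirrors the geometry described above, without TensorFlow's argument validation.

```python
def crop_box(image, offset_height, offset_width, target_height, target_width):
    """Keep the rows and columns that fall inside the bounding box."""
    rows = image[offset_height:offset_height + target_height]
    return [row[offset_width:offset_width + target_width] for row in rows]


image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11]]
print(crop_box(image, 1, 1, 2, 2))
```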



# 6. Zero-pad the original image to the specified height (target_height) and width (target_width)
tf.image.pad_to_bounding_box(
    image,
    offset_height,
    offset_width,
    target_height,
    target_width
)

# How it works
Adds offset_height rows of zeros on top, offset_width columns of zeros on the left, and then pads the image on the bottom and right with zeros until it has dimensions target_height, target_width.

# Parameters
image: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].

offset_height: Number of rows of zeros to add on top.
offset_width: Number of columns of zeros to add on the left.

target_height: Height of output image.
target_width: Width of output image.

# Return value
If image was 4-D, a 4-D float Tensor of shape [batch, target_height, target_width, channels]. If image was 3-D, a 3-D float Tensor of shape [target_height, target_width, channels].
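
The padding rule described above (zeros on top and left given by the offsets, then zeros on the bottom and right until the target size is reached) can be sketched as follows for a single-channel nested list. The name pad_to_box and the ValueError on impossible sizes are assumptions for illustration.

```python
def pad_to_box(image, offset_height, offset_width, target_height, target_width):
    """Place `image` inside a zero canvas of the target size."""
    height, width = len(image), len(image[0])
    pad_bottom = target_height - offset_height - height
    pad_right = target_width - offset_width - width
    if min(offset_height, offset_width, pad_bottom, pad_right) < 0:
        raise ValueError("image does not fit inside the target box")

    padded = [[0] * target_width for _ in range(offset_height)]   # top rows
    for row in image:
        padded.append([0] * offset_width + list(row) + [0] * pad_right)
    padded.extend([0] * target_width for _ in range(pad_bottom))  # bottom rows
    return padded


for row in pad_to_box([[1, 2], [3, 4]], 1, 2, 4, 5):
    print(row)
```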

3. Image flip and rotation

# 1. (Random) vertical flip (upside down)
tf.image.flip_up_down(image)
tf.image.random_flip_up_down(image,seed=None)


# 2. (Random) horizontal flip (left to right)
tf.image.flip_left_right(image)
tf.image.random_flip_left_right(image,seed=None)


# 3. Flip along the diagonal: swap the first and second dimensions of the image
tf.image.transpose_image(image)
# Parameters
image: 3-D tensor of shape [height, width, channels]

# Return value
A 3-D tensor of the same type as image, with the height and width dimensions swapped


# 4. Rotate the image counter-clockwise by 90*k degrees
tf.image.rot90(image, k=1)
# Parameters
image: A 3-D tensor of shape [height, width, channels].
k: A scalar integer. The number of times the image is rotated by 90 degrees.
name: A name for this operation (optional).

# Return value
A rotated 3-D tensor of the same type as image. For odd k the height and width are swapped.
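
The flips, the transpose, and the 90-degree rotation are all simple index rearrangements. The sketch below reproduces them on a single-channel image stored as a nested Python list; the function names mirror the TensorFlow ones only for readability, and composing the rotation from a transpose plus a vertical flip is one possible implementation, not necessarily how TensorFlow does it.

```python
def flip_up_down(image):
    """Reverse the order of the rows."""
    return image[::-1]


def flip_left_right(image):
    """Reverse the pixels within every row."""
    return [row[::-1] for row in image]


def transpose_image(image):
    """Swap the height and width axes."""
    return [list(column) for column in zip(*image)]


def rot90(image, k=1):
    """Rotate counter-clockwise by 90 * k degrees."""
    for _ in range(k % 4):
        image = flip_up_down(transpose_image(image))
    return image


image = [[1, 2, 3],
         [4, 5, 6]]
print(rot90(image))  # a 2 x 3 image becomes 3 x 2
```

Note that for a non-square image the transpose and the odd rotations change the shape from height x width to width x height.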


# 5. Rotate image(s) by the passed angle(s) in radians
tf.contrib.image.rotate(
    images,
    angles,
    interpolation='NEAREST'
)
# Parameters
images: A tensor of shape (num_images, num_rows, num_columns, num_channels) (NHWC), (num_rows, num_columns, num_channels) (HWC), or (num_rows, num_columns) (HW).

angles: A scalar angle to rotate all images by, or (if images has rank 4) a vector of length num_images, with an angle for each image in the batch.

interpolation: Interpolation mode. Supported values: "NEAREST", "BILINEAR".

# Return value
Image(s) with the same type and shape as images, rotated by the given angle(s). Empty space due to the rotation will be filled with zeros. 

4. Image color adjustment

# 1. Adjust the brightness of an RGB or grayscale image
# delta is the amount added to the pixel values; for regular images it should be in [0, 1)
tf.image.adjust_brightness(
    image,
    delta
)


# 2. Adjust the hue of an RGB image; delta must be in the interval [-1, 1]
tf.image.adjust_hue(
    image,
    delta,
    name=None
)


# 3. Adjust the contrast of RGB or grayscale images
tf.image.adjust_contrast(
    images,
    contrast_factor
)


# 4. Adjust the saturation of an RGB image
tf.image.adjust_saturation(
    image,
    saturation_factor,
    name=None
)


# 5. Apply gamma correction to the input image
tf.image.adjust_gamma(
    image,
    gamma=1,
    gain=1
)
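
The brightness, contrast, and gamma adjustments are per-pixel formulas. The sketch below applies them to a flat list of float pixel values from one channel, following the formulas given in the TensorFlow API documentation: add delta, scale around the channel mean, and compute gain * value ** gamma. The function bodies are simplified assumptions; dtype conversion and any clipping are left out.

```python
def adjust_brightness(pixels, delta):
    """Add delta to every pixel value."""
    return [p + delta for p in pixels]


def adjust_contrast(pixels, contrast_factor):
    """Scale each pixel's distance from the channel mean."""
    mean = sum(pixels) / len(pixels)
    return [(p - mean) * contrast_factor + mean for p in pixels]


def adjust_gamma(pixels, gamma=1.0, gain=1.0):
    """Compute gain * pixel ** gamma for every pixel."""
    return [gain * p ** gamma for p in pixels]


channel = [0.2, 0.4, 0.6]
print(adjust_brightness(channel, 0.1))
print(adjust_contrast(channel, 2.0))
print(adjust_gamma(channel, gamma=0.5))
```

A contrast factor above 1 pushes values away from the mean, and a gamma below 1 brightens values in [0, 1].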


# 6. Randomly adjust the brightness by a delta drawn from [-max_delta, max_delta]; a delta of 0 leaves the image unchanged
tf.image.random_brightness(
    image,
    max_delta,
    seed=None
)


# 7. Randomly adjust the hue by a delta drawn from [-max_delta, max_delta]
# max_delta must be in the interval [0, 0.5]
tf.image.random_hue(
    image,
    max_delta,
    seed=None
)


# 8. Randomly adjust the contrast by a factor drawn from [lower, upper]
tf.image.random_contrast(
    image,
    lower,
    upper,
    seed=None
)


# 9. Randomly adjust the saturation by a factor drawn from [lower, upper]
tf.image.random_saturation(
    image,
    lower,
    upper,
    seed=None
)

# 10. Color space conversion
tf.image.rgb_to_grayscale()
tf.image.grayscale_to_rgb()
tf.image.hsv_to_rgb()
tf.image.rgb_to_hsv()  # the image must first be converted to a float (float32) image


# 11. Convert the image data type, e.g. uint8 --> float32, which divides by 255 to scale the values into [0, 1]
tf.image.convert_image_dtype(
    image,
    dtype,
    saturate=False,
    name=None
)


# 12. Per-image standardization (zero mean, unit variance)
tf.image.per_image_standardization(image)
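
The last two operations are easy to check by hand. The sketch below works on a flat list of pixel values: it scales uint8 values to floats by dividing by 255, then standardizes to zero mean and unit variance. The lower bound on the standard deviation, 1 / sqrt(number of elements), follows the TensorFlow API documentation and avoids division by zero on uniform images; the helper names are hypothetical.

```python
import math


def uint8_to_float(pixels):
    """Scale integer values in 0..255 to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]


def per_image_standardization(pixels):
    """Return (x - mean) / adjusted_stddev for every pixel."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]


floats = uint8_to_float([0, 64, 128, 255])
print(per_image_standardization(floats))
```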

5. Bounding box processing

# 1. Draw bounding boxes on a batch of images
tf.image.draw_bounding_boxes(
    images,
    boxes,
    name=None
)
# Parameters
images: A Tensor. Must be one of the following types: float32, half. 4-D with shape [batch, height, width, depth]. A batch of images.

boxes: A Tensor of type float32. 3-D with shape [batch, num_bounding_boxes, 4] containing bounding boxes.

# Return value
A Tensor. Has the same type as images. 4-D with the same shape as images. The batch of input images with bounding boxes drawn on the images.

# Notes on data type and dimensions
images must be floating point, so first convert the image matrix to a float type and add a batch dimension of size 1, e.g.:
batched = tf.expand_dims(
    tf.image.convert_image_dtype(images, tf.float32),
    axis=0
)

# Notes on coordinate order and relative coordinates
The coordinates of each bounding box in boxes are encoded as [y_min, x_min, y_max, x_max]. The bounding box coordinates are floats in [0.0, 1.0] relative to the width and height of the underlying image.

For example, if an image is 100 x 200 pixels (height x width) and the bounding box is [0.1, 0.2, 0.5, 0.9], the upper-left and lower-right corners of the bounding box will be (10, 40) and (50, 180) in (y, x) coordinates.
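
The arithmetic in this example can be checked with a few lines of plain Python. The helper name box_to_pixels is hypothetical; it multiplies the y values by the image height and the x values by the image width, and rounds to whole pixels.

```python
def box_to_pixels(box, image_height, image_width):
    """Convert a relative [y_min, x_min, y_max, x_max] box to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (
        round(y_min * image_height),
        round(x_min * image_width),
        round(y_max * image_height),
        round(x_max * image_width),
    )


# Image of height 100 and width 200, as in the example above.
print(box_to_pixels([0.1, 0.2, 0.5, 0.9], 100, 200))
```

Because y comes first in each pair, passing boxes in (x, y) order is an easy mistake that produces boxes in the wrong place rather than an error.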



# 2. Non-maximum suppression
tf.image.non_max_suppression(
    boxes,
    scores,
    max_output_size,
    iou_threshold=0.5,
    name=None
)
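
Non-maximum suppression keeps the highest-scoring boxes and drops any box that overlaps an already kept box too much. The sketch below is a generic greedy version in plain Python, using the [y_min, x_min, y_max, x_max] box layout and returning the indices of the kept boxes. It is an illustration under the assumption that a box is suppressed when its IoU with a kept box exceeds iou_threshold, not TensorFlow's actual kernel.

```python
def iou(box_a, box_b):
    """Intersection over union of two [y_min, x_min, y_max, x_max] boxes."""
    inter_h = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_w = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_h * inter_w
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, max_output_size, iou_threshold=0.5):
    """Greedily select box indices in order of decreasing score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    selected = []
    for index in order:
        if len(selected) >= max_output_size:
            break
        if all(iou(boxes[index], boxes[kept]) <= iou_threshold for kept in selected):
            selected.append(index)
    return selected


boxes = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.1, 1.0, 1.1], [0.0, 2.0, 1.0, 3.0]]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores, max_output_size=3))
```

Here the second box overlaps the first heavily and is suppressed, while the third box does not overlap and is kept.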



# 3. Generate a single randomly distorted bounding box for an image
tf.image.sample_distorted_bounding_box(
    image_size,
    bounding_boxes,
    seed=None,
    seed2=None,
    min_object_covered=None,
    aspect_ratio_range=None,
    area_range=None,
    max_attempts=None,
    use_image_if_no_bounding_boxes=None,
    name=None
)

6. References

1. https://www.tensorflow.org/api_docs/python/tf/image
2. https://www.tensorflow.org/api_docs/python/tf/contrib/image
