Summary of Image Data Augmentation Techniques

The goal of training a machine learning or deep learning model is to obtain a model that generalizes. This requires that the model not overfit the training dataset; in other words, it should also perform well on unseen data. Data augmentation is one of the many ways to reduce overfitting.

Data augmentation is the process of expanding the amount and variety of data used to train a model. Training on more varied data yields a model that generalizes better. What does "more varied data" mean here? This article covers only image data and walks through the common image augmentation strategies in detail. We will also get hands-on with PyTorch and implement the augmentation techniques most commonly used in computer vision.

Because this is an introduction to data augmentation techniques, a single picture is enough for the demonstrations. Let's first look at the helper code used to display the results.

 import PIL.Image as Image  
 import torch  
 from torchvision import transforms  
 import matplotlib.pyplot as plt  
 import numpy as np  
 import warnings  

 def imshow(img_path, transform):  
     """  
     Show an image next to its augmented version.  
     Param img_path: path of the image  
     Param transform: data augmentation technique to apply  
     """  
     img = Image.open(img_path)  
     fig, ax = plt.subplots(1, 2, figsize=(15, 4))  
     # PIL reports size as (width, height)  
     ax[0].set_title(f'Original image {img.size}')  
     ax[0].imshow(img)  
     img = transform(img)  
     ax[1].set_title(f'Transformed image {img.size}')  
     ax[1].imshow(img)  
     plt.show()

Resize/Rescale

This transform resizes the image to the height and width we want. The code below resizes an image from its original size to 224 x 224. Note that passing both dimensions does not preserve the aspect ratio of the original image.

 path = './kitten.jpeg'  
 transform = transforms.Resize((224, 224))  
 imshow(path, transform)
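Passing a single integer instead, as in transforms.Resize(224), scales the shorter side to that value and keeps the aspect ratio. The arithmetic can be sketched in plain Python. The helper name resized_shape is hypothetical and used only for illustration; it is not part of torchvision.

```python
def resized_shape(width, height, size):
    """Return (new_width, new_height) when the shorter side is scaled to `size`.

    Illustrative helper only (an assumption, not a torchvision function).
    """
    if width <= height:
        # Width is the shorter side: it becomes `size`, height scales with it.
        return size, int(size * height / width)
    # Height is the shorter side.
    return int(size * width / height), size


print(resized_shape(640, 480, 224))  # landscape image
print(resized_shape(480, 640, 224))  # portrait image
```

The landscape and portrait cases mirror each other, which is why a fixed (224, 224) target distorts any image that is not square.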

Cropping

This technique selects a portion of the image and returns it as a new image. For example, CenterCrop returns a crop of the given size taken from the center of the image.

 transform = transforms.CenterCrop((224, 224))  
 imshow(path, transform)
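To make the geometry of a center crop concrete, here is a minimal pure-Python sketch on a tiny image stored as nested lists. The helpers center_crop_box and crop are hypothetical names, the sketch assumes the crop fits inside the image, and the exact rounding may differ from torchvision.

```python
def center_crop_box(width, height, crop_w, crop_h):
    """Return (left, top, right, bottom) of a centered crop window."""
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def crop(pixels, box):
    """Cut the window `box` out of a row-major nested-list image."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


# A 6-wide, 4-high toy image whose pixel value encodes (row, column).
image = [[10 * r + c for c in range(6)] for r in range(4)]
box = center_crop_box(6, 4, 2, 2)
patch = crop(image, box)
print(box, patch)
```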

RandomResizedCrop

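This approach combines cropping and resizing: it crops a random region of the image (with a random area and aspect ratio) and then resizes that crop to the requested size, here 100 x 300 (height x width).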

 transform = transforms.RandomResizedCrop((100, 300))  
 imshow(path, transform)
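The random part of this transform is the choice of crop window. The sketch below shows one way such a window could be sampled, using torchvision's default ranges (scale from 0.08 to 1.0 of the image area, aspect ratio from 3/4 to 4/3). The function sample_crop_box is a hypothetical helper, and its fallback (returning the whole image) is a simplification rather than the library's behavior.

```python
import math
import random


def sample_crop_box(width, height, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3), rng=random):
    """Sample (left, top, crop_w, crop_h) for a random crop window."""
    area = width * height
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(10):
        target_area = area * rng.uniform(*scale)
        # Sample the aspect ratio uniformly in log space.
        aspect = math.exp(rng.uniform(*log_ratio))
        crop_w = round(math.sqrt(target_area * aspect))
        crop_h = round(math.sqrt(target_area / aspect))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            left = rng.randint(0, width - crop_w)
            top = rng.randint(0, height - crop_h)
            return left, top, crop_w, crop_h
    # Simplified fallback (assumption): use the whole image.
    return 0, 0, width, height


print(sample_crop_box(500, 375, rng=random.Random(0)))
```

Whatever window is sampled, the crop is then resized to the fixed output size, so every augmented image has the same shape.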

Flipping

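Flip the image horizontally or vertically. The code below applies a horizontal flip to our image. Note that RandomHorizontalFlip flips with probability p (0.5 by default), so some calls return the image unchanged.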

 transform = transforms.RandomHorizontalFlip()  
 imshow(path, transform)
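The "random" in the name refers to whether the flip happens at all. A minimal sketch on a nested-list image, with hypothetical helper names, shows the idea:

```python
import random


def horizontal_flip(pixels):
    """Mirror each row of a row-major nested-list image."""
    return [row[::-1] for row in pixels]


def random_horizontal_flip(pixels, p=0.5, rng=random):
    """Flip with probability p, otherwise return the image unchanged."""
    return horizontal_flip(pixels) if rng.random() < p else pixels


image = [[1, 2, 3], [4, 5, 6]]
print(horizontal_flip(image))
print(random_horizontal_flip(image, p=1.0))  # p=1.0 always flips
```

Setting p=1.0 is a convenient way to see the effect every time while experimenting.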

Padding

Padding adds a border of the specified width to the edges of the image. A 4-tuple gives the padding for the left, top, right, and bottom edges. Here we pad each side by 50 pixels.

 transform = transforms.Pad((50, 50, 50, 50))  
 imshow(path, transform)
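The effect on the image size is easy to verify by hand: the width grows by left + right and the height by top + bottom. Here is a small sketch of constant padding on a nested-list image; pad is a hypothetical helper, not the torchvision implementation.

```python
def pad(pixels, left, top, right, bottom, fill=0):
    """Surround a row-major nested-list image with a constant border."""
    new_width = left + len(pixels[0]) + right
    rows = [[fill] * new_width for _ in range(top)]
    rows += [[fill] * left + list(row) + [fill] * right for row in pixels]
    rows += [[fill] * new_width for _ in range(bottom)]
    return rows


image = [[1, 2], [3, 4]]
padded = pad(image, 1, 1, 1, 1)
for row in padded:
    print(row)
```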

Rotation

Rotates the image by a random angle. Passing 15 means the angle is sampled uniformly from the range -15 to +15 degrees.

 transform = transforms.RandomRotation(15)  
 imshow(path, transform)
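The two ingredients here are the sampled angle and the rotation about a center point. The sketch below illustrates both with hypothetical helpers. It uses the usual mathematical convention (counter-clockwise, y pointing up), which is an assumption; image coordinates usually have y pointing down.

```python
import math
import random


def sample_angle(degrees, rng=random):
    """Sample an angle uniformly from [-degrees, +degrees]."""
    return rng.uniform(-degrees, degrees)


def rotate_point(x, y, angle_deg, cx=0.0, cy=0.0):
    """Rotate (x, y) about (cx, cy) by angle_deg degrees."""
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))


angle = sample_angle(15, rng=random.Random(0))
print(angle, rotate_point(1.0, 0.0, angle))
```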

Random Affine

This technique applies a random affine transformation that keeps the image center invariant. It has the following parameters:

  • degrees: range of rotation angles

  • translate: maximum horizontal and vertical translation, as fractions of the image width and height

  • scale: range of the scaling factor

  • shear: range of shear angles

  • fill: the color used to fill the area outside the transformed image (named fillcolor in older torchvision releases)

 transform = transforms.RandomAffine(1, translate=(0.5, 0.5), scale=(1, 1), shear=(1, 1), fill=(255, 255, 255))  
 imshow(path, transform)
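An affine transform can be written as a single matrix built from these pieces. The following sketch composes scaling, shear, and rotation about the image center, followed by a translation, using 3x3 homogeneous matrices. It is an illustration under assumed conventions (hypothetical helper names, x-shear only) and does not reproduce torchvision's exact parameterization.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def affine_matrix(center, angle_deg=0.0, translate=(0.0, 0.0), scale=1.0, shear_deg=0.0):
    """Build a matrix that scales, shears, and rotates about `center`, then translates."""
    cx, cy = center
    a = math.radians(angle_deg)
    s = math.tan(math.radians(shear_deg))
    to_origin = [[1, 0, -cx], [0, 1, -cy], [0, 0, 1]]
    scaling = [[scale, 0, 0], [0, scale, 0], [0, 0, 1]]
    shearing = [[1, s, 0], [0, 1, 0], [0, 0, 1]]
    rotation = [[math.cos(a), -math.sin(a), 0], [math.sin(a), math.cos(a), 0], [0, 0, 1]]
    back = [[1, 0, cx + translate[0]], [0, 1, cy + translate[1]], [0, 0, 1]]
    m = to_origin
    for step in (scaling, shearing, rotation, back):
        m = matmul(step, m)  # each step is applied after the previous ones
    return m


def apply(m, x, y):
    """Map the point (x, y) through the affine matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


m = affine_matrix(center=(50, 40), angle_deg=30, scale=1.5, shear_deg=10)
print(apply(m, 50, 40))  # the center stays (up to rounding) where it was
```

Without translation, the center maps to itself no matter which rotation, scale, or shear is chosen, which is what "keeping the center invariant" means.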

Gaussian Blur

The image is blurred with a Gaussian kernel. Here the kernel size is 7 and the standard deviation (sigma) is 3.

 transform = transforms.GaussianBlur(7, 3)  
 imshow(path, transform)
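To see what the kernel size and sigma control, here is a minimal sketch that builds a normalized 1D Gaussian kernel and blurs a single row of pixels. The helper names are hypothetical, and clamping at the borders is a simplifying assumption; library implementations may handle borders differently.

```python
import math


def gaussian_kernel_1d(kernel_size, sigma):
    """Return kernel_size Gaussian weights that sum to 1."""
    half = (kernel_size - 1) / 2
    weights = [math.exp(-((i - half) ** 2) / (2 * sigma ** 2))
               for i in range(kernel_size)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row, kernel):
    """Convolve one row with the kernel, clamping indices at the borders."""
    half = len(kernel) // 2
    out = []
    for i in range(len(row)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), len(row) - 1)
            acc += w * row[j]
        out.append(acc)
    return out


kernel = gaussian_kernel_1d(7, 3)
spike = [0, 0, 0, 0, 255, 0, 0, 0, 0]
print([round(v, 1) for v in blur_row(spike, kernel)])
```

A larger sigma flattens the weights and spreads the single bright pixel over more of its neighbors, which is the blur we see in the image.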

Grayscale

Convert a color image to grayscale. With num_output_channels=3 the result still has three channels, all holding the same values.

 transform = transforms.Grayscale(num_output_channels=3)  
 imshow(path, transform)
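Grayscale conversion is a weighted sum of the red, green, and blue channels. The sketch below uses the common ITU-R 601 luma weights on a single pixel. The helper to_grayscale is hypothetical, and library results may differ by a rounding step.

```python
def to_grayscale(pixel, num_output_channels=1):
    """Convert one (r, g, b) pixel to a luma value, repeated per output channel."""
    r, g, b = pixel
    luma = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (luma,) * num_output_channels


print(to_grayscale((255, 0, 0)))                          # pure red, one channel
print(to_grayscale((255, 0, 0), num_output_channels=3))   # same value in 3 channels
```

Green contributes most to the result and blue least, reflecting how bright each primary appears to the eye.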

Color augmentation, also known as color jitter, modifies the color properties of an image by changing its pixel values. The following methods are all color-related operations.

Brightness

Changes the brightness of the image, so the result is darker or lighter than the original. With brightness=2, the brightness factor is sampled uniformly from the range 0 to 3.

 transform = transforms.ColorJitter(brightness=2)  
 imshow(path, transform)
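A single number b passed as brightness defines the sampling range [max(0, 1 - b), 1 + b] for the brightness factor, and the factor then scales every pixel value. A minimal sketch with hypothetical helper names, assuming 8-bit pixel values:

```python
import random


def brightness_range(brightness):
    """Range the brightness factor is sampled from for a single-number setting."""
    return max(0.0, 1.0 - brightness), 1.0 + brightness


def adjust_brightness(pixel, factor):
    """Scale an (r, g, b) pixel by factor and clip to the 8-bit range."""
    return tuple(min(255, max(0, round(c * factor))) for c in pixel)


low, high = brightness_range(2)
factor = random.uniform(low, high)
print((low, high), adjust_brightness((100, 150, 200), factor))
```

A factor below 1 darkens the image, a factor above 1 brightens it, and 0 gives a black image.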

Contrast

Contrast is the degree of difference between the darkest and brightest parts of an image. It can also be adjusted as an augmentation.

 transform = transforms.ColorJitter(contrast=2)  
 imshow(path, transform)
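One common way to adjust contrast is to push every pixel away from (or pull it toward) the mean intensity of the image. The sketch below does this for a list of grayscale values. The helper adjust_contrast is hypothetical and simplified to a single channel.

```python
def adjust_contrast(gray_pixels, factor):
    """Blend each grayscale value with the image mean; factor 1 keeps the image."""
    mean = sum(gray_pixels) / len(gray_pixels)
    return [min(255, max(0, round(mean + factor * (p - mean))))
            for p in gray_pixels]


pixels = [50, 100, 150]
print(adjust_contrast(pixels, 2))  # values spread further from the mean
print(adjust_contrast(pixels, 0))  # everything collapses to the mean
```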

Saturation

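Saturation describes how intense or pure the colors in a picture are. Low saturation moves the image toward gray, while high saturation makes the colors more vivid.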

 transform = transforms.ColorJitter(saturation=20)  
 imshow(path, transform)
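Saturation can be adjusted by blending each pixel with its own gray value: a factor of 0 yields a gray pixel, 1 leaves the pixel unchanged, and larger factors exaggerate the color. A minimal sketch with a hypothetical helper, assuming the same luma weights as a standard grayscale conversion:

```python
def adjust_saturation(pixel, factor):
    """Blend an (r, g, b) pixel with its gray value and clip to 8 bits."""
    r, g, b = pixel
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return tuple(min(255, max(0, round(gray + factor * (c - gray))))
                 for c in pixel)


pixel = (200, 100, 50)
print(adjust_saturation(pixel, 0))   # fully desaturated
print(adjust_saturation(pixel, 20))  # extreme factor, channels hit the limits
```

With a setting as large as saturation=20, most sampled factors push the channels to 0 or 255, which is why the augmented image can look heavily posterized.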

Hue

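Hue is the attribute that distinguishes one color from another (red, green, blue, and so on). In ColorJitter, a single hue value must lie between 0 and 0.5, and the hue shift is sampled from the range -hue to +hue. A value such as 2 raises an error, so we use the maximum of 0.5 here.

 transform = transforms.ColorJitter(hue=0.5)  
 imshow(path, transform)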
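A hue shift is easiest to picture in HSV space, where hue is a position on a color wheel of length 1. The sketch below shifts one pixel with the standard-library colorsys module. The helper shift_hue is hypothetical and only approximates what an image library does.

```python
import colorsys


def shift_hue(pixel, hue_factor):
    """Shift the hue of an 8-bit (r, g, b) pixel by hue_factor turns of the color wheel."""
    r, g, b = (c / 255 for c in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_factor) % 1.0  # wrap around the color wheel
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


print(shift_hue((255, 0, 0), 0.5))   # red moves to the opposite side of the wheel
print(shift_hue((255, 0, 0), -0.5))  # same result: half a turn either way
```

Because the wheel wraps around, shifts of +0.5 and -0.5 land on the same color, which is why the allowed range stops at 0.5.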

Summary

Varying the images themselves helps the model generalize to unseen data so that it does not overfit the training data. The techniques above are the most commonly used data augmentation methods. Torchvision provides many more, which can be found in its documentation: https://pytorch.org/vision/stable/transforms.html

Origin blog.csdn.net/qq_45368632/article/details/127562439