Image manipulation for digital image processing

Image manipulation plays a vital role in computer vision and image processing, underpinning tasks such as preprocessing, enhancing image quality, and enabling advanced algorithms. In computer vision, operations such as resizing, cropping, adjusting brightness/contrast/gamma, and geometric transformations are fundamental: they allow efficient computation, extraction of regions of interest, normalization of image intensities, and geometric calibration. Likewise, in image processing these operations are used for downsampling, removing unwanted areas, enhancing visibility and quality, and performing geometric operations.

Resizing

Resizing images is common in various scenarios and can serve different purposes, such as fitting an image to a specific size or reducing the file size. Image interpolation and resampling are techniques used in image processing and computer vision to resize or scale images.

Image interpolation

Image interpolation is the process of estimating pixel values at unknown locations within an image based on known pixel values. Different interpolation methods estimate these unknown values in different ways.

Nearest neighbor interpolation assigns the value of an unknown pixel location to the nearest known pixel value. This method is simple but can result in blocking artifacts and loss of detail.

[Figure: Nearest neighbor interpolation]

Bilinear interpolation takes into account the values of the four nearest known pixels and calculates a weighted average to estimate the value of the unknown pixel. It produces smoother results than nearest neighbor interpolation, but may still introduce some blurring.

Bicubic interpolation extends bilinear interpolation by considering more neighboring pixels and using a cubic polynomial to estimate pixel values. This method can provide higher quality results, with smoother transitions and better preservation of image details.
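To make these differences concrete, here is a minimal NumPy sketch (not OpenCV's implementation) that samples a single fractional coordinate from a tiny made-up array using nearest neighbor and bilinear interpolation:

import numpy as np

# A tiny 2x2 grayscale "image" with known pixel values (illustrative only)
img = np.array([[10, 20],
                [30, 40]], dtype=np.float32)

# Fractional sample location (row y, column x) between the four known pixels
y, x = 0.25, 0.75

# Nearest neighbor: round to the closest known pixel
nearest = img[int(round(y)), int(round(x))]  # -> img[0, 1] = 20.0

# Bilinear: weighted average of the four surrounding pixels,
# weighted by how close the sample point is to each of them
y0, x0 = int(np.floor(y)), int(np.floor(x))
y1, x1 = y0 + 1, x0 + 1
wy, wx = y - y0, x - x0
bilinear = (img[y0, x0] * (1 - wy) * (1 - wx) +
            img[y0, x1] * (1 - wy) * wx +
            img[y1, x0] * wy * (1 - wx) +
            img[y1, x1] * wy * wx)  # -> 22.5

print(nearest, bilinear)

Bicubic interpolation extends the same idea to a 4×4 neighborhood with cubic weights. In practice you rarely write this by hand: cv2.resize, used in the example below, applies these methods across the whole image.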

import cv2
import numpy as np


def resize_image(image, scale, interpolation):
    width = int(image.shape[1] * scale)
    height = int(image.shape[0] * scale)
    resized_image = cv2.resize(image, (width, height), interpolation=interpolation)
    return resized_image


SCALE = 4


# Load the image
image_path = "image.png"
image = cv2.imread(image_path)


# Resize the image using nearest neighbor interpolation
nearest_neighbor_resized = resize_image(image, scale=SCALE, interpolation=cv2.INTER_NEAREST)


# Resize the image using bilinear interpolation
bilinear_resized = resize_image(image, scale=SCALE, interpolation=cv2.INTER_LINEAR)


# Resize the image using bicubic interpolation
bicubic_resized = resize_image(image, scale=SCALE, interpolation=cv2.INTER_CUBIC)


Crop

The purpose of cropping an image is to remove unwanted content or focus on a specific area of interest. Cropping allows you to optimize your composition, remove distractions, and highlight important elements of your image. Removing unnecessary or irrelevant parts creates a visually appealing and impactful image that effectively conveys the intended message or theme.

Different methods can be used to determine the cropping area:

  • Manual Selection: Manual cropping involves visually inspecting the image and selecting the desired areas to retain. This approach provides flexibility and allows for subjective decisions to be made based on the artistic judgment of the photographer or designer.

  • Object Detection: Automatic cropping based on object detection algorithms can identify and extract specific objects or subjects in images. These algorithms analyze images and locate objects based on predefined patterns or trained models. Detected objects can be used as cropping regions, ensuring important elements are retained while irrelevant background or surrounding areas are removed (see the sketch after this list).

  • Segmentation: Images can be divided into meaningful regions using image segmentation techniques such as semantic segmentation or instance segmentation. These techniques assign labels or masks to different objects or regions, making it possible to crop specific parts or isolate specific areas of interest.
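As a sketch of the detection-driven approach mentioned above, the snippet below crops to a bounding box in the (x, y, width, height) form most detectors report; the box values here are hard-coded, hypothetical stand-ins for real detector output:

import cv2


def crop_to_bbox(image, bbox, margin=10):
    # Crop to a detector-style bounding box (x, y, width, height),
    # expanded by a small margin and clamped to the image borders
    x, y, w, h = bbox
    img_h, img_w = image.shape[:2]
    x0, y0 = max(x - margin, 0), max(y - margin, 0)
    x1, y1 = min(x + w + margin, img_w), min(y + h + margin, img_h)
    return image[y0:y1, x0:x1]


# Pretend a detector reported this box for the main subject (hypothetical values)
image = cv2.imread("cath.jpeg")
detected_box = (350, 420, 400, 300)  # (x, y, width, height)
cropped = crop_to_bbox(image, detected_box)

Manual cropping, by contrast, is plain array slicing with hand-picked coordinates, as in the example below.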

import cv2


def crop_image(image, x, y, width, height):
    cropped_image = image[y:y+height, x:x+width]
    return cropped_image


# Example usage
image = cv2.imread("cath.jpeg")
cropped_image = crop_image(image, x=400, y=500, width=300, height=200)
cv2.imshow("Cropped Image", cropped_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Adjustment

Brightness and contrast

Adjusting brightness and contrast is essential to enhance the visibility of your image and increase its visual appeal. Adjusting brightness can make an image appear brighter or darker, highlighting details in underexposed or overexposed areas. Contrast adjustment enhances the difference between light and dark areas, making images appear clearer and more dynamic.

By controlling brightness and contrast, you can improve the overall quality and readability of your image, ensuring that important features are clearly visible.

import cv2
import numpy as np


image_path = "cath.jpeg"


def adjust_brightness(image, value):
    # Convert the image to the HSV color space
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)


    # Split the channels
    h, s, v = cv2.split(hsv)


    # Apply the brightness adjustment
    v = cv2.add(v, value)


    # Clamp the values to the valid range of 0-255
    v = np.clip(v, 0, 255)


    # Merge the channels back together
    hsv = cv2.merge((h, s, v))


    # Convert the image back to the BGR color space
    adjusted_image = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)


    return adjusted_image


def adjust_contrast(image, value):
    # Convert the image to the LAB color space
    lab = cv2.cvtColor(image, cv2.COLOR_BGR2LAB)


    # Split the channels
    l, a, b = cv2.split(lab)


    # Apply the contrast adjustment
    l = cv2.multiply(l, value)


    # Clamp the values to the valid range of 0-255
    l = np.clip(l, 0, 255)


    # Merge the channels back together
    lab = cv2.merge((l, a, b))


    # Convert the image back to the BGR color space
    adjusted_image = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)


    return adjusted_image


# Load the image


image = cv2.imread(image_path)


# Adjust the brightness
brightness_adjusted = adjust_brightness(image, value=50)


# Adjust the contrast
contrast_adjusted = adjust_contrast(image, value=2)


# Display the original and adjusted images
cv2.imshow("Original", image)
cv2.imshow("Brightness Adjusted", brightness_adjusted)
cv2.imshow("Contrast Adjusted", contrast_adjusted)
cv2.waitKey(0)
cv2.destroyAllWindows()


Histogram equalization

Histogram equalization is a technique used to enhance contrast. It redistributes pixel intensity values so that they cover a wider range, with the goal of producing a more even distribution of intensities across the image.
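Concretely, the remapping is driven by the cumulative distribution function (CDF) of the gray levels: each level is mapped to a value proportional to the fraction of pixels at or below it. Here is a minimal NumPy sketch of that idea; cv2.equalizeHist, used in the example below, performs essentially the same mapping:

import numpy as np


def equalize_hist_manual(gray):
    # Histogram equalization of an 8-bit grayscale image via its CDF
    hist = np.bincount(gray.ravel(), minlength=256)  # pixel count per gray level
    cdf = hist.cumsum()                              # cumulative distribution
    cdf_min = cdf[cdf > 0][0]                        # first occupied gray level
    # Stretch the CDF so the output intensities span 0-255
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255), 0, 255).astype(np.uint8)
    return lut[gray]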

import cv2
import matplotlib.pyplot as plt


image_path = "cath.jpeg"
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)


# Apply histogram equalization
equalized_image = cv2.equalizeHist(image)


# Calculate histograms
hist_original = cv2.calcHist([image], [0], None, [256], [0, 256])
hist_equalized = cv2.calcHist([equalized_image], [0], None, [256], [0, 256])


# Plot the histograms
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.plot(hist_original, color='b')
plt.title("Original Image Histogram")
plt.xlabel("Pixel Intensity")
plt.ylabel("Frequency")


plt.subplot(1, 2, 2)
plt.plot(hist_equalized, color='r')
plt.title("Equalized Image Histogram")
plt.xlabel("Pixel Intensity")
plt.ylabel("Frequency")


plt.tight_layout()
plt.show()

[Figure: Histograms of the original and equalized images]

# Display the original and equalized images
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(image, cmap='gray')
axes[0].set_title("Original")
axes[0].axis("off")


axes[1].imshow(equalized_image, cmap='gray')
axes[1].set_title("Equalized")
axes[1].axis("off")


plt.tight_layout()
plt.show()

[Figure: Original and equalized images]

Linear scaling

Linear scaling, also known as contrast stretching, is used to adjust the brightness and contrast of an image by linearly mapping original pixel values to a new range. The process involves rescaling pixel values based on the minimum and maximum values in the image to take advantage of the full dynamic range.

Linear scaling allows precise control over adjustments to brightness and contrast. You can define the required intensity range based on your specific requirements.
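Written out directly, the mapping is a single linear formula: each pixel is shifted and rescaled from the image's own [min, max] range to the target [new_min, new_max] range. A minimal NumPy version for an 8-bit grayscale image is shown here; the cv2.convertScaleAbs call in the example below folds the same slope and offset into its alpha and beta arguments:

import numpy as np


def linear_scale(gray, new_min, new_max):
    # Map pixel values linearly from [image min, image max] to [new_min, new_max]
    old_min, old_max = float(gray.min()), float(gray.max())
    scaled = (gray.astype(np.float32) - old_min) * (new_max - new_min) / (old_max - old_min) + new_min
    return np.clip(scaled, 0, 255).astype(np.uint8)

Choosing new_min = 0 and new_max = 255 stretches the image over the full 8-bit range.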

import cv2
import numpy as np
import matplotlib.pyplot as plt


# Load the image
image_path = "cath.jpeg"
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)


# Calculate the minimum and maximum pixel values in the image
min_value = np.min(image)
max_value = np.max(image)


# Define the desired minimum and maximum intensity values for the output image
# (0 and 255 stretch the image over the full 8-bit range)
new_min = 0
new_max = 255


# Perform linear scaling
scaled_image = cv2.convertScaleAbs(image, alpha=(new_max - new_min) / (max_value - min_value),
                                   beta=new_min - min_value * (new_max - new_min) / (max_value - min_value))


# Display the original and scaled images
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(cv2.cvtColor(image, cv2.COLOR_GRAY2RGB))
axes[0].set_title("Original")
axes[0].axis("off")


axes[1].imshow(scaled_image, cmap='gray')
axes[1].set_title("Scaled")
axes[1].axis("off")


plt.tight_layout()
plt.show()

[Figure: Linear scaling]

Gamma correction

Gamma correction is a technique used to correct the non-linear intensity relationship between image input pixel values and display output intensity. It takes into account the nonlinear response of the human visual system to light and aims to achieve more accurate and perceptually consistent image representations.

The relationship between pixel values captured by a camera or stored in an image file and human-perceived brightness is non-linear. In other words, a linear increase in pixel value does not result in a linear increase in perceived brightness. This nonlinear relationship is due to the response characteristics of the imaging sensor and the human visual system.

Gamma correction is based on a parameter called gamma (γ). Gamma represents the relationship between input pixel values and display output intensity. It is a measure of the non-linear mapping between the two.

Gamma correction applies a power-law transformation to pixel values, adjusting intensities to correct for the non-linear response. For an 8-bit image, the formula is as follows:

corrected value = 255 × (input value / 255) ^ (1 / gamma)

Here, the input value is the original pixel value and the corrected value is the adjusted pixel value.
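For example, with gamma = 2.2 an input value of 64 maps to 255 × (64 / 255) ^ (1 / 2.2) ≈ 136, so darker mid-tones are brightened; a gamma below 1 has the opposite effect and darkens the image.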

The main function of gamma correction is to compensate for non-linear intensity relationships and ensure that colors and details in the image are accurately represented. Here’s how gamma correction plays an important role:

  • Brightness compensation: Gamma correction helps compensate for differences in brightness response between capture and display devices. It ensures that the perceived brightness levels in the displayed image are consistent with the original scene.

  • Contrast enhancement: Gamma correction can enhance the contrast of an image by redistributing tonal values. Depending on the gamma value, it can effectively emphasize details in dark or light areas of an image.

  • Color accuracy: Gamma correction helps achieve accurate color representation. By adjusting the gamma value, you can improve color reproduction, ensuring colors look more natural and faithful to the original scene.

  • Tone mapping: In high dynamic range (HDR) imaging, gamma correction is often used as part of a tone mapping technique to map the wide dynamic range of a scene to the limited dynamic range of the display device. Gamma correction helps preserve detail in shadow and highlight areas, preventing information loss.

  • Perceptual consistency: Gamma correction aims to achieve perceptually consistent images, where the displayed intensities are consistent with human visual perception. By correcting for the non-linear response, gamma correction ensures that images appear visually pleasing and realistic to the viewer.

import cv2
import numpy as np


image_path = "cath.jpeg"


def adjust_gamma(image, gamma):
    # Build a lookup table mapping each input pixel value to its gamma-corrected value,
    # following the formula above: corrected = 255 * (input / 255) ** (1 / gamma)
    inv_gamma = 1.0 / gamma
    lookup_table = np.array([((i / 255.0) ** inv_gamma) * 255 for i in np.arange(0, 256)]).astype(np.uint8)


    # Apply gamma correction using the lookup table
    gamma_corrected = cv2.LUT(image, lookup_table)


    return gamma_corrected


# Load the image


image = cv2.imread(image_path)


# Adjust the gamma value (with this formula, gamma > 1 brightens the image, gamma < 1 darkens it)
gamma_value = 1.5
gamma_corrected = adjust_gamma(image, gamma_value)


# Display the original and gamma-corrected images
cv2.imshow("Original", image)
cv2.imshow("Gamma Corrected", gamma_corrected)
cv2.waitKey(0)
cv2.destroyAllWindows()

[Figure: Gamma correction]

Geometric transformations

Geometric transformations change the perspective, orientation, and spatial relationships of an image. These transformations provide basic tools for tasks such as image alignment, object detection, image registration, and more.

Translation

Translation is a basic geometric transformation that involves moving an image a specified distance horizontally or vertically.

import cv2
import numpy as np


image_path = "cath.jpeg"
image = cv2.imread(image_path)


# Define the translation matrix
tx = 100  # pixels to shift in the x-axis
ty = 50  # pixels to shift in the y-axis
translation_matrix = np.float32([[1, 0, tx], [0, 1, ty]])


# Apply translation
translated_image = cv2.warpAffine(image, translation_matrix, (image.shape[1], image.shape[0]))


# Display the original and translated images
cv2.imshow("Original", image)
cv2.imshow("Translated", translated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

[Figure: Translation]

Scaling

Scaling refers to adjusting the size of an image, either by applying a uniform scaling factor to all dimensions or by using different scaling factors for different dimensions.

# Continuing from the previous example (cv2 imported and `image` already loaded)
# Define the scaling factors
scale_x = 1.5  # scaling factor for the x-axis
scale_y = 0.8  # scaling factor for the y-axis


# Apply scaling
scaled_image = cv2.resize(image, None, fx=scale_x, fy=scale_y, interpolation=cv2.INTER_LINEAR)


# Display the original and scaled images
cv2.imshow("Original", image)
cv2.imshow("Scaled", scaled_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

[Figure: Scaling]

Rotation

Rotation is a geometric transformation that involves changing the orientation of an image by a specified angle about a center point.

# Continuing from the previous example (cv2 imported and `image` already loaded)
# Define the rotation angle (in degrees, counter-clockwise)
angle = 30


# Perform rotation
rows, cols = image.shape[:2]
rotation_matrix = cv2.getRotationMatrix2D((cols / 2, rows / 2), angle, 1)
rotated_image = cv2.warpAffine(image, rotation_matrix, (cols, rows))


# Display the original and rotated images
cv2.imshow("Original", image)
cv2.imshow("Rotated", rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

[Figure: Rotation]

·  END  ·


This article is for learning and communication only. If there is any infringement, please contact the author to delete it.

Origin blog.csdn.net/weixin_38739735/article/details/134978649