Practical Guide to Python Image Processing: 1-5

Original text: Hands-On Image Processing with Python

License: CC BY-NC-SA 4.0

Translator: Feilong

This article comes from the [ApacheCN Computer Vision Translation Collection] and uses a machine translation post-editing (MTPE) workflow to improve efficiency as much as possible.

When others say you have no bottom line, you'd better truly have none; when others say you've done something, you'd better have truly done it.

1. Introduction to image processing

As the name suggests, image processing can simply be defined as the processing (analysis and manipulation) of images with algorithms on a computer (through code). It has several different aspects, such as image storage, representation, information extraction, manipulation, enhancement, restoration, and interpretation. In this chapter, we will provide a basic introduction to all of these different aspects of image processing and introduce hands-on image processing with the Python libraries. All of the code examples in this book will use Python 3.

We'll start by defining what image processing is and its applications. Then we will find out. . .

What is image processing and some applications

Let's first define what an image is, how it is stored on the computer, and how to process it using Python.

What is an image and how is it stored on the computer

Conceptually, an image in its simplest form (single channel; for example, a binary or monochrome, grayscale, or black-and-white image) is a 2D function f(x,y) that maps pairs of coordinates to integers/reals, where the function value at each point represents the intensity/color at that point. Each point is called a pixel (picture element). An image can also have multiple channels (for example, a color RGB image, where the color is represented by three channels: red, green, and blue). For a color RGB image, each pixel at coordinate (x, y) can be represented by the triplet (r_{x,y}, g_{x,y}, b_{x,y}).

To be able to process it on a computer, an image *f(x,y)* needs to be digitized both spatially and in amplitude. . .

What is image processing?

Image processing refers to the automatic processing, manipulation, analysis, and interpretation of images using algorithms and codes on computers. It has applications in many disciplines and fields of science and technology such as television, photography, robotics, remote sensing, medical diagnosis, and industrial testing. Social networking sites like Facebook and Instagram are prime examples of industries that require the use/innovation of many image processing algorithms to process the images we upload. We have become accustomed to using these sites in our daily lives, uploading tons of images every day.

In this book, we will use some Python packages to process images. First, we will use a set of libraries for classic image processing: starting from extracting image data, transforming the data using some algorithms, using library functions for preprocessing, enhancement, restoration, representation (using descriptors), segmentation, classification, detection and identification (objects) to analyze, understand, and better interpret data. Next, we will use another set of libraries for deep learning based image processing, a technique that has become very popular in the past few years.

Some applications of image processing

Some typical applications of image processing include medical/biological fields (e.g., X-ray and CT scans), computational photography (Photoshop), fingerprint authentication, face recognition, etc.

Image processing pipeline

The following steps describe the basic steps in the image processing pipeline:

  1. **Capture and storage**: An image needs to be captured (for example, using a camera) and stored as a file (for example, a JPEG file) on some device (for example, a hard disk).

  2. **Loading into memory and saving to disk**: The image needs to be read from disk into memory and stored using some data structure (for example, a numpy ndarray), and the data structure will need to be serialized into an image file later, possibly after some algorithms have been run on the image.

  3. **Manipulation, enhancement, and restoration**: We need to run some preprocessing algorithms to do the following:

    • Run some transformations on the image (sampling and manipulation; for example, grayscale conversion)
    • Enhance the quality of the image (filtering; for example, deblurring)
    • Restore the image from degradation such as noise
  4. **Segmentation**: The image needs to be segmented in order to extract the objects of interest.

  5. **Information extraction/representation**: The image needs to be represented in some alternative form; for example, one of the following:

    • Some hand-crafted feature descriptors can be computed from the image (for example, HOG descriptors, with classical image processing)
    • Some features can be automatically learned from the image (for example, the weights and bias values learned in the hidden layers of a neural network, with deep learning)

    • The image is then represented using these alternative representations
  6. **Image understanding/interpretation**: This representation is used to better understand the image through the following:

    • Image classification (for example, whether an image contains a human object)
    • Object detection (for example, finding the location of the car objects in an image, with bounding boxes)

The following figure describes the different steps in image processing:

The diagram below represents the different modules we will use for different image processing tasks:

In addition to these libraries, we will use the following libraries:

  • scipy.ndimage and opencv for different image processing tasks
  • scikit-learn for classic machine learning
  • tensorflow and keras for deep learning

Setting up different image processing libraries in Python

The next few paragraphs describe how to install different image processing libraries and set up the environment to write code to process images using classic image processing techniques in Python. In the final chapters of the book, when we use deep learning-based methods, we will need to use different settings.

Install pip

We will use the pip or pip3 tool to install the libraries, so we first need to install pip if it isn't already installed. As noted at https://pip.pypa.io/en/stable/installing/#do-i-need-to-install-pip, pip is already installed if we are using Python >= 3.4 downloaded from python.org, or if we are working in a virtual environment (https://packaging.python.org/tutorials/installing-packages/#creating-virtual-environments) created with virtualenv (https://packaging.python.org/key_projects/#virtualenv) or pyvenv (https://packaging.python.org/key_projects/#venv). We just need to make sure to upgrade pip (https://pip.pypa.io/en/stable/installing/#upgrading-pip). How to install pip on different operating systems or platforms can be found here: https://stackoverflow.com/questions/6587507/how-to-install-pip-with-python-3 .

Install some image processing libraries in Python

In Python, there are many libraries available for image processing. The ones we will be using are: NumPy, SciPy, scikit-image, PIL (Pillow), OpenCV, scikit-learn, SimpleITK, and Matplotlib.

The matplotlib library will primarily be used for display, and numpy will be used for storing an image. The scikit-learn library will be used for building machine learning models for image processing, and scipy will be used mainly for image enhancement. The scikit-image, mahotas, and opencv libraries will be used for different image processing algorithms.

The following code block shows how the libraries that we are going to use can be installed with pip from. . .
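As a minimal sketch, the installation could look like the following from a Jupyter notebook cell (the package names are as published on PyPI; drop the leading ! to run the same commands from a shell):

!pip install numpy scipy scikit-image pillow scikit-learn matplotlib
!pip install opencv-python SimpleITK mahotas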

Install the Anaconda distribution

We also recommend downloading and installing the latest version of the Anaconda distribution; this will eliminate the need to explicitly install many Python packages.

More about installing Anaconda for different OSes can be found at https://conda.io/docs/user-guide/install/index.html.

Install Jupyter Notebook

We will use Jupyter notebooks to write Python code. So, we first need to install the jupyter package with pip install jupyter from the command prompt, and then launch the Jupyter Notebook app in the browser using jupyter notebook. From there, we can create new Python notebooks and choose a kernel. If we use Anaconda, we do not need to install Jupyter explicitly; the latest Anaconda distribution comes with Jupyter.

More about running Jupyter notebooks can be found at http://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/execute.html.

We can even install Python packages from within a notebook cell; for example, we can install scipy with the command !pip install scipy.

For more information …

Image I/O and display using Python

Images are stored on disk as files, so reading and writing images from files are disk I/O operations. These can be achieved in a variety of ways using different libraries; some of them are shown in this section. Let's first import all required packages:

# for inline image display inside notebook
%matplotlib inline
import numpy as np
from PIL import Image, ImageFont, ImageDraw
from PIL.ImageChops import add, subtract, multiply, difference, screen
import PIL.ImageStat as stat
from skimage.io import imread, imsave, imshow, show, imread_collection, imshow_collection
from skimage import color, viewer, exposure, img_as_float, data
from skimage.transform import SimilarityTransform, warp, swirl
from skimage.util import invert, random_noise, montage
import matplotlib.image as mpimg
import matplotlib.pylab as plt
from scipy.ndimage import affine_transform, zoom
from scipy import misc

Read, save and display images using PIL

PIL's open() function reads an image from disk into an Image object, as shown in the following code. The image is loaded as an object of the PIL.PngImagePlugin.PngImageFile class, and we can use properties such as width, height, and mode to find the size (width x height in pixels, that is, the resolution) and the mode of the image:

im = Image.open("../images/parrot.png") # read the image, provide the correct pathprint(im.width, im.height, im.mode, im.format, type(im))# 453 340 RGB PNG <class 'PIL.PngImagePlugin.PngImageFile'>im.show() # display the image 

Here is the output of the previous code:

The following code block shows how to use the PIL convert() function to convert. . .

Provide the correct path to the image on disk

We recommend creating a folder (subdirectory) to store the images to be used for processing (for example, for the Python code examples, we have used images stored in a folder named images) and then providing the path to that folder to access the images, to avoid file not found exceptions.

Read, save and display images using Matplotlib

The next code block shows how to read an image into a floating-point numpy ndarray using the imread() function in matplotlib.image. The pixel values are represented as real values between 0 and 1:

im = mpimg.imread("../images/hill.png")  # read the image from disk as a numpy ndarray
print(im.shape, im.dtype, type(im))      # this image contains an α channel, hence num_channels = 4
# (960, 1280, 4) float32 <class 'numpy.ndarray'>
plt.figure(figsize=(10,10))
plt.imshow(im) # display the image
plt.axis('off')
plt.show()

The following image shows the output of the previous code:

Next code snippet. . .

Interpolation when displaying with Matplotlib's imshow()

Matplotlib's imshow() function provides many different types of interpolation methods for plotting an image. These are particularly useful when the image to be plotted is small. Let's use the small 50 x 50 lena image shown below to see the effects of plotting with different interpolation methods:

The next code block demonstrates how to use the different interpolation methods with imshow():

im = mpimg.imread("../images/lena_small.jpg") # read the image from disk as a numpy ndarray
methods = ['none', 'nearest', 'bilinear', 'bicubic', 'spline16', 'lanczos']
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(15, 30),
                         subplot_kw={'xticks': [], 'yticks': []})
fig.subplots_adjust(hspace=0.05, wspace=0.05)
for ax, interp_method in zip(axes.flat, methods):
 ax.imshow(im, interpolation=interp_method)
 ax.set_title(str(interp_method), size=20)
plt.tight_layout()
plt.show()

The following image shows the output of the previous code:


Read, save and display images using scikit-image

The next block of code uses scikit-image's imread() function to read an image into a numpy ndarray of type uint8 (8-bit unsigned integer), so the pixel values lie between 0 and 255. It then uses the color module's rgb2hsv() function to convert the color RGB image into an HSV image (changing the image type, or mode, which will be discussed later). Next, it changes the saturation (colorfulness) of all of the pixels to a constant value, keeping the hue and value channels unchanged. The image is then converted back into RGB mode with the hsv2rgb() function to create a new image, which is then saved and displayed:

im = imread("../images/parrot.png") # read ...

Astronaut dataset using scikit image

The code block below shows how to use the data module to load the astronaut image from the scikit-image library's collection of datasets. The module contains some other popular datasets, such as camera (a cameraman image), which can be loaded similarly:

im = data.astronaut() 
imshow(im), show()

The following image shows the output of the previous code:

Read and display multiple images at once

We can use the scikit-image io module's imread_collection() function to load into a collection all images whose filenames match a specific pattern, and then display them simultaneously with the imshow_collection() function. The code is left as an exercise for the reader; a minimal sketch follows.
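As a starting point, the sketch below assumes the images folder contains PNG files (imread_collection() and imshow_collection() were imported earlier):

ic = imread_collection('../images/*.png') # load all images matching the wildcard pattern
imshow_collection(ic) # display the collection in a grid
show()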

Read, save and display images using scipy misc

The scipy misc module can also be used for image I/O and display. The following sections demonstrate how to use the misc module's functions.

Face dataset with scipy.misc

The next code block shows how to display the face dataset from the misc module:

im = misc.face() # load the raccoon's face image
misc.imsave('face.png', im) # uses the Image module (PIL)
plt.imshow(im), plt.axis('off'), plt.show()

The following figure, the output of the previous code, shows the misc module's face image:


We can use misc.imread() to read an image from disk, as the next code block shows:

im = misc.imread('../images/pepper.jpg')
print(type(im), im.shape, im.dtype)
# <class 'numpy.ndarray'> (225, 225, 3) uint8

In SciPy 1.0.0, the I/O function imread() has been deprecated and will be removed. . .

Dealing with different image types and file formats, and performing basic image manipulations

In this section, we will discuss different image manipulation functions (point transformations and geometric transformations) and how to deal with images of different types. Let's start with this. . .

Dealing with different image types and file formats

An image can be saved in different file formats and in different modes (types). Let's discuss how to handle images of different file formats and types with the Python libraries.

File formats

Image files can come in different formats. Some of the popular ones include BMP (8-bit, 24-bit, 32-bit), PNG, JPG (JPEG), GIF, PPM, PNM, and TIFF. We do not need to worry about the specific format of an image file (or how its metadata is stored) to extract data from it. The Python image processing libraries will read the image and extract the data, along with some other useful information for us (for example, the image size, type/mode, and data type).

Converting from one file format to another

Using PIL, we can read an image in one file format and save it in another; for example, from PNG to JPG, as shown below:

im = Image.open("../images/parrot.png")print(im.mode)  #  RGBim.save("../images/parrot.jpg")

But if the PNG file is in RGBA mode, we need to convert it into RGB mode first and then save it as a JPG; otherwise, an error will be raised. The next code block shows how to first convert and then save:

im = Image.open("../images/hill.png")print(im.mode)# RGBAim.convert('RGB').save("../images/hill.jpg") # first convert to RGB mode

Image type (mode)

Images can be of different types:

  • Each pixel of a single-channel image is represented by a single value:
    • Binary (monochrome) image (each pixel is represented by a single bit, 0 or 1)
    • Grayscale image (each pixel can be represented with 8 bits, with a value typically between 0 and 255)
  • Each pixel of a multi-channel image is represented by a tuple of values:
    • Three-channel images; for example, the following:
      • Each pixel of an RGB image is represented by a triplet of (r, g, b) values, representing the red, green, and blue channel color values of the pixel
      • Each pixel of an HSV image is represented by a triplet of (h, s, v) values, representing the hue (color), saturation (how much the color is mixed with white), and value (how much the color is mixed with black, that is, brightness) channel values of the pixel. The HSV model describes colors in a manner similar to how the human eye perceives them
    • Four-channel images; for example, an RGBA image, where each pixel is represented by a quadruple of (r, g, b, α) values, the last channel representing the transparency

Convert from one image mode to another

We can convert an RGB image into a grayscale image while reading the image itself. The following code does exactly that:

im = imread("images/parrot.png", as_gray=True)print(im.shape)#(362L, 486L)

Note that some information may be lost for certain color images when converting into grayscale. The following code shows an example with an Ishihara plate, used to detect color blindness. This time, the color module's rgb2gray() function is used, and the color and grayscale images are shown side by side. As can be seen in the figure below, the number 8 is barely visible in the grayscale version:

im = imread("../images/Ishihara.png")im_g = color.rgb2gray(im)plt.subplot(121), plt.imshow(im, ...

Some color spaces (channels)

Here are a few popular channels/color spaces for an image: RGB, HSV, XYZ, YUV, YIQ, YPbPr, YCbCr, and YDbDr. We can use a linear mapping to go from one color space to another. The following matrix represents the linear mapping from the RGB to the YIQ color space:
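A commonly cited form of this mapping (the standard NTSC coefficients; the exact values vary slightly between sources) is the following:

$$\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.596 & -0.274 & -0.322 \\ 0.211 & -0.523 & 0.311 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$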

Convert from one color space to another

We can use library functions to convert from one color space to another; for example, the following code converts from the RGB color space to the HSV color space:

im = imread("../images/parrot.png")im_hsv = color.rgb2hsv(im)plt.gray()plt.figure(figsize=(10,8))plt.subplot(221), plt.imshow(im_hsv[...,0]), plt.title('h', size=20), plt.axis('off')plt.subplot(222), plt.imshow(im_hsv[...,1]), plt.title('s', size=20), plt.axis('off')plt.subplot(223), plt.imshow(im_hsv[...,2]), plt.title('v', size=20), plt.axis('off')plt.subplot(224), plt.axis('off')plt.show()

The image below shows the h (hue or color: the dominant wavelength of the reflected light), s (saturation or chroma), and v (value or brightness/…) channels:

Data structure used to store images

As we have already discussed, PIL uses the Image object to store an image, whereas scikit-image uses the numpy ndarray data structure to store the image data. The next section describes how to convert between these two data structures.

Convert image data structure

The following code block shows how to convert a PIL Image object into a numpy ndarray (the structure used by scikit-image):

im = Image.open('../images/flowers.png') # read image into an Image object with PIL
im = np.array(im) # create a numpy ndarray from the Image object
imshow(im) # use skimage imshow to display the image
plt.axis('off'), show()

The next image shows the output of the previous code, which is an image of a flower:


The following code block shows how to convert a numpy ndarray into a PIL Image. When run, the code displays the same output as above:

im = imread('../images/flowers.png') ...
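The elided code presumably mirrors the previous block in reverse; a minimal sketch of the conversion could be the following:

im = imread('../images/flowers.png') # read image into a numpy ndarray with scikit-image
pil_im = Image.fromarray(im) # create a PIL Image object from the ndarray
pil_im.show() # display with PIL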

basic image processing

Different Python libraries are available for basic image manipulation. Almost all of them store an image as a numpy ndarray (for example, a two-dimensional array for a grayscale image and a three-dimensional array for an RGB image). The following figure shows the positive x and y directions for the color lena image (the origin is the top-left corner of the image's 2D array):

Image processing based on numpy array slices

The next code block shows how to create a circular mask on the lena image using slicing and masking of a numpy array:

lena = mpimg.imread("../images/lena.jpg") # read the image from disk as a numpy ndarray
print(lena[0, 40])
# [180  76  83]
# print(lena[10:13, 20:23, 0:1]) # slicing
lx, ly, _ = lena.shape
X, Y = np.ogrid[0:lx, 0:ly]
mask = (X - lx / 2) ** 2 + (Y - ly / 2) ** 2 > lx * ly / 4
lena[mask,:] = 0 # masks
plt.figure(figsize=(10,10))
plt.imshow(lena), plt.axis('off'), plt.show()

The image below shows the output of the code:

Simple image morphing - alpha blending of two images using cross-dissolve

The code block below shows how to start with one face image (image1, Messi's face) and end with another (image2, Ronaldo's face) by computing a linear combination of the two image numpy ndarrays, given by the following formula:
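$$out = (1 - \alpha) \cdot image1 + \alpha \cdot image2$$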

We achieve this by iteratively increasing α from 0 to 1:

im1 = mpimg.imread("../images/messi.jpg") / 255 # scale RGB values in [0,1]
im2 = mpimg.imread("../images/ronaldo.jpg") / 255
i = 1
plt.figure(figsize=(18,15))
for alpha in np.linspace(0,1,20):
 plt.subplot(4,5,i)
 plt.imshow((1-alpha)*im1 + alpha*im2)
 plt.axis('off')
 i += 1
plt.subplots_adjust(wspace=0.05, hspace=0.05)
plt.show()

The next image shows the sequence of alpha-blended images created with the previous code, which cross-dissolves the image of Messi's face into Ronaldo's. As can be seen from the in-between images in the sequence, simple blending does not give a very smooth face morph. In later chapters, we will see more advanced image morphing techniques:

PIL-based image processing

PIL provides us with many functions to manipulate an image; for example, changing pixel values using point transformations or performing geometric transformations on an image. Let's first load the parrot PNG image, as shown in the following code:

im = Image.open("../images/parrot.png")        # open the image, provide the correct pathprint(im.width, im.height, im.mode, im.format) # print image size, mode and format# 486 362 RGB PNG

The next few sections will describe how to use PIL for different types of image processing.

Crop image

We can use the crop() function with the desired rectangle argument to crop the corresponding area from an image, as shown in the following code:

im_c = im.crop((175,75,320,200)) # crop the rectangle given by (left, top, right, bottom) from the image
im_c.show()

The image below shows the cropped image created using the previous code:

Resize image

In order to increase or decrease the size of an image, we can use the resize() function, which internally upsamples or downsamples the image, respectively. This will be discussed in detail in the next chapter.

Resize to larger image

Let's start with a small clock image and create a larger one. The code snippet below shows the small clock image we will start with:

im = Image.open("../images/clock.jpg")print(im.width, im.height)# 107 105im.show()

The output of the previous code, the small clock image, looks like this:

The next line of code shows how to use the resize() function. . .
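A minimal sketch of the elided line (the resampling filter is left at the library default, which varies across Pillow versions):

im_large = im.resize((im.width*5, im.height*5)) # 5x larger along each dimension
print(im_large.width, im_large.height)
im_large.show()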

Negative image

We can use the point() function to transform each pixel value with a single-argument function. We can use it to negate an image, as shown in the next code block. Pixel values are represented with 1-byte unsigned integers, which is why subtracting each value from the maximum possible value (255) is exactly the point operation needed to obtain the inverted image:

im = Image.open("../images/parrot.png") 
im_t = im.point(lambda x: 255 - x)
im_t.show()

The image below shows the negative image, the output of the previous code:

Convert image to grayscale

We can convert an RGB color image into a grayscale image using the convert() function with the 'L' argument, as shown in the following code:

im_g = im.convert('L')   # convert the RGB color image to a grayscale image

We will use this image in the next few grayscale transformations.

Some grayscale transformations

Here we explore a few transformations, in which a function is applied to each individual pixel value of the input image to produce the corresponding pixel value of the output image. The point() function can be used for this. Each pixel value is between 0 and 255 (inclusive).

Log transformation

The log transformation can be used to effectively compress an image that has a large dynamic range of pixel values. The following code uses a point transformation for the logarithmic transformation. As can be seen, the range of pixel values shrinks, with the brighter pixels from the input image becoming darker, thus compressing the dynamic range:

im_g.point(lambda x: 255*np.log(1+x/255)).show()

The following image shows the output log transformation image produced by running the previous line of code:

Power-law transformation

This transformation is used for gamma correction of an image. The next line of code shows how to use the point() function for a power-law transformation, with γ = 0.6:

im_g.point(lambda x: 255*(x/255)**0.6).show()

The image below shows the output power law transformed image produced by running the previous line of code:

Some geometric transformations

In this section, we discuss another set of transformations, accomplished by multiplying appropriate matrices (often expressed in homogeneous coordinates) with the image coordinates. These transformations change the geometric orientation of the image, hence the name.

Reflecting an image

We can use the transpose() function to reflect an image about the horizontal or vertical axis:

im.transpose(Image.FLIP_LEFT_RIGHT).show() # reflect about the vertical axis 

The image below shows the output image produced by running the previous line of code:


Rotate image

We can use the rotate() function to rotate. . .

Change the pixel values of an image

We can use the putpixel() function to change a pixel value in an image. Next, let's discuss a popular application of the function: adding noise to an image.

Add salt and pepper noise to images

We can add some salt-and-pepper noise to an image by randomly selecting a few pixels from the image and then setting about half of those pixel values to black and the other half to white. The next code snippet shows how to add the noise:

# choose 5000 random locations inside image
im1 = im.copy() # keep the original image, create a copy 
n = 5000
x, y = np.random.randint(0, im.width, n), np.random.randint(0, im.height, n)
for (x,y) in zip(x,y):
 im1.putpixel((x, y), ((0,0,0) if np.random.rand() < 0.5 else (255,255,255))) # salt-and-pepper noise
im1.show()

The image below shows the output noise image generated by running the previous code:

Draw a picture

We can draw lines or other geometric shapes on an image (for example, the ellipse() function draws an ellipse) with the PIL.ImageDraw module, as shown in the next Python code snippet:

im = Image.open("../images/parrot.png")draw = ImageDraw.Draw(im)draw.ellipse((125, 125, 200, 250), fill=(255,255,255,128))del drawim.show()

The following image shows the output image generated by running the previous code:

Draw text on image

We can add text to an image using the text() function in the PIL.ImageDraw module, as shown in the next Python code snippet:

draw = ImageDraw.Draw(im)
font = ImageFont.truetype("arial.ttf", 23) # use a truetype font
draw.text((10, 5), "Welcome to image processing with python", font=font)
del draw
im.show()

The following image shows the output image generated by running the previous code:

Create thumbnail

We can create a thumbnail from an image with the thumbnail() function, as shown in the following:

im_thumbnail = im.copy() # need to copy the original image first
im_thumbnail.thumbnail((100,100))
# now paste the thumbnail on the image
im.paste(im_thumbnail, (10,10))
im.save("../images/parrot_thumb.jpg")
im.show()

This figure shows the output image produced by running the previous code snippet:

Calculate basic statistics of an image

We can use the ImageStat module (imported here as stat) to compute basic statistics (mean, median, standard deviation of the pixel values of the different channels, and so on) of an image, as follows:

s = stat.Stat(im)
print(s.extrema) # maximum and minimum pixel values for each channel R, G, B
# [(4, 255), (0, 255), (0, 253)]
print(s.count)
# [154020, 154020, 154020]
print(s.mean)
# [125.41305674587716, 124.43517724970783, 68.38463186599142]
print(s.median)
# [117, 128, 63]
print(s.stddev)
# [47.56564506512579, 51.08397900881395, 39.067418896260094]

Plot a histogram of the pixel values for the RGB channels of an image

The histogram() function can be used to compute the histogram (a table of pixel values versus frequencies) of each channel and return the concatenated output (for example, for an RGB image, the output contains 3 x 256 = 768 values):

pl = im.histogram()
plt.bar(range(256), pl[:256], color='r', alpha=0.5)
plt.bar(range(256), pl[256:2*256], color='g', alpha=0.4)
plt.bar(range(256), pl[2*256:], color='b', alpha=0.3)
plt.show()

The following image shows the R, G, and B color histogram plotted by running the previous code:

Separate RGB channels of an image

We can use the split() function to separate the channels of a multi-channel image, as shown in the following code for an RGB image:

ch_r, ch_g, ch_b = im.split() # split the RGB image into 3 channels: R, G and B
# we shall use matplotlib to display the channels
plt.figure(figsize=(18,6))
plt.subplot(1,3,1); plt.imshow(ch_r, cmap=plt.cm.Reds); plt.axis('off')
plt.subplot(1,3,2); plt.imshow(ch_g, cmap=plt.cm.Greens); plt.axis('off')
plt.subplot(1,3,3); plt.imshow(ch_b, cmap=plt.cm.Blues); plt.axis('off')
plt.tight_layout()
plt.show() # show the R, G, B channels

The image below shows the three output images created for each of the R (red), G (green), and B (blue) channels by running the previous code:

Combine multiple channels of an image

We can use the merge() function to combine the channels of a multi-channel image, as shown in the following code, where the color channels obtained by splitting the parrot RGB image are merged after swapping the red and blue channels:

im = Image.merge('RGB', (ch_b, ch_g, ch_r)) # swap the red and blue channels obtained last time with split()
im.show()

The following image shows the RGB output image created by running the previous code snippet to merge the B, G, and R channels:

Alpha-blending two images

The blend() function can be used to create a new image by interpolating between two given images (of the same size), using a constant α. Both images must have the same size and mode. The output image is given by the following:

out = image1 * (1.0 - α) + image2 * α

If α is 0.0, a copy of the first image is returned. If α is 1.0, a copy of the second image is returned. The next code snippet shows an example:

im1 = Image.open("../images/parrot.png")
im2 = Image.open("../images/hill.png")
# 453 340 1280 960 RGB RGBA
im1 = im1.convert('RGBA') # two images have different modes, must be converted to the same mode
im2 = im2.resize((im1.width, im1.height), Image.BILINEAR) # two images have different sizes, must be converted to the same size
im = Image.blend(im1, im2, alpha=0.5).show()

The image below shows the output image generated by blending the first two images:

Overlay two images

One image can be superimposed onto another by multiplying two input images (of the same size) pixel by pixel. The next code snippet shows an example:

im1 = Image.open("../images/parrot.png")im2 = Image.open("../images/hill.png").convert('RGB').resize((im1.width, im1.height))multiply(im1, im2).show()

The following image shows the output image produced by running the previous code snippet to overlay two images:


Add two images

The next code snippet shows how to generate an image by adding two input images (of the same size) pixel by pixel:

add(im1, im2).show()

The image below shows the output image produced by running the previous code snippet:

Calculate the difference between two images

The following code returns the pixel-by-pixel absolute difference between two images. The difference image can be used to detect changes between the two. For example, the next code block shows how to compute the difference image from two consecutive frames of a video recording of a 2018 FIFA World Cup match (from YouTube):

from PIL.ImageChops import subtract, multiply, screen, difference, add
im1 = Image.open("../images/goal1.png") # load two consecutive frame images from the video
im2 = Image.open("../images/goal2.png")
im = difference(im1, im2)
im.save("../images/goal_diff.png")
plt.subplot(311)
plt.imshow(im1)
plt.axis('off')
plt.subplot(312)
plt.imshow(im2)
plt.axis('off')
plt.subplot(313) ...

Subtract two images and superimpose two image negatives

The subtract() function can be used to first subtract two images, then divide the result by scale (defaults to 1.0) and add an offset (defaults to 0.0). Similarly, the screen() function can be used to superimpose two inverted images on top of each other.

Image processing using scikit-image

As we did earlier with the PIL library, we can also use scikit-imagelibrary functions for image processing. The following sections show some examples.

Using the warp() function for inverse warping and geometric transformations

The scikit-image transform module's warp() function can be used for the inverse warping of a geometric transformation of an image (discussed in the previous section), as demonstrated in the following examples.

Apply an affine transformation to an image

We can use the SimilarityTransform() function to compute the transformation matrix, and then use the warp() function to apply the transformation, as shown in the next code block:

im = imread("../images/parrot.png")
tform = SimilarityTransform(scale=0.9, rotation=np.pi/4, translation=(im.shape[0]/2, -100))
warped = warp(im, tform)
import matplotlib.pyplot as plt
plt.imshow(warped), plt.axis('off'), plt.show()

The image below shows the output image produced by running the previous code snippet:

Apply swirl transform

This is a nonlinear transformation defined in the scikit-image documentation. The next code snippet shows how to implement the transformation with the swirl() function, where strength is a parameter representing the amount of swirl, radius indicates the extent of the swirl in pixels, and rotation adds a rotation angle. The transformation of radius into r is to ensure that the transformation decays to approximately 1/1000 within the specified radius:

im = imread("../images/parrot.png")swirled = swirl(im, rotation=0, strength=15, radius=200)plt.imshow(swirled)plt.axis('off')plt.show()

The next image shows the output image generated by the swirl transformation by running the previous code snippet:

Add random Gaussian noise to image

We can use the random_noise() function to add different types of noise to an image. The next code example shows how Gaussian noise with different variances can be added to an image:

im = img_as_float(imread("../images/parrot.png"))
plt.figure(figsize=(15,12))
sigmas = [0.1, 0.25, 0.5, 1]
for i in range(4): 
 noisy = random_noise(im, var=sigmas[i]**2)
 plt.subplot(2,2,i+1)
 plt.imshow(noisy)
 plt.axis('off')
 plt.title('Gaussian noise with sigma=' + str(sigmas[i]), size=20)
plt.tight_layout()
plt.show()

The next figure shows the output images produced by running the previous code snippet, which adds Gaussian noise with different variances. It can be seen that the larger the standard deviation of the Gaussian noise, the noisier the output image:

Calculate the cumulative distribution function of an image

We can use the cumulative_distribution() function to compute the cumulative distribution function (CDF) of a given image, as we shall see in the image enhancement chapter. For now, the reader is encouraged to use this function to compute the CDF.
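As a hint, a minimal sketch using the exposure module imported earlier (the parrot image path is the one used throughout this chapter) could look like this:

im = color.rgb2gray(imread("../images/parrot.png")) # work on a grayscale version
cdf, bins = exposure.cumulative_distribution(im) # returns the CDF values and the bin centers
plt.plot(bins, cdf), plt.xlabel('pixel value'), plt.ylabel('cumulative fraction'), plt.show()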

Image processing with Matplotlib

We can use the pylab module from the matplotlib library for image manipulation. An example is shown in the next section.

Draw contour lines for an image

A contour line of an image is a curve connecting all of the pixels that have the same particular value. The following code block shows how to draw the contour lines and filled contours for a grayscale image of Einstein:

im = rgb2gray(imread("../images/einstein.jpg")) # read the image from disk as a numpy ndarrayplt.figure(figsize=(20,8))plt.subplot(131), plt.imshow(im, cmap='gray'), plt.title('Original Image', size=20) plt.subplot(132), plt.contour(np.flipud(im), colors='k', levels=np.logspace(-15, 15, 100))plt.title('Image Contour Lines', size=20)plt.subplot(133), plt.title('Image Filled Contour', size=20), plt.contourf(np.flipud(im), cmap='inferno')plt.show()

The next figure shows this. . .

Image processing using scipy.misc and scipy.ndimage modules

We can also use the misc and ndimage modules from the scipy library for image manipulation; finding the relevant functions and becoming familiar with their usage is left as an exercise for the reader.

Summary

In this chapter, we first provided a basic introduction to image processing, along with basic concepts regarding the problems we try to solve in image processing. We then discussed the different tasks and steps involved in image processing, and the leading image processing libraries in Python, which we will use for coding in this book. Next, we discussed how to install the different libraries for image processing in Python, and how to import them and call their functions from the modules. We also covered basic concepts about image types, file formats, and data structures to store image data with the different Python libraries. Then, we discussed how to perform image I/O and display in Python using the different libraries. Finally, we discussed how to. . .

Questions

  1. Use the scikit-image library's functions to read a collection of images and display them as a montage.
  2. Use the scipy ndimage and misc modules' functions to zoom, crop, resize, and apply affine transformations to an image.
  3. Create a Python remake of the Gotham Instagram filter (https://github.com/lukexyz/CV-Instagram-Filters) (hint: manipulate an image with the PIL split(), merge(), and numpy interp() functions to create a channel interpolation (https://www.youtube.com/watch?v=otLGDpBglEA&feature=player_embedded)).
  4. Use scikit-image's warp() function to implement the swirl transform. Note that the swirl transform can also be expressed with the following equations:




  5. Implement the wave transform given below (hint: use scikit-image's warp()):


  6. Use PIL to load a palette-based RGB .png file and convert it into a grayscale image. This problem is taken from this post: https://stackoverflow.com/questions/51676447/python-use-pil-to-load-png-file-gives-strange-results/51678271#51678271 . Convert the following RGB image (from the VOC2012 dataset) into a grayscale image by indexing the palette:

  7. Plot a 3D plot for each of the color channels of the parrot image used in this chapter (hint: use the plot_surface() function from the mpl_toolkits.mplot3d module and NumPy's meshgrid() function).

  8. Use the ProjectiveTransform from scikit-image's transform module to estimate the homography matrix from a source to a destination image, and use the warp() function to embed the image in a blank canvas (as shown below):

| Input image | Output image |
| --- | --- |

First, try to solve the problem yourself. The solution can be found here for your reference: https://sandipanweb.wordpress.com/2018/07/30/some-image-processing-problems/ .

Further reading

2. Sampling, Fourier transform and convolution

In this chapter, we will discuss 2D signals in the time and frequency domains. We will first discuss spatial sampling, an important concept used in resizing an image, and the challenges of sampling. We will try to solve these problems using the functions in the Python libraries. We will also introduce intensity quantization in an image; intensity quantization means how many bits will be used to store a pixel in an image, and the impact this has on image quality. You will surely want to know about the Discrete Fourier Transform (DFT), which can be used to transform an image from the spatial (time) domain into the frequency domain. You will learn to implement the DFT with the Fast Fourier Transform (FFT) algorithm using numpy and scipy functions, and will be able to apply this implementation to an image!

You will also be interested in learning about 2D convolution and ways to increase its speed. We will also look at the basic concepts of the convolution theorem. We will try to clear up the age-old confusion between correlation and convolution with an example. Additionally, we will describe an example from SciPy that shows how to find the location of a specific pattern in an image using a template, by applying cross-correlation.

We'll also introduce some filtering techniques and see how to implement them using Python libraries. You will be interested to see the results we get when we use these filters to denoise an image.

The topics we will cover in this chapter are as follows:

  • Image formation – sampling and quantization
  • Discrete Fourier Transform
  • Understanding convolution

Image formation – sampling and quantization

In this section, we will describe two important concepts of image formation, namely sampling and quantization, and see how an image can be resized with sampling, and its colors quantized, using the PIL and scikit-image libraries. We will take a hands-on approach here, defining the concepts as we see them in action.

Let's start by importing all required packages:

# for inline image display inside notebook
%matplotlib inline
from PIL import Image
from skimage.io import imread, imshow, show
import scipy.fftpack as fp
from scipy import ndimage, misc, signal
from scipy.stats import signaltonoise
from skimage import data, img_as_float
from skimage.color import rgb2gray
from skimage.transform import ...

Sampling

Sampling refers to selecting/rejecting image pixels, which means it is a spatial operation. We can use sampling to increase or decrease the size of an image, using upsampling and downsampling respectively. In the next few sections we will discuss different sampling techniques with examples.

Upsampling

As discussed briefly in Chapter 1, Getting Started with Image Processing, in order to increase the size of an image, we need to upsample it. The challenge is that the new, larger image will have some pixels that have no corresponding pixel in the original, smaller image, and we need to guess those unknown pixel values. We can guess the value of an unknown pixel using the following approaches:

  • An aggregate, for example, the mean of the values of one or more of its nearest known pixel neighbors
  • An interpolation over the pixel neighborhood, as in bilinear or cubic interpolation

Nearest neighbor based upsampling may result in poor output image quality. Let's write code to verify this:

im = Image.open("../images/clock.jpg") # the original small ...

Upsampling and interpolation

In order to improve the image quality of the upsampled output, some interpolation methods can be used, such as bilinear interpolation or bicubic interpolation. Let's see how.

Bilinear interpolation

Let's consider a grayscale image, which is basically a 2D matrix of pixel values at integer grid locations. To interpolate the pixel value at any point P on the grid, the 2D analogue of linear interpolation, bilinear interpolation, can be used. In this case, for each possible point P (that we would like to interpolate), the intensity values of the four neighboring points (namely Q11, Q12, Q21, and Q22) are combined to compute the interpolated intensity at point P, as shown in the following figure:
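For unit grid spacing, with the cell corners Q11 = (x1, y1), Q21 = (x2, y1), Q12 = (x1, y2), and Q22 = (x2, y2), the combination takes the standard weighted-average form:

$$f(x,y) \approx f(Q_{11})(x_2-x)(y_2-y) + f(Q_{21})(x-x_1)(y_2-y) + f(Q_{12})(x_2-x)(y-y_1) + f(Q_{22})(x-x_1)(y-y_1)$$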

Let's use PIL's resize() function for bilinear interpolation:

im1 = im.resize((im.width*5, im.height*5), Image.BILINEAR) # up-sample with bi-linear interpolation
pylab.figure(figsize=(10,10)), pylab.imshow(im1), pylab.show()

This is the resized image. Note how the quality improves when bilinear interpolation is used for the upsampling:

Bicubic interpolation

It is an extension of cubic interpolation and is used to interpolate data points on a two-dimensional regular grid. The interpolated surface is smoother than the corresponding surface obtained by bilinear interpolation or nearest neighbor interpolation.

Bicubic interpolation can be accomplished using Lagrange polynomials, cubic splines, or the cubic convolution algorithm. PIL uses cubic spline interpolation over a 4 x 4 pixel neighborhood.

Let's use PIL's resize() function for bicubic interpolation:

im1 = im.resize((im.width*10, im.height*10), Image.BICUBIC)  # bi-cubic interpolation
pylab.figure(figsize=(10,10)), pylab.imshow(im1), pylab.show()

See how the quality of the resized image improves when we use bicubic interpolation:

Downsampling

In order to reduce the size of the image, we need to downsample the image. For every pixel in the new smaller image, there will be multiple pixels in the original larger image. We can calculate the value of a pixel in the new image by doing the following:

  • Drop some pixels from the larger image in a systematic way (for example, if we want the new image to be a quarter of the size of the original, drop every other row and column)
  • Calculate the new pixel value as the aggregate value of the corresponding multiple pixels in the original image

Let's take the tajmahal.jpg image and, again with the resize() function from the PIL library, resize it to an output image 25 times smaller (five times smaller along each dimension) than the input image:

im = Image.open("../images/tajmahal.jpg") ...

Downsampling and anti-aliasing

As we can see, plain downsampling is not great for shrinking an image because it creates an aliasing effect. For instance, if we try to resize the original image by reducing its width and height by a factor of 5 (downsampling), we get a patchy, poor-quality output.

Anti-aliasing

The problem here is that a single pixel in the output image corresponds to 25 pixels in the input image, yet we sample the value of just a single one of them. Instead, we should be averaging over a small area of the input image. This can be achieved with ANTIALIAS (a high-quality downsampling filter), as follows:

im = im.resize((im.width//5, im.height//5), Image.ANTIALIAS)
pylab.figure(figsize=(15,10)), pylab.imshow(im), pylab.show()

The image created with PIL and anti-aliasing is the same as the previous one, but with much better quality (almost free of artifacts/aliasing effects):

Anti-aliasing is usually done by smoothing the image before downsampling (by convolving the image with a low-pass filter, such as a Gaussian filter).
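A minimal sketch of that idea (assuming im is a grayscale numpy array, and using the ndimage module imported at the start of this chapter):

im_blurred = ndimage.gaussian_filter(im, sigma=2) # low-pass filter removes the high frequencies
im_small = im_blurred[::5, ::5] # then keep every 5th row and column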

Now let's use the rescale() function from scikit-image's transform module, with its anti-aliasing capability, to overcome the aliasing problem with another image, umbc.png:

im = imread('../images/umbc.png')
im1 = im.copy()
pylab.figure(figsize=(20,15))
for i in range(4):
    pylab.subplot(2,2,i+1), pylab.imshow(im1, cmap='gray'), pylab.axis('off')
    pylab.title('image size = ' + str(im1.shape[1]) + 'x' + str(im1.shape[0]))
    im1 = rescale(im1, scale = 0.5, multichannel=True, anti_aliasing=False)
pylab.subplots_adjust(wspace=0.1, hspace=0.1)
pylab.show()

The next screenshot shows the output of the previous code. As shown, the image is downsampled repeatedly to create smaller and smaller outputs. When anti-aliasing is not used, the aliasing effect becomes more and more prominent:

Let's change the line of code to use anti-aliasing:

im1 = rescale(im1, scale = 0.5, multichannel=True, anti_aliasing=True)

This produces better quality images:

To learn more about interpolation and anti-aliasing, please visit my blog: https://sandipanweb.wordpress.com/2018/01/21/recursive-graphics-bilinear-interpolation-and-image-transformation-in-Python/.

Quantization

Quantization relates to the intensity of an image and can be defined by the number of bits used to represent each pixel. Digital images are usually quantized to 256 gray levels. Here, we will see that as the number of bits for pixel storage decreases, the quantization error increases, leading to artificial boundaries or contours and pixelation, and resulting in poor image quality.

PIL quantization

Let's use the convert() function from the PIL Image module for color quantization, with the mode argument set to 'P' and the colors argument set to the maximum number of possible colors. We will also use the signaltonoise() function from the SciPy stats module to find the signal-to-noise ratio (SNR) of the parrot.jpg image, defined as the mean of the image array divided by its standard deviation:
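Note that signaltonoise() was deprecated and then removed in SciPy 1.0; on newer SciPy versions, a minimal drop-in replacement implementing the same mean-over-standard-deviation definition could be the following:

import numpy as np

def signaltonoise(a, axis=0, ddof=0):
    a = np.asanyarray(a)
    m = a.mean(axis)               # signal: the mean
    sd = a.std(axis=axis, ddof=ddof)  # noise: the standard deviation
    return np.where(sd == 0, 0, m / sd)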

im = Image.open('../images/parrot.jpg')
pylab.figure(figsize=(20,30))
num_colors_list = [1 << n for n in range(8,0,-1)]
snr_list = []
i = 1
for num_colors in num_colors_list:
    im1 = im.convert('P', palette=Image.ADAPTIVE, colors=num_colors)
    pylab.subplot(4,2,i), pylab.imshow(im1), pylab.axis('off')
    snr_list.append(signaltonoise(im1, axis=None))
    pylab.title('Image with # colors = ' + str(num_colors) + ' SNR = ' +
    str(np.round(snr_list[i-1],3)), size=20)
    i += 1
pylab.subplots_adjust(wspace=0.2, hspace=0)
pylab.show()

This shows how image quality degrades with color quantization when the number of bits to store pixels decreases:

The second frame is as follows:

Now we will plot the effect of color quantization on image signal-to-noise ratio. Signal-to-noise ratio is usually a measure of image quality. The higher the signal-to-noise ratio, the better the quality:

pylab.plot(num_colors_list, snr_list, 'r.-')
pylab.xlabel('# colors in the image')
pylab.ylabel('SNR')
pylab.title('Change in SNR w.r.t. # colors')
pylab.xscale('log', basex=2)
pylab.gca().invert_xaxis()
pylab.show()

It can be seen that although color quantization reduces the image size (because the number of bits/pixels is reduced), it also makes the image quality worse, as measured by SNR:

Discrete Fourier Transform

The Fourier transform method has a long mathematical history, and we are not going to discuss it here (it can be found in any digital signal processing or digital image processing theory book). As far as image processing is concerned, we shall focus only on the 2D Discrete Fourier Transform (DFT). The basic idea behind the Fourier transform is that an image can be thought of as a 2D function, f, that can be expressed as a weighted sum of sines and cosines (the Fourier basis functions) along the two dimensions.

We can use the DFT to go from a set of grayscale pixel values of an image (in the spatial/time domain) to a set of Fourier coefficients (in the frequency domain), and it is discrete since the spatial and the transform variables. . .
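For reference, the 2D DFT of an M x N image f(x, y) has the standard form:

$$F(u,v) = \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)}$$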

Why do we need DFT?

First, transforming to the frequency domain allows for a better understanding of the image. As we will see in the next few sections, low frequencies in the frequency domain correspond to the average overall level of information in the image, while high frequencies correspond to edges, noise, and more detailed information.

Typically, images are smooth in nature, which is why most images can be represented using a small number of DFT coefficients, while all remaining higher coefficients are almost negligible/zero.

This is very useful in image compression, especially for Fourier-sparse images, where only a few Fourier coefficients are needed to reconstruct the image. Hence, only those frequencies need to be stored, while the rest can be discarded, allowing for high compression (for example, the JPEG image compression algorithm uses a similar transform, the Discrete Cosine Transform (DCT)). Also, as we shall see later in this chapter, filtering in the frequency domain with the DFT can be much faster than filtering in the spatial domain.

Fast Fourier Transform algorithm for calculating DFT

The Fast Fourier Transform (FFT) is a divide-and-conquer algorithm that computes the DFT recursively, which is much faster (with O(N·log N) time complexity) than the much slower O(N²) naive computation along each dimension of an n x n image. In Python, both the numpy and scipy libraries provide functions to compute the 2D DFT/IDFT using the FFT algorithm. Let's see a few examples.

FFT with the scipy.fftpack module

We will use the fft2()/ifft2() functions from the scipy.fftpack module to compute the DFT/IDFT of the grayscale rhino.jpg image with the FFT algorithm:

im = np.array(Image.open('../images/rhino.jpg').convert('L')) # we shall work with grayscale image
snr = signaltonoise(im, axis=None)
print('SNR for the original image = ' + str(snr))
# SNR for the original image = 2.023722773801701
# now call FFT and IFFT
freq = fp.fft2(im)
im1 = fp.ifft2(freq).real
snr = signaltonoise(im1, axis=None)
print('SNR for the image obtained after reconstruction = ' + str(snr))
# SNR for the image obtained after reconstruction = 2.0237227738013224
assert(np.allclose(im, im1)) # make sure the forward and inverse FFT are close to each other
pylab.figure(figsize=(20,10))
pylab.subplot(121), pylab.imshow(im, cmap='gray'), pylab.axis('off')
pylab.title('Original Image', size=20)
pylab.subplot(122), pylab.imshow(im1, cmap='gray'), pylab.axis('off')
pylab.title('Image obtained after reconstruction', size=20)
pylab.show()

Here is the output:

It can be seen from the inline output SNR values, and from the visual difference between the input and the reconstructed image, that the reconstructed image loses some information. The difference is negligible if we use all of the coefficients obtained for the reconstruction.

Plot the frequency spectrum

Since the Fourier coefficients are complex numbers, we can observe their magnitudes directly. Displaying the magnitudes of the Fourier transform is called the spectrum of the transform. The DFT value F[0,0] is called the DC coefficient.

The DC coefficient is too large for the other coefficient values to be seen, which is why we need to stretch the transform values by displaying the logarithm of the transform. Also, for ease of display, the transform coefficients are shifted (with fftshift()) so that the DC component is at the center. The code to create the Fourier spectrum of the rhino image is as follows:

# the quadrants are needed to be shifted around in order that the low spatial frequencies are in the center of the 2D fourier-transformed ...
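A minimal sketch of the elided code (assuming freq holds the FFT of the rhino image computed above, and numpy imported as np):

freq_shifted = fp.fftshift(freq) # shift the DC component to the center
spectrum = 20*np.log10(0.1 + np.abs(freq_shifted)) # log-stretch the magnitudes for display
pylab.imshow(spectrum.astype(int), cmap=pylab.cm.gray)
pylab.title('Frequency Spectrum', size=20), pylab.show()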

FFT with the numpy.fft module

The DFT of an image can also be computed with the numpy.fft module's similar set of functions. We will see some examples.

Calculate the amplitude and phase of the DFT

We will use the house.png image as input to fft2() to get the real and imaginary parts of the Fourier coefficients; after that, we will compute the magnitude/spectrum and the phase, and finally use ifft2() to reconstruct the image:

import numpy.fft as fp
im1 = rgb2gray(imread('../images/house.png'))
pylab.figure(figsize=(12,10))
freq1 = fp.fft2(im1)
im1_ = fp.ifft2(freq1).real
pylab.subplot(2,2,1), pylab.imshow(im1, cmap='gray'), pylab.title('Original Image', size=20)
pylab.subplot(2,2,2), pylab.imshow(20*np.log10(0.01 + np.abs(fp.fftshift(freq1))), cmap='gray')
pylab.title('FFT Spectrum Magnitude', size=20)
pylab.subplot(2,2,3), pylab.imshow(np.angle(fp.fftshift(freq1)), cmap='gray')
pylab.title('FFT ...

Understanding convolution

Convolution is an operation on two images, one being the input image and the other one being a mask (also called a kernel) that acts as a filter on the input image, producing an output image.

Convolutional filtering is used to modify the spatial frequency characteristics of an image. Each pixel of the output image is computed as a weighted combination of the neighboring pixel values of the corresponding input pixel, with the weights given by the kernel. The output pixel values are computed by sliding the kernel window across the input image, as shown in the next screenshot (for convolution in valid mode; we will see the convolution modes later in this chapter):

As you can see, the kernel window (marked by the arrow in the input image) traverses the image and after convolution gets the values ​​mapped onto the output image.

Why convolve images?

Convolution applies a general filtering effect to the input image. This is done to achieve various effects on the image using appropriate kernels, such as smoothing, sharpening, and embossing, as well as in operations such as edge detection.

Convolution with SciPy signal's convolve2d

The SciPy signal module's convolve2d() function can be used for convolution. We will use it to apply convolution to an image with a kernel.

Apply convolution to grayscale image

Let's first use convolution with a Laplace kernel to detect the edges in the grayscale cameraman.jpg image, and blur the image with a box kernel:

im = rgb2gray(imread('../images/cameraman.jpg')).astype(float)
print(np.max(im))
# 1.0
print(im.shape)
# (225, 225)
blur_box_kernel = np.ones((3,3)) / 9
edge_laplace_kernel = np.array([[0,1,0],[1,-4,1],[0,1,0]])
im_blurred = signal.convolve2d(im, blur_box_kernel)
im_edges = np.clip(signal.convolve2d(im, edge_laplace_kernel), 0, 1)
fig, axes = pylab.subplots(ncols=3, sharex=True, sharey=True, figsize=(18, 6))
axes[0].imshow(im, cmap=pylab.cm.gray)
axes[0].set_title('Original Image', size=20)
axes[1].imshow(im_blurred, cmap=pylab.cm.gray)
axes[1].set_title('Box Blur', ...

Convolution modes, pad values, and boundary conditions

Depending on what you want to do with the edge pixels, there are three arguments, mode, boundary, and fillvalue, that can be passed to the SciPy convolve2d() function. Here, we will briefly discuss the mode argument (a quick shape illustration follows the list):

  • mode='full': the default mode; the output is the full discrete linear convolution of the inputs
  • mode='valid': ignores the edge pixels and computes the output only where the kernel fully overlaps the input (so that no zero padding is required). The output image is smaller than the input image for every kernel (except 1 x 1)
  • mode='same': the output image has the same size as the input image; it is centered with respect to the 'full' output
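Here is that sketch; a minimal, hedged example on a toy array (the 8 x 8 size and the 3 x 3 box kernel are assumptions):

import numpy as np
from scipy import signal

im = np.random.rand(8, 8)      # a toy 8x8 "image"
kernel = np.ones((3, 3)) / 9   # a 3x3 box (averaging) kernel

for mode in ['full', 'valid', 'same']:
    print(mode, signal.convolve2d(im, kernel, mode=mode).shape)
# full (10, 10), valid (6, 6), same (8, 8)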

Apply convolution to color (RGB) images

Using scipy.convolve2d(), we can also sharpen an RGB image. We have to apply the convolution to each image channel separately.

Let's apply an emboss kernel and a Scharr edge-detection (complex) kernel to the tajmahal.jpg image:

im = misc.imread('../images/tajmahal.jpg')/255 # scale each pixel value in [0,1]
print(np.max(im))
print(im.shape)
emboss_kernel = np.array([[-2,-1,0],[-1,1,1],[0,1,2]])
edge_schar_kernel = np.array([[ -3-3j, 0-10j, +3 -3j], [-10+0j, 0+ 0j, +10+0j], [ -3+3j, 0+10j, +3 +3j]])
im_embossed = np.ones(im.shape)
im_edges = np.ones(im.shape)
for i in range(3):
    im_embossed[...,i] = np.clip(signal.convolve2d(im[...,i], emboss_kernel, mode='same', boundary="symm"),0,1)
for i in range(3):
    ...

Convolution with SciPy ndimage.convolve()

Using scipy.ndimage.convolve(), we can sharpen an RGB image directly (we do not have to apply the convolution to each image channel separately).

Let's use the victoria_memorial.png image with a sharpen kernel and an emboss kernel:

im = misc.imread('../images/victoria_memorial.png').astype(np.float) # read as float
print(np.max(im))
sharpen_kernel = np.array([0, -1, 0, -1, 5, -1, 0, -1, 0]).reshape((3, 3, 1))
emboss_kernel = np.array(np.array([[-2,-1,0],[-1,1,1],[0,1,2]])).reshape((3, 3, 1))
im_sharp = ndimage.convolve(im, sharpen_kernel, mode='nearest')
im_sharp = np.clip(im_sharp, 0, 255).astype(np.uint8) # clip (0 to 255) and convert to unsigned int
im_emboss = ndimage.convolve(im, emboss_kernel, mode='nearest')
im_emboss = np.clip(im_emboss, 0, 255).astype(np.uint8)
pylab.figure(figsize=(10,15))
pylab.subplot(311), pylab.imshow(im.astype(np.uint8)), pylab.axis('off')
pylab.title('Original Image', size=25)
pylab.subplot(312), pylab.imshow(im_sharp), pylab.axis('off')
pylab.title('Sharpened Image', size=25)
pylab.subplot(313), pylab.imshow(im_emboss), pylab.axis('off')
pylab.title('Embossed Image', size=25)
pylab.tight_layout()
pylab.show()

You will get these convolved images:

The sharpened image looks like this:

The embossed image looks like this:

Correlation and convolution

Correlation is very similar to the convolution operation in that it also takes an input image and a kernel, traverses the kernel window over the input while computing a weighted combination of the pixel neighborhood values and the kernel values, and produces an output image.

The only difference is that, unlike correlation, convolution flips the kernel twice (with respect to the horizontal and the vertical axis) before computing the weighted combination.
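This relationship is easy to check numerically; the following sketch (with a deliberately asymmetric toy kernel) verifies that correlating with the doubly flipped kernel reproduces convolution:

import numpy as np
from scipy import signal

x = np.random.rand(5, 5)
k = np.array([[1., 2., 0.],
              [3., 4., 0.],
              [0., 0., 5.]])  # a deliberately asymmetric kernel
conv = signal.convolve2d(x, k, mode='same')
corr = signal.correlate2d(x, np.flip(k), mode='same')  # flip about both axes
print(np.allclose(conv, corr))  # True: correlation with the flipped kernel equals convolution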

The next diagram mathematically describes the difference between correlation and convolution on an image:

The SciPy signal module's correlate2d() ...

Template matching based on cross-correlation between the image and the template

In this example, we will use cross-correlation with an eye template image (using the template as the kernel for cross-correlation with the raccoon face image) to find the location of the eye in the face image, as shown in the following code:

face_image = misc.face(gray=True) - misc.face(gray=True).mean()
template_image = np.copy(face_image[300:365, 670:750]) # right eye
template_image -= template_image.mean()
face_image = face_image + np.random.randn(*face_image.shape) * 50 # add random noise
correlation = signal.correlate2d(face_image, template_image, boundary='symm', mode='same')
y, x = np.unravel_index(np.argmax(correlation), correlation.shape) # find the match
fig, (ax_original, ax_template, ax_correlation) = pylab.subplots(3, 1, figsize=(6, 15))
ax_original.imshow(face_image, cmap='gray')
ax_original.set_title('Original', size=20)
ax_original.set_axis_off()
ax_template.imshow(template_image, cmap='gray')
ax_template.set_title('Template', size=20)
ax_template.set_axis_off()
ax_correlation.imshow(correlation, cmap='afmhot')
ax_correlation.set_title('Cross-correlation', size=20)
ax_correlation.set_axis_off()
ax_original.plot(x, y, 'ro')
fig.show()

The location with the largest cross-correlation value (the best match for the template) is marked with a red dot:

Here is the template:

Applying cross-correlation results in the following output:

As can be seen from the previous image, one of the raccoon's eyes in the input image has the highest correlation with the eye template image.

Summary

We discussed some important concepts mainly related to the 2D DFT and its applications in image processing, such as frequency-domain filtering, with extensive examples using the numpy.fft, scipy.fftpack, scipy.signal, scipy.ndimage, and scikit-image modules.

Hopefully, you are now clear about sampling and quantization, two important image formation techniques. We have seen Python implementations of the 2D DFT and FFT algorithms, along with applications of the DFT in image processing such as image denoising and restoration, correlation and convolution, the use of convolution in filter design, and the use of correlation in template matching.

You should now be able to write Python code to perform. . .

Questions

The questions are as follows:

  1. Use a Gaussian LPF to implement downsampling with anti-aliasing (hint: first apply the Gaussian filter and then drop every other row and column, to reduce the house grayscale image four times. Compare the output images obtained with and without LPF preprocessing before downsampling)

  2. Upsample an image with the FFT: first double the size of the lena grayscale image by filling in zero rows/columns at every alternate position, then use the FFT, followed by an LPF, followed by the IFFT, to obtain the output image. Why does it work?

  3. Try applying the Fourier transform and image reconstruction to a color (RGB) image. (Hint: apply the FFT to each channel separately.)

  4. Show mathematically, and with a 2D kernel example, that the Fourier transform of a Gaussian kernel is another Gaussian kernel.

  5. Generate output images with correlation and convolution using the lena image and an asymmetric ripple kernel. Show that the output images are different. Now, flip the kernel twice (upside-down and left-right) and apply correlation with the flipped kernel. Is the output image the same as the image obtained by convolution with the original kernel?

Further reading

Here are references from various sources:

3. Convolution and frequency domain filtering

In this chapter, we continue our discussion of 2D convolution and see how to perform convolution faster in the frequency domain (using the basic concepts of the convolution theorem). We will understand the basic difference between correlation and convolution with an example on an image. We will also describe an example from SciPy that shows how to use cross-correlation with a template image to find the location of a specific pattern in an image. Finally, we will describe several filtering techniques in the frequency domain (which can be implemented with kernel convolutions, for example with box or Gaussian kernels), such as high-pass, low-pass, band-pass, and band-stop filters, and illustrate with examples how to implement them with Python libraries. We will give examples of how some of these filters can be used for image denoising (for example, band-stop (notch) filters to remove periodic noise from an image, or inverse and Wiener filters to deblur an image blurred with a Gaussian/motion-blur kernel).

The topics covered in this chapter are as follows:

  • Convolution theorem and frequency domain Gaussian blur
  • Frequency domain filtering (with the SciPy ndimage module and scikit-image)

Convolution theorem and frequency domain Gaussian blur

In this section, we will see more applications of convolving images using Python modules such as scipy.signal and scipy.ndimage. Let's start with the convolution theorem and see how the convolution operation becomes easier in the frequency domain.

Application of the convolution theorem

The convolution theorem states that convolution in the image domain is equivalent to simple multiplication in the frequency domain:

DFT(f ⊛ h) = DFT(f) . DFT(h), where f is the image, h is the kernel, and ⊛ denotes convolution

The following figure shows the application of Fourier transform:

The figure below shows the basic steps of frequency domain filtering. We take as input the original image f and the kernel (mask, or degradation/enhancement function). First, the two inputs need to be converted to the frequency domain with the DFT, and then the convolution is applied, which, according to the convolution theorem, is just an (element-wise) multiplication. This outputs the convolved image in the frequency domain, to which we apply the IDFT to obtain the reconstructed image (with some degradation or enhancement relative to the original image):
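Before moving on to images, here is a quick numeric sanity check of the theorem on a toy array (the array sizes are assumptions; zero-padding both inputs to the full output size makes the DFT's circular convolution coincide with linear convolution):

import numpy as np
import numpy.fft as fp
from scipy import signal

x = np.random.rand(8, 8)   # toy "image"
k = np.random.rand(3, 3)   # toy kernel
full_shape = (x.shape[0] + k.shape[0] - 1, x.shape[1] + k.shape[1] - 1)
out_spatial = signal.convolve2d(x, k, mode='full')      # convolution in the image domain
out_freq = np.real(fp.ifft2(fp.fft2(x, full_shape) *
                            fp.fft2(k, full_shape)))    # multiplication in the frequency domain
print(np.allclose(out_spatial, out_freq))  # expected: True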

Now let's see a demonstration of the theorem on some images and some Python library functions. We need to import all required libraries like in the previous chapter.

Frequency domain Gaussian blur filter with numpy.fft

The following code block shows how to use the convolution theorem and numpy's fft to apply a Gaussian filter in the frequency domain (since in the frequency domain it is just a multiplication):

pylab.figure(figsize=(20,15))
pylab.gray() # show the filtered result in grayscale
im = np.mean(imread('../images/lena.jpg'), axis=2)
gauss_kernel = np.outer(signal.gaussian(im.shape[0], 5), signal.gaussian(im.shape[1], 5))
freq = fp.fft2(im)
assert(freq.shape == gauss_kernel.shape)
freq_kernel = fp.fft2(fp.ifftshift(gauss_kernel))
convolved = freq*freq_kernel # by the convolution theorem, simply multiply in the frequency domain
im1 = fp.ifft2(convolved).real
pylab.subplot(2,3,1), pylab.imshow(im), pylab.title('Original ...

Gaussian kernel in frequency domain

In this section we will see what a Gaussian kernel looks like in the frequency domain in 2D and 3D plots.

Gaussian LPF kernel spectrum in two dimensions

The next code block shows how to plot the spectrum of a 2D Gaussian kernel using a log transformation:

im = rgb2gray(imread('../images/lena.jpg'))
gauss_kernel = np.outer(signal.gaussian(im.shape[0], 1), signal.gaussian(im.shape[1], 1))
freq = fp.fft2(im)
freq_kernel = fp.fft2(fp.ifftshift(gauss_kernel))
pylab.imshow( (20*np.log10( 0.01 + fp.fftshift(freq_kernel))).real.astype(int), cmap='coolwarm') # 0.01 is added to keep the argument to log function always positive
pylab.colorbar()
pylab.show()

The screenshot below shows the output of the previous code, with a color bar. Since the Gaussian kernel is a low-pass filter, its spectrum has higher values for the central (low) frequencies (it lets more of the low-frequency values through) and gradually decreases as we move away from the center toward the higher frequency values:

The next screenshot shows the spectrum of the Gaussian kernel in three dimensions, with the response along the vertical axis, without and with a logarithmic scale. It can be seen that the DFT of a Gaussian kernel is another Gaussian kernel. The Python code for the three-dimensional plotting is left as an exercise for the reader (Question 1, with hints).

Gaussian LPF kernel spectrum in 3D

The horizontal plane represents the frequency plane, and the vertical axis represents the Gaussian kernel's response in the frequency domain, without and with a logarithmic axis:


Frequency domain Gaussian blur filter with scipy signal

The following code block shows how to run a convolution in the frequency domain using the SciPy signal module's fftconvolve() function (internally it simply implements the multiplication from the convolution theorem):

im = np.mean(misc.imread('../images/mandrill.jpg'), axis=2)
print(im.shape)
# (224, 225)
gauss_kernel = np.outer(signal.gaussian(11, 3), signal.gaussian(11, 3)) # 2D Gaussian kernel of size 11x11 with σ = 3
im_blurred = signal.fftconvolve(im, gauss_kernel, mode='same')
fig, (ax_original, ax_kernel, ax_blurred) = pylab.subplots(1, 3, figsize=(20,8))
ax_original.imshow(im, cmap='gray')
ax_original.set_title('Original', size=20)
ax_original.set_axis_off()
ax_kernel.imshow(gauss_kernel) ...

Comparing runtimes of SciPy convolve() and fftconvolve() with Gaussian blur kernel

We can use the Python timeit module to compare the runtimes of the image-domain and frequency-domain convolution functions. Since frequency-domain convolution involves a single (element-wise) matrix multiplication rather than a series of sliding-window arithmetic computations, it is expected to be much faster. The following code compares the runtimes:

im = np.mean(misc.imread('../images/mandrill.jpg'), axis=2)
print(im.shape)
# (224, 225)
gauss_kernel = np.outer(signal.gaussian(11, 3), signal.gaussian(11, 3)) # 2D Gaussian kernel of size 11x11 with σ = 3
im_blurred1 = signal.convolve(im, gauss_kernel, mode="same")
im_blurred2 = signal.fftconvolve(im, gauss_kernel, mode='same')
def wrapper_convolve(func):
    def wrapped_convolve():
        return func(im, gauss_kernel, mode="same")
    return wrapped_convolve
wrapped_convolve = wrapper_convolve(signal.convolve)
wrapped_fftconvolve = wrapper_convolve(signal.fftconvolve)
times1 = timeit.repeat(wrapped_convolve, number=1, repeat=100)
times2 = timeit.repeat(wrapped_fftconvolve, number=1, repeat=100)

The following code block displays the original Mandrill image and the blurred image using these two functions:

pylab.figure(figsize=(15,5))
pylab.gray()
pylab.subplot(131), pylab.imshow(im), pylab.title('Original Image',size=15), pylab.axis('off')
pylab.subplot(132), pylab.imshow(im_blurred1), pylab.title('convolve Output', size=15), pylab.axis('off')
pylab.subplot(133), pylab.imshow(im_blurred2), pylab.title('fftconvolve Output', size=15),pylab.axis('off')

The screenshot below shows the output of the previous code. As expected, the convolve() and fftconvolve() functions both produce the same blurred output image:

The code below visualizes the difference in runtimes. Each function has been run 100 times on the same input image with the same Gaussian kernel, and then a boxplot of the times taken by each function is plotted:

data = [times1, times2]
pylab.figure(figsize=(8,6))
box = pylab.boxplot(data, patch_artist=True) #notch=True,
colors = ['cyan', 'pink']
for patch, color in zip(box['boxes'], colors):
    patch.set_facecolor(color)
pylab.xticks(np.arange(3), ('', 'convolve', 'fftconvolve'), size=15)
pylab.yticks(fontsize=15)
pylab.xlabel('scipy.signal convolution methods', size=15)
pylab.ylabel('time taken to run', size = 15)
pylab.show()

The screenshot below shows the output of the previous code. As can be seen, fftconvolve() runs faster on average:


Frequency domain filtering (HPF, LPF, BPF and notch filters)

Recall that in the image processing pipeline described in Chapter 1, Getting Started with Image Processing, the step right after image acquisition is image preprocessing. Images are often corrupted by random variations in intensity and illumination, or have poor contrast, and therefore can't be used directly and need to be enhanced. This is where filters are used.

What is a filter?

Filtering refers to transforming the pixel intensity values to reveal certain image characteristics, such as the following:

  • Enhancement: improves the contrast of the image
  • Smoothing: removes noise from the image
  • Template matching: detects known patterns in the image

The filtered image is described by a discrete convolution, and the filter is described by an n x n discrete convolution mask.

High Pass Filter (HPF)

This filter only allows through the high frequencies from the frequency-domain representation of the image (obtained with the DFT) and blocks all the low frequencies below the cutoff value. The image is reconstructed with the inverse DFT, and since the high-frequency components correspond to edges, details, noise, and so on, an HPF tends to extract or enhance them. The next few sections demonstrate how to use different functions from the numpy, scipy, and scikit-image libraries to implement an HPF, and the impact of an HPF on an image.

We can implement an HPF on an image by following these steps (a minimal sketch follows the list):

  1. Perform a 2D FFT with scipy.fftpack's fft2()
  2. Keep only the high-frequency components (remove the ...
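Here is that sketch; a minimal implementation of the steps above (the cameraman input image and the cutoff u = 10 are assumptions):

from scipy import fftpack
import numpy as np
from skimage.io import imread
from skimage.color import rgb2gray
import matplotlib.pylab as pylab

im = rgb2gray(imread('../images/cameraman.jpg'))       # assumed input image
freq = fftpack.fftshift(fftpack.fft2(im))
(w, h) = freq.shape
half_w, half_h = int(w/2), int(h/2)
u = 10  # assumed cutoff: zero a (2u+1)x(2u+1) block of low frequencies around the DC term
freq[half_w-u:half_w+u+1, half_h-u:half_h+u+1] = 0     # block the low frequencies
im_hpf = fftpack.ifft2(fftpack.ifftshift(freq)).real   # reconstruct with the inverse FFT
pylab.imshow(im_hpf, cmap='gray'), pylab.axis('off'), pylab.show()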

Signal-to-noise ratio as a function of frequency cutoff

The code block below shows how to plot the change in the signal-to-noise ratio (SNR) with the cutoff frequency (F) of the HPF:

pylab.plot(lbs, snrs_hp, 'b.-')
pylab.xlabel('Cutoff Frequency for HPF', size=20)
pylab.ylabel('SNR', size=20)
pylab.show()

The following screenshot shows how the SNR of the output image decreases as the HPF cutoff frequency increases:

Low pass filter (LPF)

This filter only allows through the low frequencies from the frequency-domain representation of the image (obtained with the DFT) and blocks all the high frequencies above the cutoff value. The image is reconstructed with the inverse DFT, and since the high-frequency components correspond to edges, details, noise, and so on, an LPF tends to remove these. The next few sections demonstrate how to use different functions from the numpy, scipy, and scikit-image libraries to implement an LPF, and the impact of an LPF on an image.

LPF with scipy ndimage and numpy fft

The numpy.fft module's fft2() function can also be used to run an FFT on an image. The scipy ndimage module provides a range of functions for applying an LPF to an image in the frequency domain. The next section demonstrates one of these filters (namely, fourier_gaussian()).

Fourier-Gaussian filter

This function from the scipy ndimage module implements a multidimensional Gaussian Fourier filter. The frequency array is multiplied with the Fourier transform of a Gaussian kernel of a given size.

The next code block demonstrates how to blur the lena grayscale image with an LPF (weighted-average filter):

import numpy.fft as fp
fig, (axes1, axes2) = pylab.subplots(1, 2, figsize=(20,10))
pylab.gray() # show the result in grayscale
im = np.mean(imread('../images/lena.jpg'), axis=2)
freq = fp.fft2(im)
freq_gaussian = ndimage.fourier_gaussian(freq, sigma=4)
im1 = fp.ifft2(freq_gaussian)
axes1.imshow(im), axes1.axis('off')
axes2.imshow(im1.real) # the imaginary part is an artifact
axes2.axis('off')
pylab.show()

The following is. . .

LPF with scipy fftpack

We can implement LPF on images by following these steps:

  1. Perform a 2D FFT with scipy.fftpack's fft2()
  2. Keep only low frequency components (remove high frequency components)
  3. Perform an inverse FFT to reconstruct the image

The code below shows the Python code that implements an LPF. As you can see from the next screenshot, the low-frequency components correspond more to the average (flat) image information, and as we remove more and more of the high-frequency components, the details of the image (for example, the edges) are lost.

For example, if we keep only the first few frequency components and discard all the others, in the resulting image obtained after the inverse FFT we can barely see the rhino, but as we keep more and more of the higher frequencies, the details become prominent in the final image:

from scipy import fftpack
im = np.array(Image.open('../images/rhino.jpg').convert('L'))
# low pass filter
freq = fp.fft2(im)
(w, h) = freq.shape
half_w, half_h = int(w/2), int(h/2)
freq1 = np.copy(freq)
freq2 = fftpack.fftshift(freq1)
freq2_low = np.copy(freq2)
freq2_low[half_w-10:half_w+11,half_h-10:half_h+11] = 0 # block the low frequencies
freq2 -= freq2_low # select only the first 20x20 (low) frequencies, block the high frequencies
im1 = fp.ifft2(fftpack.ifftshift(freq2)).real
print(signaltonoise(im1, axis=None))
# 2.389151856495427
pylab.imshow(im1, cmap='gray'), pylab.axis('off')
pylab.show()
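One caveat about the signaltonoise() function used above: it was deprecated and later removed from scipy.stats (it is gone as of SciPy 1.0), so on a recent SciPy you can define an equivalent helper yourself. A sketch reproducing the old mean-over-standard-deviation behavior:

def signaltonoise(a, axis=0, ddof=0):
    # ratio of the mean to the standard deviation, as in the removed scipy.stats.signaltonoise
    a = np.asanyarray(a)
    m = a.mean(axis)
    sd = a.std(axis=axis, ddof=ddof)
    return np.where(sd == 0, 0, m / sd)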

The following screenshot shows the output of the above code, i.e. the output image obtained by applying the LPF on the input rhino image, with the finer details missing:

The code block below shows how to plot the spectrum of an image in the logarithmic domain after blocking high frequencies; in other words, only allowing low frequencies:

pylab.figure(figsize=(10,10))
pylab.imshow( (20*np.log10( 0.1 + freq2)).astype(int))
pylab.show()

The following screenshot shows the output of the previous code, i.e. the spectrum obtained after applying LPF on the image:


The following code block shows the application of the LPF on the cameraman grayscale image, with different frequency cutoffs F:

im = np.array(Image.open('../images/cameraman.jpg').convert('L'))
freq = fp.fft2(im)
(w, h) = freq.shape
half_w, half_h = int(w/2), int(h/2)
snrs_lp = []
ubs = list(range(1,25))
pylab.figure(figsize=(12,20))
for u in ubs:
    freq1 = np.copy(freq)
    freq2 = fftpack.fftshift(freq1)
    freq2_low = np.copy(freq2)
    freq2_low[half_w-u:half_w+u+1,half_h-u:half_h+u+1] = 0
    freq2 -= freq2_low # select only the first 20x20 (low) frequencies
    im1 = fp.ifft2(fftpack.ifftshift(freq2)).real
    snrs_lp.append(signaltonoise(im1, axis=None))
    pylab.subplot(6,4,u), pylab.imshow(im1, cmap='gray'), pylab.axis('off')
    pylab.title('F = ' + str(u), size=20)
pylab.subplots_adjust(wspace=0.1, hspace=0)
pylab.show()

The following screenshot shows how the LPF retains more and more detail in the image as the cutoff frequency F increases:

Signal-to-noise ratio as a function of cutoff frequency

The following code block shows how to plot the signal-to-noise ratio (SNR) as a function of the cutoff frequency (F) of the LPF:

snr = signaltonoise(im, axis=None)
pylab.plot(ubs, snrs_lp, 'b.-')
pylab.plot(range(25), [snr]*25, 'r-')
pylab.xlabel('Cutoff Frequency for LPF', size=20)
pylab.ylabel('SNR', size=20)
pylab.show()

The screenshot below shows how the SNR of the output image decreases as the LPF cutoff frequency increases. The red horizontal line, drawn for comparison, represents the SNR of the original image:

Band-pass filter (BPF) with the DoG filter

The Difference of Gaussians (DoG) kernel can be used as a BPF, allowing through the frequencies within a certain band and discarding all the others. The following code block shows how to implement a BPF with the DoG kernel using fftconvolve():

from skimage import img_as_float
im = img_as_float(pylab.imread('../images/tigers.jpeg'))
pylab.figure(), pylab.imshow(im), pylab.axis('off'), pylab.show()
x = np.linspace(-10, 10, 15)
kernel_1d = np.exp(-0.005*x**2)
kernel_1d /= np.trapz(kernel_1d) # normalize the sum to 1
gauss_kernel1 = kernel_1d[:, np.newaxis] * kernel_1d[np.newaxis, :]
kernel_1d = np.exp(-5*x**2)
kernel_1d /= np.trapz(kernel_1d) # normalize the sum to 1
gauss_kernel2 = kernel_1d[:, np.newaxis] * kernel_1d[np.newaxis, :]
DoGKernel = gauss_kernel1[:, :, np.newaxis] - gauss_kernel2[:, :, np.newaxis]
im = signal.fftconvolve(im, DoGKernel, mode='same')
pylab.figure(), pylab.imshow(np.clip(im, 0, 1)), print(np.max(im)),
pylab.show()

The following screenshot shows the output of the previous code block, i.e. the output image obtained using BPF:

Band stop (notch) filter

This filter blocks/rejects a few selected frequencies from the frequency-domain representation of the image (obtained with the DFT), hence the name. As discussed in the next section, it is helpful for removing periodic noise from an image.

Use a notch filter to remove periodic noise from images

In this example, we will first add some periodic (sinusoidal) noise to the parrot image to create a noisy parrot image (such noise might be caused by an interfering electrical signal), and then observe the effect of the noise in the frequency domain of the image using the following code block:

from scipy import fftpack
pylab.figure(figsize=(15,10))
im = np.mean(imread("../images/parrot.png"), axis=2) / 255
print(im.shape)
pylab.subplot(2,2,1), pylab.imshow(im, cmap='gray'), pylab.axis('off')
pylab.title('Original Image')
F1 = fftpack.fft2((im).astype(float))
F2 = fftpack.fftshift( F1 )
pylab.subplot(2,2,2), pylab.imshow( (20*np.log10( 0.1 + F2)).astype(int), cmap=pylab.cm.gray)
pylab.xticks(np.arange(0, im.shape[1], 25))
pylab.yticks(np.arange(0, im.shape[0], 25))
pylab.title('Original Image Spectrum')
# add periodic noise to the image
for n in range(im.shape[1]):
    im[:, n] += np.cos(0.1*np.pi*n)
pylab.subplot(2,2,3), pylab.imshow(im, cmap='gray'), pylab.axis('off')
pylab.title('Image after adding Sinusoidal Noise')
F1 = fftpack.fft2((im).astype(float)) # noisy spectrum
F2 = fftpack.fftshift( F1 )
pylab.subplot(2,2,4), pylab.imshow( (20*np.log10( 0.1 + F2)).astype(int), cmap=pylab.cm.gray)
pylab.xticks(np.arange(0, im.shape[1], 25))
pylab.yticks(np.arange(0, im.shape[0], 25))
pylab.title('Noisy Image Spectrum')
pylab.tight_layout()
pylab.show()

The screenshot below shows the output of the previous code block. It can be seen that the periodic noise becomes prominent on the horizontal line near u = 175 in the spectrum:

Now, let's design a band-stop (notch) filter that eliminates the frequencies responsible for the noise, by setting the corresponding frequency components to zero in the next code block:

F2[170:176,:220] = F2[170:176,230:] = 0 # eliminate the frequencies most likely responsible for noise (keep some low frequency components)
im1 = fftpack.ifft2(fftpack.ifftshift( F2 )).real
pylab.axis('off'), pylab.imshow(im1, cmap='gray'), pylab.show()

The screenshot below shows the output of the previous code block, i.e. the image restored by applying the notch filter. As can be seen, the original image looks sharper than the restored one, since some of the true frequencies of the original image were also rejected by the band-stop filter along with the noise:

Image restoration

In image restoration, the degradation process is modeled; this allows the effects of the degradation to be (largely) removed. The challenge lies in the loss of information and the noise. The degraded image is commonly modeled as g(x,y) = h(x,y) ⊛ f(x,y) + n(x,y), where f is the original image, h the degradation (blur) kernel, and n additive noise. The following figure shows the basic image degradation model:

In the next few sections, we describe two restoration approaches (namely, the inverse filter and the Wiener filter).

FFT deconvolution and inverse filtering

Given a blurred image with a known (assumed) blur kernel, a typical image processing task is to recover (at least an approximation of) the original image. This particular task is called deconvolution. One of the simple filters that can be applied in the frequency domain to achieve this is the inverse filter, which we discuss in this section. Let's first blur the lena grayscale image with a Gaussian kernel using the following code:

im = 255*rgb2gray(imread('../images/lena.jpg'))
gauss_kernel = np.outer(signal.gaussian(im.shape[0], 3),
signal.gaussian(im.shape[1], 3))
freq = fp.fft2(im)
freq_kernel = fp.fft2(fp.ifftshift(gauss_kernel)) # this is our H
convolved = freq*freq_kernel # by convolution theorem
im_blur = fp.ifft2(convolved).real
im_blur = 255 * im_blur / np.max(im_blur) # normalize

Now we can use the inverse filter (with the same H) on the blurred image to restore the original image. The following code block demonstrates how to do it:

epsilon = 10**-6
freq = fp.fft2(im_blur)
freq_kernel = 1 / (epsilon + freq_kernel) # avoid division by zero
convolved = freq*freq_kernel
im_restored = fp.ifft2(convolved).real
im_restored = 255 * im_restored / np.max(im_restored)
print(np.max(im), np.max(im_restored))
pylab.figure(figsize=(10,10))
pylab.gray()
pylab.subplot(221), pylab.imshow(im), pylab.title('Original image'), pylab.axis('off')
pylab.subplot(222), pylab.imshow(im_blur), pylab.title('Blurred image'), pylab.axis('off')
pylab.subplot(223), pylab.imshow(im_restored), pylab.title('Restored image with inverse filter'), pylab.axis('off')
pylab.subplot(224), pylab.imshow(im_restored - im), pylab.title('Diff restored & original image'), pylab.axis('off')
pylab.show()

The screenshot below shows the output. It can be seen that although the inverse filter deblurs the blurred image, there is still some information loss:

The following screenshots show, in logarithmic scale, the spectra of the inverse kernel (an HPF), the original lena image, the Gaussian-LPF-blurred lena image, and the restored image, respectively. Writing the Python code is left to you as an exercise (Question 1):


If the input image is noisy, the inverse filter (an HPF) performs poorly, since the noise is also amplified in the output image (see Question 2 in the Questions section).

Similarly, we can use the inverse filter to deblur an image blurred with a known motion-blur kernel. The code remains the same; only the kernel changes, as shown in the following code. Note that we need to create a zero-padded kernel of size equal to that of the original image before we can apply the convolution in the frequency domain (using np.pad(); the details are left as an exercise, but one possible sketch is shown after the kernel definition):

kernel_size = 21 # a 21 x 21 motion blurred kernel
mblur_kernel = np.zeros((kernel_size, kernel_size))
mblur_kernel[int((kernel_size-1)/2), :] = np.ones(kernel_size)
mblur_kernel = mblur_kernel / kernel_size
# expand the kernel by padding zeros
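One possible way to do the padding is sketched below; the centering convention is an assumption, chosen to match the centered-kernel plus ifftshift() pattern of the earlier Gaussian example:

# zero-pad the motion-blur kernel to the size of the image (a sketch)
im_h, im_w = im.shape  # im is the grayscale image to be blurred/deblurred
pad_h, pad_w = im_h - kernel_size, im_w - kernel_size
mblur_kernel_padded = np.pad(mblur_kernel,
                             (((pad_h + 1) // 2, pad_h // 2),
                              ((pad_w + 1) // 2, pad_w // 2)),
                             mode='constant')
freq_kernel = fp.fft2(fp.ifftshift(mblur_kernel_padded))  # this is our H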

The following screenshot shows the spectrum of the previously defined motion blur kernel:


The following screenshot shows the output of the inverse filter with a motion blurred image:

Image deconvolution with Wiener filter

In the previous section, we saw how to use the inverse filter to obtain an (approximation of the) original image from a blurred image (with a known blur kernel). Another important task in image processing is removing noise from a corrupted signal. This is also known as image restoration. The following code block shows how to use the scikit-image restoration module's unsupervised Wiener filter for image denoising with deconvolution:

from skimage import color, data, restoration
im = color.rgb2gray(imread('../images/elephant_g.jpg'))
from scipy.signal import convolve2d as conv2
n = 7
psf = np.ones((n, n)) / n**2
im1 = conv2(im, psf, 'same')
im1 += 0.1 * im.std() * np.random.standard_normal(im.shape) # im (not astro) is the image here
...

Image denoising based on FFT

The next example is taken from http://www.scipy-lectures.org/intro/scipy/auto_examples/solutions/plot_fft_image_denoise.html. It demonstrates how to denoise an image by first blocking (zeroing out) the high-frequency Fourier coefficients with an LPF and the FFT. Let's first display the noisy grayscale image using the following code block:

im = pylab.imread('../images/moonlanding.png').astype(float)
pylab.figure(figsize=(10,10))
pylab.imshow(im, pylab.cm.gray), pylab.axis('off'), pylab.title('Original image'), pylab.show()

The following screenshot shows the output of the previous code block, which is the original noisy image:

The following code block displays the spectrum of a noisy image:

from scipy import fftpack
from matplotlib.colors import LogNorm
im_fft = fftpack.fft2(im)
def plot_spectrum(im_fft):
    pylab.figure(figsize=(10,10))
    pylab.imshow(np.abs(im_fft), norm=LogNorm(vmin=5), cmap=pylab.cm.afmhot), pylab.colorbar()
pylab.figure(), plot_spectrum(fftpack.fftshift(im_fft))
pylab.title('Spectrum with Fourier transform', size=20)

The following screenshot shows the output of the previous code, which is the Fourier spectrum of the original noise image:

Filters in FFT

The following code block shows how to reject a set of high frequencies and implement LPF to attenuate the noise in the image (corresponding to the high frequency components):

# Copy the original spectrum and truncate coefficients.
# Define the fraction of coefficients (in each direction) to keep as
keep_fraction = 0.1
im_fft2 = im_fft.copy()
# Set r and c to the number of rows and columns of the array.
r, c = im_fft2.shape
# Set all rows to zero with indices between r*keep_fraction and r*(1-keep_fraction)
im_fft2[int(r*keep_fraction):int(r*(1-keep_fraction))] = 0
# Similarly with the columns
im_fft2[:, int(c*keep_fraction):int(c*(1-keep_fraction))] = 0
pylab.figure(), plot_spectrum(fftpack.fftshift(im_fft2))
pylab.title('Filtered Spectrum') ...

Reconstruct the final image

The following code block shows how to use IFFT to reconstruct an image from filtered Fourier coefficients:

# Reconstruct the denoised image from the filtered spectrum, keep only the real part for display.
im_new = fp.ifft2(im_fft2).real
pylab.figure(figsize=(10,10)), pylab.imshow(im_new, pylab.cm.gray),
pylab.axis('off')
pylab.title('Reconstructed Image', size=20)

The screenshot below shows the output of the previous code, which is a cleaner output image obtained from the original noisy image through frequency domain filtering:

Summary

In this chapter, we discussed some important concepts mainly related to 2D convolution and its applications in image processing, such as spatial filtering. We also discussed several different frequency-domain filtering techniques, illustrated with multiple examples using the numpy.fft, scipy.fftpack, scipy.signal, scipy.ndimage, and scikit-image modules. We started with the convolution theorem and its application in frequency-domain filtering, covered various frequency-domain filters such as the LPF, HPF, and notch filters, and finally introduced deconvolution and its application in designing image restoration filters (such as the inverse filter and the Wiener filter).

After completing this chapter, the reader should be able to write Python code. . .

Questions

  1. Use the mpl_toolkits.mplot3d module to plot, in 3D, the spectra of the image, the Gaussian kernel, and the image obtained after convolution in the frequency domain (the output should resemble the surfaces shown in the corresponding sections). (Hint: the np.meshgrid() function will come in handy for surface plots.) Repeat this exercise for the inverse filter as well.
  2. Add some random noise to the lena image, blur the image with a Gaussian kernel, and then try to restore the image using an inverse filter, as shown in the corresponding example. What happens, and why?
  3. Apply a Gaussian blur to a color image in the frequency domain using SciPy signal's fftconvolve() function.
  4. Apply LPFs with cuboidal and ellipsoidal kernels to an image in the frequency domain, using the fourier_uniform() and fourier_ellipsoid() functions of SciPy's ndimage module, respectively.

Further reading

4. Image enhancement

In this chapter, we will discuss some of the most basic tools in image processing, such as mean/median filtering and histogram equalization, which remain among the most powerful. The purpose of image enhancement is to improve the quality of an image or to make particular features appear more prominent. These techniques are more general-purpose, and a strong model of the degradation process is not assumed (unlike in image restoration). Some examples of image enhancement techniques are contrast stretching, smoothing, and sharpening. We will describe the basic concepts and implementations of these techniques using the PIL, scikit-image, and scipy ndimage libraries. We will become acquainted with simple and still-popular methods.

We'll start with point-wise intensity transformations, then discuss contrast stretching, thresholding, halftoning, and dithering algorithms, and the corresponding Python library functions. We will then discuss different histogram processing techniques, such as histogram equalization (both its global and adaptive versions) and histogram matching. Then, several image denoising techniques will be described: first, some linear smoothing techniques, such as the averaging filter and the Gaussian filter, followed by relatively recent nonlinear noise-smoothing techniques, such as median filtering, bilateral filtering, and non-local means filtering, along with how to implement them in Python. Finally, different image operations with mathematical morphology, their applications, and their implementations will be described.

The topics covered in this chapter are as follows:

  • Pointwise intensity transformation – pixel transformation
  • Histogram processing, histogram equalization, histogram matching
  • Linear noise smoothing (averaging filter)
  • Nonlinear noise smoothing (median filter)

Pointwise intensity transformation – pixel transformation

As described in Chapter 1, Getting Started with Image Processing, the point-wise intensity transformation operation applies a transfer function T to each pixel f(x,y) of the input image to generate the corresponding pixel in the output image. The transformation can be expressed as g(x,y) = T(f(x,y)), or equivalently as s = T(r), where r is the gray level of a pixel in the input image and s is the transformed gray level of the same pixel in the output image. It is a memoryless operation: the output intensity at a location (x, y) depends only on the input intensity at the same point, and pixels with the same intensity get the same transformation. This brings no new information...

Logarithmic transformation

The logarithmic transformation is very useful when we need to compress or stretch a certain range of gray levels in an image; for example, to display the Fourier spectrum (where the DC component has a much higher value than the other components, so that without a logarithmic transformation the other frequency components can almost never be seen). The point transformation function for the logarithmic transformation has the general form s = T(r) = c.log(1 + r), where c is a constant.

Let's first define a couple of helper functions and plot the histogram of the color channels of the input image:

def plot_image(image, title=''):
    pylab.title(title, size=20), pylab.imshow(image)
    pylab.axis('off') # comment this line if you want axis ticks

def plot_hist(r, g, b, title=''):
    r, g, b = img_as_ubyte(r), img_as_ubyte(g), img_as_ubyte(b)
    pylab.hist(np.array(r).ravel(), bins=256, range=(0, 256), color='r', alpha=0.5)
    pylab.hist(np.array(g).ravel(), bins=256, range=(0, 256), color='g', alpha=0.5)
    pylab.hist(np.array(b).ravel(), bins=256, range=(0, 256), color='b', alpha=0.5)
    pylab.xlabel('pixel value', size=20), pylab.ylabel('frequency', size=20)
    pylab.title(title, size=20)

im = Image.open("../images/parrot.png")
im_r, im_g, im_b = im.split()
pylab.style.use('ggplot')
pylab.figure(figsize=(15,5))
pylab.subplot(121), plot_image(im, 'original image')
pylab.subplot(122), plot_hist(im_r, im_g, im_b,'histogram for RGB channels')
pylab.show()

The following screenshot shows the output of the original image color channel histogram before applying the logarithmic transformation:

Now, let's use the PIL Image module's point() function to apply a logarithmic transformation and see its effect on the histograms of the different color channels of the RGB image:

im = im.point(lambda i: 255*np.log(1+i/255))
im_r, im_g, im_b = im.split()
pylab.style.use('ggplot')
pylab.figure(figsize=(15,5))
pylab.subplot(121), plot_image(im, 'image after log transform')
pylab.subplot(122), plot_hist(im_r, im_g, im_b, 'histogram of RGB channels log transform')
pylab.show()

The output shows how the histogram is compressed for different color channels:

Power-law transformation

As we saw in Chapter 1, Getting Started with Image Processing, with the PIL point() function on a grayscale image, the power-law transform has a transfer function of the general form s = T(r) = c.r^γ, where c is a constant. This time, let's apply the power-law transformation to an RGB color image with scikit-image, and then visualize the effect of the transformation on the color channel histograms:

im = img_as_float(imread('../images/earthfromsky.jpg'))
gamma = 5
im1 = im**gamma
pylab.style.use('ggplot')
pylab.figure(figsize=(15,5))
pylab.subplot(121), plot_hist(im[...,0], im[...,1], im[...,2], 'histogram for RGB channels (input)')
pylab.subplot(122), plot_hist(im1[...,0], im1[...,1], im1[...,2], 'histogram for RGB channels ...

Contrast stretching

The contrast stretching operation takes a low-contrast image as input and stretches the narrow range of intensity values to span a desired, wider range of values, in order to output a high-contrast image, thereby enhancing the image's contrast. It is just a linear scaling function applied to the pixel values of the image, so the image enhancement is less drastic (compared to its more sophisticated counterpart, histogram equalization, described later). The following screenshot shows the point transform function for contrast stretching:

As you can see from the previous screenshot, before the stretching can be performed, the upper and lower pixel value limits (over which the image is to be normalized) need to be specified (for example, for grayscale images, the limits are usually set to 0 and 255, so that the output image spans the entire available range of pixel values). We need to find a suitable value m from the CDF of the original image. The contrast-stretching transform produces an image with higher contrast than the original by darkening the levels below the value m (stretching the values toward the lower limit) and brightening the levels above m (stretching the values toward the upper limit). The following sections describe how to implement contrast stretching using the PIL library.

Using PIL as a point operation

Let's first load a color RGB image and split it across color channels to visualize the histogram of pixel values ​​for different color channels:

im = Image.open('../images/cheetah.png')
im_r, im_g, im_b, _ = im.split()
pylab.style.use('ggplot')
pylab.figure(figsize=(15,5))
pylab.subplot(121)
plot_image(im)
pylab.subplot(122)
plot_hist(im_r, im_g, im_b)
pylab.show()

The screenshot below shows the output of the previous code block. As can be seen, the input cheetah image is a low-contrast image, since the color channel histograms are concentrated in a particular range of values (right-skewed) rather than spread over all the possible pixel values:

The contrast-stretching operation stretches the overly concentrated gray levels. . .
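A minimal sketch of the stretching itself as a PIL point operation follows (it reuses the im_r, im_g, im_b channels split above; the limits 100 and 181 are assumptions that would normally be read off the image's CDF):

# contrast stretching as a point operation (a sketch with assumed limits)
def contrast_stretch(c, low=100, high=181):
    if c < low:
        return 0
    if c > high:
        return 255
    return int((c - low) * 255 / (high - low))

im_stretched = Image.merge('RGB', tuple(ch.point(contrast_stretch) for ch in (im_r, im_g, im_b)))
pylab.figure(figsize=(15,5))
pylab.subplot(121), plot_image(im_stretched)
pylab.subplot(122), plot_hist(*im_stretched.split())
pylab.show()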

Using the PIL ImageEnhance module

The ImageEnhance module can also be used for contrast stretching. The following code block shows how to enhance the contrast of the same input image using the enhance() method of the Contrast object:

contrast = ImageEnhance.Contrast(im)
im1 = np.reshape(np.array(contrast.enhance(2).getdata()).astype(np.uint8), (im.height, im.width, 4)) 
pylab.style.use('ggplot')
pylab.figure(figsize=(15,5))
pylab.subplot(121), plot_image(im1)
pylab.subplot(122), plot_hist(im1[...,0], im1[...,1], im1[...,2]), pylab.yscale('log',basey=10)
pylab.show()

The output of the code is shown below. It can be seen that the contrast of the input image is enhanced and the color channel histogram is stretched towards the endpoints:

Thresholding

This is a point operation that creates a binary image from a grayscale image by setting to zero all the pixels below a certain threshold and to one all the pixels above it, as shown in the following screenshot:

If g(x,y) is a thresholded version of f(x,y) under some global threshold T, the following can be applied: g(x,y) = 1 if f(x,y) >= T, and g(x,y) = 0 otherwise.

Why do we need a binary image? There are several reasons, for example we might be interested in splitting the image into foreground and background; the image will be printed using a black and white printer (and all...

With a fixed threshold

The code block below shows how to use the PIL point() function for thresholding with a fixed threshold:

im = Image.open('../images/swans.jpg').convert('L')
pylab.hist(np.array(im).ravel(), bins=256, range=(0, 256), color='g')
pylab.xlabel('Pixel values'), pylab.ylabel('Frequency'),
pylab.title('Histogram of pixel values')
pylab.show()
pylab.figure(figsize=(12,18))
pylab.gray()
pylab.subplot(221), plot_image(im, 'original image'), pylab.axis('off')
th = [0, 50, 100, 150, 200]
for i in range(2, 5):
    im1 = im.point(lambda x: x > th[i])
    pylab.subplot(2,2,i), plot_image(im1, 'binary image with threshold=' + str(th[i]))
pylab.show()

The screenshot below shows the output of the previous code. First, we can see the distribution of pixel values in the input image:

Furthermore, as can be seen below, the binary images obtained with different gray-level thresholds suffer from an artifact known as false contours:


We will discuss several different thresholding algorithms in detail in Chapter 6, Morphological Image Processing, when discussing image segmentation.

Halftoning

One way to reduce the false-contour artifacts of thresholding (binary quantization) is to add uniformly distributed white noise to the input image before quantization. Specifically, to each input pixel f(x,y) of the grayscale image, we add an independent uniform [-128, 128] random number and then threshold it. This technique is called halftoning. The following code block shows an implementation:

im = Image.open('../images/swans.jpg').convert('L')
im = Image.fromarray(np.clip(im + np.random.randint(-128, 128, (im.height, im.width)), 0, 255).astype(np.uint8))
pylab.figure(figsize=(12,18))
pylab.subplot(221), plot_image(im, 'original image (with noise)')
th = [0, 50, 100, 150, 200]
for i in range(2, 5):
    im1 = im.point(lambda ...

Floyd-Steinberg dithering with error diffusion

Likewise, in order to prevent large-scale patterns (such as false contours), an intentionally applied form of noise is used to randomize the quantization error. This process is called dithering. The Floyd-Steinberg algorithm implements dithering with an error-diffusion technique; in other words, it pushes (adds) the residual quantization error of a pixel onto the neighboring pixels, to be dealt with later. It spreads the quantization error out over the neighboring pixels according to the distribution shown in the following screenshot:

In the previous screenshot, the current pixel is represented with a star (*), and the blank pixels represent previously scanned pixels. The algorithm scans the image from left to right and top to bottom. It quantizes the pixel values one by one, and each time the quantization error is distributed among the neighboring pixels (that have not been scanned yet), without affecting the pixels that have already been quantized. Hence, if a number of pixels have been rounded down, it becomes more likely that the following pixel will be rounded up by the algorithm, so that the average quantization error stays close to zero.

The following screenshot shows the algorithm pseudocode:

The following screenshot shows the output binary image obtained using the Python implementation of the preceding pseudocode; it provides a significant improvement in the quality of the obtained binary image compared to the previous halftoning method:

The code is left as an exercise.
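If you want to check your implementation against something, here is a minimal (unoptimized) sketch, assuming the standard Floyd-Steinberg weights of 7/16, 3/16, 5/16, and 1/16 and a fixed threshold of 128:

import numpy as np
from PIL import Image
import matplotlib.pylab as pylab

def floyd_steinberg_dither(im, threshold=128):
    # binary-quantize a grayscale image with error diffusion (a sketch)
    im = im.astype(float).copy()
    h, w = im.shape
    for y in range(h):
        for x in range(w):
            old = im[y, x]
            new = 255.0 if old >= threshold else 0.0
            im[y, x] = new
            err = old - new
            # push the residual error onto the not-yet-scanned neighbors
            if x + 1 < w:
                im[y, x+1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    im[y+1, x-1] += err * 3 / 16
                im[y+1, x] += err * 5 / 16
                if x + 1 < w:
                    im[y+1, x+1] += err * 1 / 16
    return im.astype(np.uint8)

im = np.array(Image.open('../images/swans.jpg').convert('L'))
pylab.imshow(floyd_steinberg_dither(im), cmap='gray'), pylab.axis('off'), pylab.show()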

Histogram processing – histogram equalization and matching

Histogram processing techniques provide better methods for changing the dynamic range of the pixel values in an image, so that its intensity histogram has a desired shape. As we have seen, image enhancement via the contrast-stretching operation is limited in the sense that it can only apply a linear scaling function.

Histogram processing techniques can be more powerful by using a nonlinear (and non-monotonic) transfer function to map the input pixel intensities to the output pixel intensities. In this section, we will demonstrate the implementation of two such techniques, namely histogram equalization and histogram matching, using the scikit-image library's exposure module. . .

Contrast stretching and histogram equalization with scikit-image

Histogram equalization uses a monotonic and nonlinear mapping that redistributes pixel intensity values ​​in the input image so that the output image has a uniform intensity distribution (flat histogram), thereby enhancing the contrast of the image. The screenshot below depicts the transformation function for histogram equalization:

The following code block shows how to perform histogram equalization with scikit-image using the exposure module's equalize_hist() function. There are two different flavors of the histogram equalization implementation: one is a global operation over the whole image, while the other is local (adaptive), done by dividing the image into blocks and running histogram equalization on each of them:

img = rgb2gray(imread('../images/earthfromsky.jpg'))
# histogram equalization
img_eq = exposure.equalize_hist(img)
# adaptive histogram equalization
img_adapteq = exposure.equalize_adapthist(img, clip_limit=0.03)
pylab.gray()
images = [img, img_eq, img_adapteq]
titles = ['original input (earth from sky)', 'after histogram equalization', 'after adaptive histogram equalization']
for i in range(3):
    pylab.figure(figsize=(20,10)), plot_image(images[i], titles[i])
pylab.figure(figsize=(15,5))
for i in range(3):
    pylab.subplot(1,3,i+1), pylab.hist(images[i].ravel(), color='g'), pylab.title(titles[i], size=15)
pylab.show()

The following screenshot shows the output of the previous code block. It can be seen that, after histogram equalization, the histogram of the output image becomes almost uniform (with the pixel values on the x axis and the corresponding frequencies on the y axis), although adaptive histogram equalization reveals the details of the image more clearly than global histogram equalization:


The following screenshot shows how the pixel distribution changes with global (into a nearly uniform) versus adaptive (into a stretched and piecewise-uniform) histogram equalization:

The following code block compares the image enhancements obtained with two different histogram processing techniques (namely, contrast stretching and histogram equalization) using scikit-image:

import matplotlib
matplotlib.rcParams['font.size'] = 8
def plot_image_and_hist(image, axes, bins=256):
    image = img_as_float(image)
    axes_image, axes_hist = axes
    axes_cdf = axes_hist.twinx()
    axes_image.imshow(image, cmap=pylab.cm.gray)
    axes_image.set_axis_off()
    axes_hist.hist(image.ravel(), bins=bins, histtype='step', color='black')
    axes_hist.set_xlim(0, 1)
    axes_hist.set_xlabel('Pixel intensity', size=15)
    axes_hist.ticklabel_format(axis='y', style='scientific', scilimits=(0, 0))
    axes_hist.set_yticks([])
    image_cdf, bins = exposure.cumulative_distribution(image, bins)
    axes_cdf.plot(bins, image_cdf, 'r')
    axes_cdf.set_yticks([])
    return axes_image, axes_hist, axes_cdf

im = io.imread('../images/beans_g.png')
# contrast stretching
im_rescale = exposure.rescale_intensity(im, in_range=(0, 100), out_range=(0, 255))
im_eq = exposure.equalize_hist(im) # histogram equalization
im_adapteq = exposure.equalize_adapthist(im, clip_limit=0.03) # adaptive histogram equalization

fig = pylab.figure(figsize=(15, 7))
axes = np.zeros((2, 4), dtype=object) # np.object was removed in newer NumPy versions
axes[0, 0] = fig.add_subplot(2, 4, 1)
for i in range(1, 4):
    axes[0, i] = fig.add_subplot(2, 4, 1+i, sharex=axes[0,0], sharey=axes[0,0])
for i in range(0, 4):
    axes[1, i] = fig.add_subplot(2, 4, 5+i)
axes_image, axes_hist, axes_cdf = plot_image_and_hist(im, axes[:, 0])
axes_image.set_title('Low contrast image', size=20)
y_min, y_max = axes_hist.get_ylim()
axes_hist.set_ylabel('Number of pixels', size=20)
axes_hist.set_yticks(np.linspace(0, y_max, 5))
axes_image, axes_hist, axes_cdf = plot_image_and_hist(im_rescale, axes[:,1])
axes_image.set_title('Contrast stretching', size=20)
axes_image, axes_hist, axes_cdf = plot_image_and_hist(im_eq, axes[:, 2])
axes_image.set_title('Histogram equalization', size=20)
axes_image, axes_hist, axes_cdf = plot_image_and_hist(im_adapteq, axes[:,3])
axes_image.set_title('Adaptive equalization', size=20)
axes_cdf.set_ylabel('Fraction of total intensity', size=20)
axes_cdf.set_yticks(np.linspace(0, 1, 5))
fig.tight_layout()
pylab.show()

The following screenshot shows the output of the preceding code. It can be seen that adaptive histogram equalization provides better results than (global) histogram equalization in terms of making the details of the output image clearer:

With the low-contrast color cheetah input image, the preceding code produces the following output:

Histogram matching

Histogram matching is a process where the histogram of an image is matched to the histogram of another reference (template) image. The algorithm is as follows:

  1. Compute the cumulative histogram for each image, as shown in the following screenshot.
  2. For any given pixel value x_i (to be adjusted) in the input image, find the corresponding pixel value x_j in the output image by matching the histogram of the input image with the histogram of the template image.
  3. The pixel value x_i has a cumulative histogram value given by G(x_i). Find a pixel value x_j such that the cumulative distribution value in the reference image, H(x_j), is equal to G(x_i). Then replace the input pixel value x_i with x_j.

Histogram matching of RGB images

For each color channel, the matching can be done independently to obtain an output image like the following:

The Python code to implement this is left as an exercise for the reader (Question 1 in the Questions section).
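If you want to sanity-check your own implementation, recent versions of scikit-image also ship a ready-made function (a sketch, assuming a scikit-image version that provides exposure.match_histograms(); the input and reference images are arbitrary choices):

from skimage import exposure
from skimage.io import imread

im = imread('../images/lena.jpg')       # assumed input image
ref = imread('../images/parrot.png')    # assumed reference (template) image
# match each RGB channel of im to the corresponding channel of ref
# (older scikit-image versions use multichannel=True; newer ones use channel_axis=-1)
matched = exposure.match_histograms(im, ref, multichannel=True)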

Linear noise smoothing

Linear (spatial) filtering is a function of a weighted sum of pixel values (in a neighborhood). It is a linear operation on the image and can be used for blurring/noise reduction. Blurring is used as a preprocessing step; for example, to remove small (irrelevant) details. A few commonly used linear filters are the box filter and the Gaussian filter. The filter is implemented with a small (for example, 3 x 3) kernel (mask): the pixel values are recomputed by sliding the mask over the input image and applying the filter function to every possible pixel in the input image (the center pixel value of the input image corresponding to the mask is replaced by the weighted sum of the pixel values under the mask, with the weights taken from the mask). . .
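As a tiny illustration of the weighted-sum idea at a single pixel (the values are made up):

import numpy as np

patch = np.array([[10., 20., 30.],
                  [40., 50., 60.],
                  [70., 80., 90.]])  # a 3x3 neighborhood around a pixel
mask = np.ones((3, 3)) / 9.0         # box (averaging) mask
new_center = np.sum(patch * mask)    # the weighted sum replaces the center pixel
print(new_center)                    # 50.0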

Smoothing with PIL

The following sections illustrate how to use the PIL ImageFilter module's capabilities for linear noise smoothing; in other words, noise smoothing with linear filters.

Smoothing using ImageFilter.BLUR

The following shows how to apply the PIL ImageFilter module's blur filter to remove the noise from a noisy image. The noise level of the input image is varied to see its effect on the blur filter. This example uses the popular mandrill (baboon) image; the image is protected by a Creative Commons license (https://creativecommons.org/licenses/by-sa/2.0/) and can be found at https://www.flickr.com/photos/uhuru1701/2249220078 and in the SIPI image database: http://sipi.usc.edu/database/database.php?volume=misc&image=10#Top:

i = 1
pylab.figure(figsize=(10,25))
for prop_noise in np.linspace(0.05,0.3,3):
    im = Image.open('../images/mandrill.jpg')
    # choose 5000 random locations inside ...

Average smoothing using box blur kernel

The following code block shows how to smooth a noisy image using the PIL ImageFilter.Kernel() function with box blur kernels (averaging filters) of size 3 x 3 and 5 x 5:

im = Image.open('../images/mandrill_spnoise_0.1.jpg')
pylab.figure(figsize=(20,7))
pylab.subplot(1,3,1), pylab.imshow(im), pylab.title('Original Image', size=30), pylab.axis('off')
for n in [3,5]:
    box_blur_kernel = np.reshape(np.ones(n*n),(n,n)) / (n*n)
    im1 = im.filter(ImageFilter.Kernel((n,n), box_blur_kernel.flatten()))
    pylab.subplot(1,3,(2 if n==3 else 3))
    plot_image(im1, 'Blurred with kernel size = ' + str(n) + 'x' + str(n))
pylab.suptitle('PIL Mean Filter (Box Blur) with different Kernel size',
size=30)
pylab.show()

The screenshot below shows the output of the previous code. As can be seen, the larger the box-blur kernel that the noisy image is convolved with, the smoother the output image obtained:

Smoothing with Gaussian blur filter

The Gaussian blur filter is also a linear filter but, unlike the simple averaging filter, it uses a weighted average of the pixels inside the kernel window to smooth a pixel (the weights of the neighboring pixels decrease exponentially with their distance from the pixel being smoothed). The following code shows how to use PIL's ImageFilter.GaussianBlur() to smooth a noisy image with different values of the kernel's radius parameter:

im = Image.open('../images/mandrill_spnoise_0.2.jpg')
pylab.figure(figsize=(20,6))
i = 1
for radius in range(1, 4):
    im1 = im.filter(ImageFilter.GaussianBlur(radius))
    pylab.subplot(1,3,i), plot_image(im1, 'radius = ' + str(round(radius,2)))
    i += 1
pylab.suptitle('PIL ...

Comparing smoothing with box and Gaussian kernels using SciPy ndimage

We can also use SciPy's ndimage module functions to apply linear filters to smooth images. The following code snippet demonstrates the results of applying linear filters to the mandrill image degraded by impulse (salt-and-pepper) noise:

from scipy import misc, ndimage
import matplotlib.pylab as pylab
im = misc.imread('../images/mandrill_spnoise_0.1.jpg')
k = 7 # 7x7 kernel
im_box = ndimage.uniform_filter(im, size=(k,k,1))
s = 2 # sigma value
t = (((k - 1)/2)-0.5)/s # truncate parameter value for a kxk gaussian kernel with sigma s
im_gaussian = ndimage.gaussian_filter(im, sigma=(s,s,0), truncate=t)
fig = pylab.figure(figsize=(30,10))
pylab.subplot(131), plot_image(im, 'original image')
pylab.subplot(132), plot_image(im_box, 'with the box filter')
pylab.subplot(133), plot_image(im_gaussian, 'with the gaussian filter')
pylab.show()

The screenshot below shows the output of the previous code. It can be seen that the box filter blurs the output image more than the Gaussian filter of the same kernel size with σ=2:


Nonlinear noise smoothing

Nonlinear (spatial) filters also act on a neighborhood, by sliding a kernel (mask) over the image, just like a linear filter. However, the filtering operation is based conditionally on the values of the pixels in the neighborhood, and these filters usually do not explicitly use coefficients in a sum-of-products manner. For example, noise can be reduced effectively with a nonlinear filter whose basic function is to compute the median gray-level value in the neighborhood where the filter is located. This filter is a nonlinear filter, since the median computation is a nonlinear operation. Median filters are quite popular because, for certain types of random noise (for example, impulse noise), they work very well. . .

Smoothing with PIL

The PIL ImageFilter module provides a set of functions for nonlinear denoising of an image. In this section, we will demonstrate some of them with examples.

Using the median filter

A median filter replaces each pixel with the median of the values of its neighboring pixels. This filter is great for removing salt-and-pepper noise, although it may remove small details from the image. We need to sort the neighborhood intensities and then choose the middle value. Median filtering is resilient to statistical outliers, incurs low blurring, and is simple to implement. The following code block shows how to use the MedianFilter() function of the PIL ImageFilter module to remove salt-and-pepper noise from the noisy mandrill image, while adding different levels of noise and using different sizes of the kernel window for the median filter:

i = 1
pylab.figure(figsize=(25,35))
for prop_noise in np.linspace(0.05,0.3,3):
    ...
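Since the book's loop is elided above, here is a minimal, self-contained variant (assuming the noisy image file used in the earlier examples):

from PIL import Image, ImageFilter

im = Image.open('../images/mandrill_spnoise_0.1.jpg')
# replace each pixel with the median of its 3x3 neighborhood
im_denoised = im.filter(ImageFilter.MedianFilter(size=3))
im_denoised.show()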

Using the max and min filters

The code below shows how to use the MaxFilter() and MinFilter() functions to remove salt-and-pepper noise from an image:

sz = 3 # kernel (window) size; assumed here, since sz is not defined in the original snippet
im = Image.open('../images/mandrill_spnoise_0.1.jpg')
pylab.subplot(1,3,1)
plot_image(im, 'Original Image with 10% added noise')
im1 = im.filter(ImageFilter.MaxFilter(size=sz))
pylab.subplot(1,3,2), plot_image(im1, 'Output (Max Filter size=' + str(sz) + ')')
im1 = im1.filter(ImageFilter.MinFilter(size=sz))
pylab.subplot(1,3,3), plot_image(im1, 'Output (Min Filter size=' + str(sz) + ')')
pylab.show()

The screenshot below shows the output of the previous code block. It can be seen that the max and min filters each have some effect in removing the salt-and-pepper noise from the noisy image:

Image smoothing (denoising) with scikit-image

The scikit-image library also provides a set of nonlinear filters in its restoration module. In the following sections, we will discuss two very useful such filters, namely the bilateral filter and the non-local means filter.

Using the bilateral filter

The bilateral filter is an edge-preserving smoothing filter. For this filter, the center pixel is set to a weighted average of the values of only those neighboring pixels whose brightness is roughly similar to that of the center pixel. In this section, we will see how to use the scikit-image package's bilateral filter implementation to denoise an image. First, let's create a noisy image from the following grayscale mountain image:

The following code block demonstrates how to do this with the random_noise() function from skimage.util:

from skimage import color, img_as_float, io
from skimage.util import random_noise

im = color.rgb2gray(img_as_float(io.imread('../images/mountain.png')))
sigma = 0.155
noisy = random_noise(im, var=sigma**2)
pylab.imshow(noisy)

The screenshot below shows a noisy image created by adding random noise to the original image using the previous code:

The following code block demonstrates how to denoise the previous noisy image using the bilateral filter, with different values for the parameters σ_color and σ_spatial:

from skimage.restoration import denoise_bilateral

pylab.figure(figsize=(20,15))
i = 1
for sigma_sp in [5, 10, 20]:
    for sigma_col in [0.1, 0.25, 5]:
        pylab.subplot(3,3,i)
        pylab.imshow(denoise_bilateral(noisy, sigma_color=sigma_col,
                     sigma_spatial=sigma_sp, multichannel=False))
        pylab.title(r'$\sigma_r=$' + str(sigma_col) + r', $\sigma_s=$' + str(sigma_sp), size=20)
        i += 1
pylab.show()

The screenshot below shows the output of the previous code. It can be seen that, with higher standard deviations, the image gets less noisy but more blurred. The previous code block takes a few minutes to execute, since the implementation is even slower on RGB images:

Using the non-local means filter

Non-local means is a texture-preserving nonlinear denoising algorithm. In this algorithm, for any given pixel, the value of the pixel is set using only a weighted average of the values of those neighboring pixels whose local neighborhoods are similar to the neighborhood of the pixel of interest. In other words, small patches centered on the other pixels are compared with the patch centered on the pixel of interest. In this section, we demonstrate this algorithm by denoising a noisy parrot image with the non-local means filter. The function's h argument controls the decay of the patch weights as a function of the distance between patches; a larger h allows more smoothing between dissimilar patches. The code block below shows. . .
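Since the book's code block is elided here, the following is a hedged sketch of non-local means denoising with scikit-image; the parrot image path and the parameter values are assumptions, not the book's exact code:

from skimage import img_as_float
from skimage.io import imread
from skimage.restoration import denoise_nl_means
from skimage.util import random_noise

im = img_as_float(imread('../images/parrot.png'))   # assumed path
noisy = random_noise(im, var=0.01)
# larger h -> more smoothing; patch_size/patch_distance control which
# patches are compared with the patch around the pixel of interest
denoised = denoise_nl_means(noisy, patch_size=7, patch_distance=9, h=0.08,
                            multichannel=True)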

Smoothing using SciPy ndimage

SciPy's ndimage module provides a function named percentile_filter(), which is a generalized version of the median filter. The following code block demonstrates how to use this filter:

lena = misc.imread('../images/lena.jpg')
# add salt-and-pepper noise to the input image
noise = np.random.random(lena.shape)
lena[noise > 0.9] = 255
lena[noise < 0.1] = 0
plot_image(lena, 'noisy image')
pylab.show()
fig = pylab.figure(figsize=(20,15))
i = 1
for p in range(25, 100, 25):
    for k in range(5, 25, 5):
        pylab.subplot(3,4,i)
        filtered = ndimage.percentile_filter(lena, percentile=p, size=(k,k,1))
        plot_image(filtered, str(p) + ' percentile, ' + str(k) + 'x' + str(k) + ' kernel')
        i += 1
pylab.show()

The screenshot below shows the output of the previous code. It can be seen that, of all the percentile filters, the median filter (corresponding to the 50th percentile) with a small kernel size is best at removing the salt-and-pepper noise while losing as little detail in the image as possible:


Summary

In this chapter, we discussed different image enhancement methods, starting with point transformations (for example, contrast stretching and thresholding), followed by techniques based on histogram processing (for example, histogram equalization and histogram matching), and then image denoising techniques based on linear (for example, mean and Gaussian) and nonlinear (for example, median, bilateral, and non-local means) filters.

By the end of this chapter, the reader should be able to write Python code for point transformations (for example, negatives, power-law transforms, and contrast stretching), histogram-based image enhancement (for example, histogram equalization/matching), and image denoising (for example, mean/median filters). . .

Questions

  1. Implement histogram matching for a color RGB image.
  2. Implement local histogram equalization using the equalize() function from skimage.filters.rank and compare it with the global histogram equalization from skimage.exposure on a grayscale image.
  3. Implement Floyd-Steinberg error-diffusion dithering, using the algorithm described at https://en.wikipedia.org/wiki/Floyd%E2%80%93Steinberg_dithering, to convert a grayscale image into a binary image.
  4. Use ModeFilter() from PIL to smooth an image. When is it useful?
  5. Show that an image can be recovered from several noisy versions of it, obtained by adding random Gaussian noise to the original image, simply by taking the average of the noisy images. Does taking the median also work?

Further reading

5. Image enhancement based on derivatives

In this chapter, we continue our discussion of image enhancement, the problem of improving the appearance or usefulness of an image. We will mainly concentrate on spatial filtering techniques to compute image gradients/derivatives, and on how these techniques can be used for edge detection in an image. First, we will start with the basic concept of image gradients using first-order (partial) derivatives and how to compute discrete derivatives, and then we will discuss second-order derivatives and the Laplacian operator. We will see how to use them to find edges in an image. Next, we will discuss several ways to sharpen/unsharp-mask an image using the Python image processing library PIL, the scikit-image filters module, and the SciPy ndimage module. Next, we will see how to use different filters (sobel, canny, LoG, and so on) and convolve them with an image to detect edges. Finally, we will discuss how to compute Gaussian/Laplacian image pyramids (with scikit-image) and use image pyramids to blend two images smoothly. The topics covered in this chapter are as follows:

  • Image derivatives: gradient and Laplacian
  • Sharpening and unsharp masking (with PIL, scikit-image, SciPy ndimage)
  • Edge detection using derivatives and filters (Sobel, Canny, LoG, DoG, and so on, with PIL, scikit-image)
  • Image pyramids (Gaussian and Laplacian), blending images (with scikit-image)

Image Derivatives - Gradient and Laplacian

We can calculate the (partial) derivatives of digital images using the finite difference method. In this section, let's discuss how to calculate image derivatives, gradients, and Laplacian functions, and why they are useful. Typically, let's start by importing the required libraries as shown in the following code block:

import numpy as np
from scipy import signal, misc, ndimage
from skimage import filters, feature, img_as_float
from skimage.io import imread
from skimage.color import rgb2gray
from PIL import Image, ImageFilter
import matplotlib.pylab as pylab

Derivatives and gradients

The figure below shows how to compute the partial derivatives of the image I (which is a function f(x,y)) using finite differences (with forward and central differences, the latter being more accurate), which can be implemented using convolution with the kernels shown. The figure also defines the gradient vector, its magnitude (which corresponds to the strength of an edge), and its direction (perpendicular to an edge). Locations in the input image where the intensity (gray value) changes sharply correspond to the locations of peaks/spikes (or troughs) in the first derivative of the image intensity. In other words, peaks in the gradient magnitude mark edge locations, and we need to threshold the gradient magnitude to find the edges in an image:
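For reference, the standard finite-difference approximations of the partial derivative in x, along with the gradient definitions, are:

∂f/∂x ≈ f(x+1, y) − f(x, y)  (forward difference)
∂f/∂x ≈ (f(x+1, y) − f(x−1, y)) / 2  (central difference)

∇f = (∂f/∂x, ∂f/∂y),  ||∇f|| = √((∂f/∂x)² + (∂f/∂y)²),  θ = arctan((∂f/∂y) / (∂f/∂x))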

The code block below shows how to compute the gradient (along with its magnitude and direction), using the convolution kernels shown previously, with a grayscale chess image as input. It also plots how the image pixel values and the x component of the gradient vector vary along the first row of the image (x=0):

def plot_image(image, title):
    pylab.imshow(image), pylab.title(title, size=20), pylab.axis('off')

ker_x = [[-1, 1]]
ker_y = [[-1], [1]]
im = rgb2gray(imread('../images/chess.png'))
im_x = signal.convolve2d(im, ker_x, mode='same')
im_y = signal.convolve2d(im, ker_y, mode='same')
im_mag = np.sqrt(im_x**2 + im_y**2)
im_dir = np.arctan2(im_y, im_x) # arctan2 handles im_x == 0 safely
pylab.gray()
pylab.figure(figsize=(30,20))
pylab.subplot(231), plot_image(im, 'original'), pylab.subplot(232), plot_image(im_x, 'grad_x')
pylab.subplot(233), plot_image(im_y, 'grad_y'), pylab.subplot(234), plot_image(im_mag, '||grad||')
pylab.subplot(235), plot_image(im_dir, r'$\theta$'), pylab.subplot(236)
pylab.plot(range(im.shape[1]), im[0,:], 'b-', label=r'$f(x,y)|_{x=0}$', linewidth=5)
pylab.plot(range(im.shape[1]), im_x[0,:], 'r-', label=r'$grad_x (f(x,y))|_{x=0}$')
pylab.title(r'$grad_x (f(x,y))|_{x=0}$', size=30)
pylab.legend(prop={'size': 20})
pylab.show()

The image below shows the output of the previous code block. As can be seen, the partial derivatives in the x and y directions detect the vertical and horizontal edges of the image, respectively. The gradient magnitude shows the strength of the edges at different locations in the image. Also, if we pick all the pixels from the original image corresponding to a single row (row 0, for instance), we can see a square wave (corresponding to the alternating white and black intensity pattern), whereas the gradient magnitude for the same set of pixels has spikes (sudden increases/decreases) in intensity, and these correspond to the (vertical) edges:

Display magnitude and gradient on the same image

In the previous example, the magnitude and direction of the edges were shown in different images. We can create an RGB image and set the R, G, and B values as follows to display both the magnitude and the direction in the same image:

Using the same code as in the previous example, we just replace the bottom-right subplot code with the following:

im = np.zeros((im.shape[0], im.shape[1], 3))
im[...,0] = im_mag*np.sin(im_ang) # im_ang is the gradient direction (im_dir in the earlier code)
im[...,1] = im_mag*np.cos(im_ang)
pylab.title(r'||grad||+$\theta$', size=30), pylab.imshow(im), pylab.axis('off')

Then, using the tiger image, we get the output shown. . .

The Laplacian

Rosenfeld and Kak have shown that the simplest isotropic derivative operator is the Laplacian, whose definition is shown below. The Laplacian approximates the second derivative of an image and detects edges. It is an isotropic (rotation-invariant) operator, and its zero crossings mark edge locations; we will talk more about this later in this chapter. In other words, wherever there is a peak (or trough) in the first derivative of the input image, there is a zero crossing at the corresponding location in the second derivative of the input image:
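For reference, the Laplacian is the sum of the second-order partial derivatives, and a commonly used discrete approximation is the following 3 x 3 kernel:

∇²f = ∂²f/∂x² + ∂²f/∂y²

[[0,  1, 0],
 [1, -4, 1],
 [0,  1, 0]]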

Some notes on the Laplacian operator

Let's note a few properties of the Laplacian:

  • It is a scalar (unlike the gradient, which is a vector)
  • It can be computed with a single kernel (mask) (unlike the gradient, which typically requires two kernels, the partial derivatives in the x and y directions)
  • Being a scalar, it does not have any direction, so we lose the direction information
  • It is the sum of the second-order partial derivatives (whereas the gradient is a vector of the first-order partial derivatives), but higher. . .

The impact of noise on gradient calculations

Derivative filters computed using finite differences are quite sensitive to noise. As we saw in the previous chapter, pixels whose intensity values are very different from those of their neighbors are generally noise pixels. In general, the stronger the noise, the greater the change in intensity, and the stronger the response obtained with the filter. The next code block adds some Gaussian noise to the image to see the effect on the gradient. Let's again consider a single row (row 0, to be precise) of the image, and plot the intensity as a function of the x position:

from skimage.util import random_noise
sigma = 1 # sd of the noise to be added
im = random_noise(im, var=sigma**2) # random_noise() returns the image with Gaussian noise added

The image below shows the output of the previous code block after adding some random noise to the chess image. As we can see, adding random noise to the input image has a strong effect on (partial) derivatives and gradient magnitudes; the peaks corresponding to edges are almost indistinguishable from the noise, and the pattern is destroyed:


Smoothing an image before applying a derivative filter should help, since smoothing removes the high-frequency components that may be noise and forces (noisy) pixels (unlike their neighbors) to look more like their neighbors. Hence, the solution is to first smooth the input image with an LPF (such as a Gaussian filter) and then find the peaks in the smoothed image (using a threshold). This leads to the LoG (Laplacian of Gaussian) filter (if we use the second derivative filter), which we will explore later in this chapter.

Sharpening and unsharp masking

The purpose of sharpening is to highlight details in an image or to enhance details that have been blurred. In this section, we'll discuss some techniques and demonstrate several different image sharpening methods with some examples.

Laplacian sharpening

You can use the Laplacian filter to sharpen an image in two steps:

  1. Apply a Laplacian filter to the original input image.
  2. Add the output image obtained in step 1 to the original input image (to obtain the sharpened image). The following code block demonstrates how to implement the above algorithm using the laplace() function of scikit-image's filters module:
from skimage.filters import laplace
im = rgb2gray(imread('../images/me8.jpg'))
im1 = np.clip(laplace(im) + im, 0, 1)
pylab.figure(figsize=(20,30))
pylab.subplot(211), plot_image(im, 'original image')
pylab.subplot(212), plot_image(im1, 'sharpened image')
pylab.tight_layout()
pylab.show()

Here is the output from the previous code block, the original image, and the sharpened image using the previous algorithm:

Unsharp Mask

Unsharp masking is a technique for sharpening an image in which a blurred version of the image is subtracted from the image itself. The typical blending formula used for unsharp masking is as follows: Sharpened = Original + (Original − Blurred) × Amount.

Here, Amount is a parameter. The next few sections demonstrate how to implement this with the ndimage module of SciPy in Python.

Using the SciPy ndimage module

As mentioned earlier, we can first blur the image and then compute the detail image as the difference between the original and the blurred image, to implement unsharp masking. The sharpened image can then be computed as a linear combination of the original image and the detail image. The following figure illustrates the concept once more:

The following code block shows how to implement the unsharp-masking operation on a grayscale image with the SciPy ndimage module (doing the same for a color image is left as an exercise for the reader), using the concept described above:

def rgb2gray(im):
    '''
    the input image is an RGB image
    with pixel values for each channel in [0,1]
    '''
    return np.clip(0.2989 * im[...,0] + 0.5870 * im[...,1] + 0.1140 * im[...,2], 0, 1)

im = rgb2gray(img_as_float(misc.imread('../images/me4.jpg')))
im_blurred = ndimage.gaussian_filter(im, 5)
im_detail = np.clip(im - im_blurred, 0, 1)
pylab.gray()
fig, axes = pylab.subplots(nrows=2, ncols=3, sharex=True, sharey=True, figsize=(15, 15))
axes = axes.ravel()
axes[0].set_title('Original image', size=15), axes[0].imshow(im)
axes[1].set_title('Blurred image, sigma=5', size=15), axes[1].imshow(im_blurred)
axes[2].set_title('Detail image', size=15), axes[2].imshow(im_detail)
alpha = [1, 5, 10]
for i in range(3):
    im_sharp = np.clip(im + alpha[i]*im_detail, 0, 1)
    axes[3+i].imshow(im_sharp), axes[3+i].set_title('Sharpened image, alpha=' + str(alpha[i]), size=15)
for ax in axes:
    ax.axis('off')
fig.tight_layout()
pylab.show()

The screenshot below shows the output of the previous code block. It can be seen that the output gets sharper as the value of α increases:

Edge detection using derivatives and filters (Sobel, Canny, and so on)

As mentioned earlier, the pixels that make up the edges of an image are those where the image intensity function changes suddenly and rapidly (discontinuously), and the goal of edge detection is to identify these changes. Hence, edge detection is a preprocessing technique in which the input is a 2D (grayscale) image and the output is a set of curves (called edges). The salient features of an image are extracted in the edge detection process; an image representation using edges is more compact than one using pixels. The edge detector outputs the magnitude of the gradient (as a grayscale image), and now, in order to obtain the edge pixels (as a binary image). . .

Computing the gradient magnitude with partial derivatives

As described earlier, the gradient magnitude (which can be thought of as the strength of an edge), computed using (forward) finite-difference approximations of the partial derivatives, can be used for edge detection. The following screenshot shows the output obtained by using the same code as before to compute the gradient magnitude, then clipping the pixel values into the [0,1] interval, with a grayscale zebra image as input:


The screenshot below shows the gradient magnitude image. As can be seen, the edges appear thicker, several pixels wide:

To obtain a binary image in which each edge is one pixel wide, we need to apply the non-maximum suppression algorithm, which removes a pixel if it is not a local maximum along the gradient direction within its neighborhood. The implementation of this algorithm is left as an exercise for the reader. The following screenshot shows the output with non-maximum suppression:

The non-maximum suppression algorithm

  1. The algorithm first checks the angle (direction) of an edge (output by the edge detector).
  2. If a pixel value is not the maximum along the line tangent to its edge angle, it is a candidate for removal from the edge map.
  3. This is implemented by splitting the edge directions (360°) into eight equal intervals (angles of 22.5°). The table below shows the different cases and the actions to be taken:

  1. We can do this by looking at ranges of π/8 and setting up a series of tangential comparisons with if conditions accordingly; a sketch of the procedure is shown after this list.
  2. The effect of edge thinning can be clearly observed (from the previous image). . .
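As a hedged sketch (an assumption, since the book leaves the implementation as an exercise), non-maximum suppression can be written as follows, given gradient magnitude and direction arrays of the same shape:

import numpy as np

def non_max_suppression(mag, theta):
    '''mag: gradient magnitude; theta: gradient direction in radians'''
    h, w = mag.shape
    out = np.zeros_like(mag)
    angle = (np.rad2deg(theta) + 180) % 180   # fold directions into [0, 180)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:        # ~horizontal gradient
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif a < 67.5:                    # ~45 degree gradient
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif a < 112.5:                   # ~vertical gradient
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                             # ~135 degree gradient
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            # keep the pixel only if it is a local maximum along the gradient
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                out[i, j] = mag[i, j]
    return out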

Sobel edge detector with scikit-image

The (first) derivative can be approximated better than with the plain finite-difference method. The Sobel operators shown below are often used:
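For reference, the standard Sobel kernels for the x and y derivatives (up to sign convention) are:

S_x = [[-1, 0, 1],     S_y = [[-1, -2, -1],
       [-2, 0, 2],            [ 0,  0,  0],
       [-1, 0, 1]]            [ 1,  2,  1]]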

The 1/8 term is not included in the standard definition of the Sobel operator, since for edge detection purposes it makes no difference, although the normalization term is needed to obtain the gradient values correctly. The next Python code snippet shows how to use the sobel_h(), sobel_v(), and sobel() functions of scikit-image's filters module to find the horizontal/vertical edges and compute the gradient magnitude using the Sobel operator, respectively:

im = rgb2gray(imread('../images/tajmahal.jpg')) # RGB image to gray scale
pylab.gray()
pylab.figure(figsize=(20,18))
pylab.subplot(2,2,1)
plot_image(im, 'original')
pylab.subplot(2,2,2)
edges_x = filters.sobel_h(im) 
plot_image(edges_x, 'sobel_x')
pylab.subplot(2,2,3)
edges_y = filters.sobel_v(im)
plot_image(edges_y, 'sobel_y')
pylab.subplot(2,2,4)
edges = filters.sobel(im)
plot_image(edges, 'sobel')
pylab.subplots_adjust(wspace=0.1, hspace=0.1)
pylab.show()

The screenshot below shows the output of the previous code block. It can be seen that the horizontal and vertical edges of the image are detected by horizontal and vertical Sobel filters, while the gradient magnitude image calculated using the Sobel filter detects edges in both directions:


Different edge detectors with scikit-image – Prewitt, Roberts, Sobel, Scharr, and Laplace

There are many different edge detection operators used in image processing algorithms; they are all discrete (first- or second-order) differential operators that attempt to approximate the gradient of the image intensity function (for example, the Sobel operator we discussed earlier). The kernels shown in the figure below are a few common kernels used for edge detection. For example, commonly used derivative filters that approximate the first-order image derivatives are the Sobel, Prewitt, Scharr, and Roberts filters, while a derivative filter that approximates the second-order derivatives is the Laplacian filter:

As stated in the scikit-image documentation. . .
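A hedged sketch comparing several scikit-image edge detectors on the same grayscale image follows; the image path is an assumption, reusing the one from the Sobel example:

from skimage import filters
from skimage.io import imread
from skimage.color import rgb2gray
import matplotlib.pylab as pylab

im = rgb2gray(imread('../images/tajmahal.jpg'))  # assumed path
detectors = {'prewitt': filters.prewitt, 'roberts': filters.roberts,
             'sobel': filters.sobel, 'scharr': filters.scharr,
             'laplace': filters.laplace}
pylab.gray()
pylab.figure(figsize=(20,12))
for i, (name, detector) in enumerate(detectors.items()):
    pylab.subplot(2, 3, i + 1)
    pylab.imshow(detector(im)), pylab.axis('off'), pylab.title(name, size=20)
pylab.show()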

Canny edge detector based on scikit image

The Canny edge detector is a popular edge detection algorithm, developed by John F. Canny. The algorithm has the following steps:

  1. Smoothing/Noise Reduction : Edge detection operations are sensitive to noise. So, at the beginning, a 5 x 5 Gaussian filter is used to remove noise from the image.

  2. Calculate the magnitude and direction of the gradient: Sobel horizontal and vertical filters are then applied to the image to calculate the edge gradient magnitude and direction for each pixel , as described previously. The calculated gradient angle (direction) is then rounded to one of four angles representing the horizontal, vertical, and two diagonal directions for each pixel.

  3. Non-maximum suppression : In this step, the edges are thinned – any unwanted pixels that may not constitute an edge are removed. To do this, each pixel is checked to see if it is a local maximum in the direction of the gradient in its neighborhood. As a result, a binary image with thin edges is obtained.

  4. Linking and hysteresis thresholding : This step determines whether all the detected edges are strong edges. For this purpose, a pair of (hysteresis) thresholds, min_val and max_val, is used. Any edge with an intensity gradient value above max_val is sure to be an edge. Any edge with an intensity gradient value below min_val is sure to be a non-edge and is discarded. Edges lying between these two thresholds are classified as edges or non-edges based on their connectivity: if they are connected to "sure edge" pixels, they are considered part of an edge; otherwise, they are also discarded. This step also removes small pixel noise (on the assumption that edges are long lines).

Finally, the algorithm outputs the strong edges of the image. The following code block shows how to implement the Canny edge detector with scikit-image:

im = rgb2gray(imread('../images/tiger3.jpg')) 
im = ndimage.gaussian_filter(im, 4)
im += 0.05 * np.random.random(im.shape)
edges1 = feature.canny(im)
edges2 = feature.canny(im, sigma=3)
fig, (axes1, axes2, axes3) = pylab.subplots(nrows=1, ncols=3, figsize=(30, 12), sharex=True, sharey=True)
axes1.imshow(im, cmap=pylab.cm.gray), axes1.axis('off'), axes1.set_title('noisy image', fontsize=50)
axes2.imshow(edges1, cmap=pylab.cm.gray), axes2.axis('off')
axes2.set_title('Canny filter, $\sigma=1$', fontsize=50)
axes3.imshow(edges2, cmap=pylab.cm.gray), axes3.axis('off')
axes3.set_title('Canny filter, $\sigma=3$', fontsize=50)
fig.tight_layout()
pylab.show()

The screenshot below shows the output of the previous code; Canny filters with different sigma values for the initial Gaussian LPF are used to detect edges. As shown, the lower the sigma value, the less blurred the original image is, so more edges (finer details) can be found:


LoG and DoG filters

The Laplacian of Gaussian (LoG) filter is just another linear filter, a combination of a Gaussian filter followed by a Laplacian filter on an image. Since the second derivative is very sensitive to noise, it is always a good idea to remove noise by smoothing the image before applying the Laplacian, to ensure that noise is not aggravated. Because of the associative property of convolution, this can be thought of as taking the second derivative (Laplacian) of the Gaussian filter and then applying the resulting (combined) filter to the image, hence the name LoG. It can be approximated efficiently using the difference of two Gaussians (DoG) at different scales (variances), as shown in the figure below:

The code block below. . .
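Since the code block is elided here, the DoG idea can be sketched as follows (assuming a grayscale image im with values in [0,1]; the 1.6 scale ratio is the classic Marr-Hildreth choice for approximating LoG):

from scipy import ndimage

sigma = 2
# difference of two Gaussian-smoothed copies at scales sigma and 1.6*sigma
im_dog = ndimage.gaussian_filter(im, sigma) - ndimage.gaussian_filter(im, 1.6 * sigma)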

LoG filter with the SciPy ndimage module

The gaussian_laplace() function of the SciPy ndimage module can also be used to implement the LoG filter, as shown in the following code block:

img = rgb2gray(imread('../images/zebras.jpg'))
fig = pylab.figure(figsize=(25,15))
pylab.gray() # show the filtered result in grayscale
for sigma in range(1,10):
    pylab.subplot(3,3,sigma)
    img_log = ndimage.gaussian_laplace(img, sigma=sigma)
    pylab.imshow(np.clip(img_log,0,1)), pylab.axis('off')
    pylab.title('LoG with sigma=' + str(sigma), size=20)
pylab.show()

The following images show the input image and the output images obtained with the LoG filter for different values of the smoothing parameter σ (the standard deviation of the Gaussian filter):


Edge detection with the LoG filter

The following describes the steps for edge detection with the LoG filter:

  • First, the input image needs to be smoothed (by convolution with a Gaussian filter).

  • Then, the smoothed image needs to be convolved with the Laplacian filter, obtaining the output image ∇²(I(x, y) ∗ G(x, y)), where ∗ denotes convolution.

  • Finally, the zero crossings of the image obtained in the last step need to be calculated, as shown in the figure below:


Marr and Hildreth edge detection algorithm based on zero-crossing calculation

Marr and Hildreth proposed computing the zero crossings of the LoG-convolved image (that is, detecting edges as a binary image). The edge pixels can be identified by looking at the signs of the LoG-smoothed image, which define a binary image. The algorithm to compute the zero crossings is as follows:

  1. First, convert the LoG-convolved image into a binary image by replacing positive pixel values with 1 and negative values with 0
  2. To compute the zero-crossing pixels, we simply look at the boundaries of the non-zero regions in this binary image
  3. The boundaries can be found by looking for any non-zero pixel that has an immediate neighbor equal to zero
  4. Hence, for each pixel, if it is non-zero, its eight neighbors are considered; if any of the neighboring pixels is zero, the pixel can be identified as an edge

The implementation of this function is left as an exercise; a possible sketch is shown below, followed by the code block that displays the edges of the same zebra image detected with zero crossings:
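A possible implementation of zero_crossing(), following the four steps above (a sketch, since the book leaves it as an exercise):

import numpy as np

def zero_crossing(img_log):
    binary = img_log > 0                       # step 1: signs -> binary image
    edges = np.zeros(img_log.shape, dtype=bool)
    h, w = img_log.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            if binary[i, j]:                   # steps 2-4: a non-zero pixel ...
                neigh = binary[i-1:i+2, j-1:j+2]
                if not neigh.all():            # ... with at least one zero neighbor
                    edges[i, j] = True
    return edges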

fig = pylab.figure(figsize=(25,15))
pylab.gray() # show the filtered result in grayscale
for sigma in range(2, 10, 2):
    pylab.subplot(2, 2, sigma // 2) # integer division, since subplot() expects an int index
    result = ndimage.gaussian_laplace(img, sigma=sigma)
    pylab.imshow(zero_crossing(result)) # zero_crossing() as implemented above
    pylab.axis('off')
    pylab.title('LoG with zero-crossing, sigma=' + str(sigma), size=20)
pylab.show()

The screenshot below shows the output of the previous code block, with the edges identified by the zero crossings alone, at different σ scales:


The previous images show the zero crossings of the LoG/DoG output used as an edge detector. It should be noted that the zero crossings form closed contours.

Find and enhance edges with PIL

The filters of PIL's ImageFilter module can also be used to find and enhance the edges in an image. The following code block shows an example with the UMBC library image as input:

from PIL.ImageFilter import (FIND_EDGES, EDGE_ENHANCE, EDGE_ENHANCE_MORE)
im = Image.open('../images/umbc_lib.jpg')
pylab.figure(figsize=(18,25))
pylab.subplot(2,2,1)
plot_image(im, 'original (UMBC library)')
i = 2
for f in (FIND_EDGES, EDGE_ENHANCE, EDGE_ENHANCE_MORE):
    pylab.subplot(2,2,i)
    im1 = im.filter(f)
    plot_image(im1, str(f))
    i += 1
pylab.show()

The following screenshot shows the output of the above code using different edge finding/enhancement filters:

Image pyramids (Gaussian and Laplacian) - blending images

We can construct the Gaussian pyramid of an image by starting from the original image and iteratively creating smaller images at each level, first by smoothing (using a Gaussian filter to avoid aliasing) and then by subsampling (together called reduction) the image from the previous level, until a minimum resolution is reached. The image pyramid created in this way is called a Gaussian pyramid. It is useful for searching over scales (for example, template matching), for precomputation, and for image processing tasks that edit frequency bands separately (for example, image blending). Similarly, the Laplacian pyramid of an image can be constructed by starting from the smallest image in the Gaussian pyramid, then expanding (upsampling plus smoothing) the image at that level and subtracting it from the image at the next level of the Gaussian pyramid, repeating the process until the original image size is reached. In this section, we will see how to write Python code to compute an image pyramid, and then look at an application of image pyramids to blending two images.

Gaussian pyramid with the scikit-image transform pyramid module

The Gaussian pyramid of an input image can be computed using the skimage.transform module's pyramid_gaussian() function. Starting from the original image, this function calls the pyramid_reduce() function to obtain the smoothed and downsampled images recursively. The following code block demonstrates how to compute and display such a Gaussian pyramid with the lena RGB input image:

from skimage.transform import pyramid_gaussian

image = imread('../images/lena.jpg')
nrows, ncols = image.shape[:2]
pyramid = tuple(pyramid_gaussian(image, downscale=2))
pylab.figure(figsize=(20,5))
i, n = 1, len(pyramid)
for p in pyramid:
    pylab.subplot(1,n,i), pylab.imshow(p)
    pylab.title(str(p.shape[0]) ...

Laplacian pyramid with the scikit-image transform pyramid module

The Laplacian pyramid of an input image can be computed using the skimage.transform module's pyramid_laplacian() function. Starting from the difference between the original image and its smoothed version, this function computes the downsampled and the smoothed image and takes the difference between these two images to compute the image corresponding to each layer recursively. The motivation for creating the Laplacian pyramid is to achieve compression, since the compression rate is higher for the predictable values around 0.

The code to compute the Laplacian pyramid is similar to the code used previously to compute the Gaussian pyramid; it is left as an exercise for the reader (a sketch follows). The following screenshot shows the Laplacian pyramid of the lena grayscale image:
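A hedged sketch of the display loop, analogous to the Gaussian pyramid code above (the image path and downscale value are assumptions):

from skimage.transform import pyramid_laplacian
from skimage.io import imread
from skimage.color import rgb2gray
import matplotlib.pylab as pylab

image = rgb2gray(imread('../images/lena.jpg'))
pyramid = tuple(pyramid_laplacian(image, downscale=2))
pylab.figure(figsize=(20,5)), pylab.gray()
for i, p in enumerate(pyramid):
    pylab.subplot(1, len(pyramid), i + 1), pylab.imshow(p), pylab.axis('off')
    pylab.title(str(p.shape[0]) + 'x' + str(p.shape[1]))
pylab.show()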


Note that, if we use scikit-image's pyramid_gaussian() and pyramid_laplacian() functions, the lowest-resolution image in the Laplacian pyramid and the lowest-resolution image in the Gaussian pyramid will be different images, which is not what we want. We want to build a Laplacian pyramid whose smallest-resolution image is exactly the same as that of the Gaussian pyramid, since that will allow us to reconstruct the image from its Laplacian pyramid alone. In the next few sections, we will discuss algorithms to build our own pyramids using scikit-image's pyramid_expand() and pyramid_reduce() functions.

Constructing the Gaussian pyramid

The Gaussian pyramid can be calculated by following these steps:

  1. Start with the original image.
  2. The image is calculated iteratively at each level of the pyramid, first by smoothing the image (using a Gaussian filter) and then downsampling it.
  3. Stop at a level where the image size is small enough (for example, 1 x 1).
  4. The function that implements the previous algorithm is left as an exercise for the reader (a possible completion is sketched after the code stub below); we only need to add a few lines to the following function to complete the implementation:
from skimage.transform import pyramid_reduce

def get_gaussian_pyramid(image):
    '''
    input: an RGB image
    output: the Gaussian Pyramid of the image as a list
    '''
    gaussian_pyramid = []
    # add code here
    # iteratively ...
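One possible completion is sketched below (an assumption, not the book's solution); it also defines get_laplacian_pyramid(), on which the reconstruction code in the next section relies:

import numpy as np
from skimage.transform import pyramid_reduce, pyramid_expand, resize

def get_gaussian_pyramid(image):
    '''input: an RGB image; output: its Gaussian pyramid as a list'''
    gaussian_pyramid = [image]
    while image.shape[0] > 1 and image.shape[1] > 1:   # stop at a tiny image
        image = pyramid_reduce(image, downscale=2)     # smooth + downsample
        gaussian_pyramid.append(image)
    return gaussian_pyramid

def get_laplacian_pyramid(gaussian_pyramid):
    '''input: a Gaussian pyramid; output: the matching Laplacian pyramid'''
    laplacian_pyramid = [gaussian_pyramid[-1]]         # same smallest image
    for i in range(len(gaussian_pyramid) - 2, -1, -1):
        expanded = resize(pyramid_expand(gaussian_pyramid[i + 1], upscale=2),
                          gaussian_pyramid[i].shape)
        # detail image at this level = Gaussian level minus expanded coarser level
        laplacian_pyramid.insert(0, gaussian_pyramid[i] - expanded)
    return laplacian_pyramid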

Reconstructing an image from its Laplacian pyramid only

The image below shows how an image can be reconstructed from its Laplacian pyramid alone, if we construct the pyramid by following the algorithm described in the previous section:


Please take a look at the following code block:

from skimage.transform import pyramid_expand, resize

def reconstruct_image_from_laplacian_pyramid(pyramid):
    i = len(pyramid) - 2
    prev = pyramid[i+1]
    pylab.figure(figsize=(20,20))
    j = 1
    while i >= 0:
        prev = resize(pyramid_expand(prev, upscale=2), pyramid[i].shape)
        im = np.clip(pyramid[i] + prev, 0, 1)
        pylab.subplot(3,3,j), pylab.imshow(im)
        pylab.title('Level=' + str(j) + ' ' + str(im.shape[0]) + 'x' + str(im.shape[1]), size=20)
        prev = im
        i -= 1
        j += 1
    pylab.subplot(3,3,j), pylab.imshow(image)
    pylab.title('Original image' + ' ' + str(image.shape[0]) + 'x' + str(image.shape[1]), size=20)
    pylab.show()
    return im

image = img_as_float(imread('../images/apple.png')[...,:3]) # only use the color channels and discard the alpha
pyramid = get_laplacian_pyramid(get_gaussian_pyramid(image))
im = reconstruct_image_from_laplacian_pyramid(pyramid)

The screenshot below shows the output of the previous code, that is, how the original image is finally reconstructed from its Laplacian pyramid alone, simply by using the expand() operation on the image at each level and iteratively adding it to the image at the next level:


Blending images with pyramids

Suppose we have two RGB color input images, A (apple) and B (orange), along with a third binary mask image, M; all three images are of the same size. The goal is to blend image A with image B, guided by the mask M (if a pixel value in the mask image M is 1, the pixel is taken from image A, otherwise from image B). The following algorithm can be used to blend the two images using the Laplacian pyramids of images A and B (computing the blended pyramid as a linear combination of the images at the same levels of the Laplacian pyramids of A and B, with the weights taken from the same level of the Gaussian pyramid of the mask image M), and then reconstructing the output image from it. . .
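A hedged sketch of this blending algorithm, assuming the helper functions defined earlier in this chapter and float images A, B, and M of identical shape (with the mask M replicated across the color channels; note that the reconstruction helper above also references a global image variable for display):

def blend_images(A, B, M):
    lap_A = get_laplacian_pyramid(get_gaussian_pyramid(A))
    lap_B = get_laplacian_pyramid(get_gaussian_pyramid(B))
    gauss_M = get_gaussian_pyramid(M)
    # linear combination at every level: weights come from the mask's
    # Gaussian pyramid, details from the two Laplacian pyramids
    blended = [gm * la + (1 - gm) * lb
               for la, lb, gm in zip(lap_A, lap_B, gauss_M)]
    return reconstruct_image_from_laplacian_pyramid(blended)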

Summary

In this chapter, we first discussed edge detection of an image using several filters (Sobel, Prewitt, Canny, and so on) and by computing the gradient and Laplacian of an image. Then, we discussed the LoG/DoG operators and how to implement them and detect edges with zero crossings. Next, we discussed how to compute image pyramids and use Laplacian pyramids to blend two images smoothly. Finally, we discussed how to detect blobs with scikit-image. On completion of this chapter, the reader should be able to implement edge detectors (Sobel, Canny, and so forth) in an image with Python using different filters. Also, the reader should be able to implement filters to sharpen an image, and find edges at different scales using LoG/DoG. Finally, they should be able to blend images with Laplacian/Gaussian pyramids and implement blob detection in an image at different scale spaces. In the next chapter, we will discuss feature detection and extraction techniques for images.

Questions

  1. Use the skimage.filters module's unsharp_mask() function with different values of the radius and amount parameters to sharpen an image.

  2. Use the PIL ImageFilter module's UnsharpMask() function with different values of the radius and percent parameters to sharpen an image.

  3. Sharpen a color (RGB) image using the sharpening kernel [[0,-1,0], [-1,5,-1], [0,-1,0]]. (Hint: use the SciPy signal module's convolve2d() function for each of the color channels, one by one.)

  4. With the SciPy ndimage module, sharpen a color image directly (without sharpening the individual color channels one by one).

  5. Use the skimage.transform module's pyramid_laplacian() function to compute and display a Gaussian pyramid with the lena grayscale input image.

  6. architecture

Further reading
