Python from Zero to One | Image Histograms: Theory and Drawing Implementation

Abstract: This article explains how to draw image histograms with both OpenCV and Matplotlib, which supports pixel-level comparison in image processing.

This article is shared from the Huawei Cloud Community post "[Python from Zero to One] 50. Image Enhancement and Operation: Image Histogram Theoretical Knowledge and Drawing Implementation", author: eastmount.

1. Theoretical knowledge of image histogram

The grayscale histogram is a function of gray level: it gives the number of pixels at each gray level in the image and so reflects how frequently each gray level occurs. Suppose there is a 6×6 pixel image. Counting how often each of the gray levels 1 to 6 occurs gives the histogram shown in Figure 1, where the abscissa is the gray level and the ordinate is the frequency of that gray level [1-2].

If the gray levels range from 0 to 255 (the minimum, 0, is black and the maximum, 255, is white), the corresponding histogram can be drawn in the same way, as shown in Figure 2: on the left is a grayscale image (the Lena grayscale image) and on the right is the frequency of each gray level.

To put the gray-level frequencies of an image into a fixed standard form, the histogram can be normalized, which converts the original image to the corresponding standard form [3]. Let the variable r denote the gray level of a pixel in the image. After normalization, r is limited to the following range:

$$0 \le r \le 1 \tag{1}$$

On this scale, r = 0 is black and r = 1 is white. For a given image, every pixel value lies in the interval [0, 1], and the gray-level distribution of the original image is described by the probability density function P(r). Digital image processing requires a discrete form: r_k denotes a discrete gray level, P(r_k) replaces P(r), and formula (2) holds.

$$P(r_k) = \frac{n_k}{n}, \quad 0 \le r_k \le 1, \quad k = 0, 1, \ldots, l-1 \tag{2}$$

In the formula, n_k is the number of pixels with gray level r_k, n is the total number of pixels in the image, n_k/n is the frequency in the sense of probability theory, and l is the total number of gray levels (usually 256). Plotting P(r_k) against r_k in a Cartesian coordinate system gives the gray-level histogram [4].
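Formula (2) can be checked numerically. The following is a minimal sketch, assuming an 8-bit image held in a NumPy array; the helper name normalized_histogram and the random test image are illustrative assumptions and do not come from the original article.

```python
import numpy as np

def normalized_histogram(img, levels=256):
    """Return P(r_k) = n_k / n for k = 0 .. levels-1 (hypothetical helper)."""
    # n_k: number of pixels at each gray level
    counts = np.bincount(img.ravel(), minlength=levels)
    # n: total number of pixels in the image
    return counts / img.size

# Synthetic 8-bit image, used only for illustration
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

p = normalized_histogram(img)
print(p.shape)   # one probability per gray level
print(p.sum())   # the probabilities sum to 1, up to rounding
```

Dividing the counts by the pixel total is all that normalization does here, so histograms of images with different sizes become directly comparable.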

Assume a 3×3 pixel image whose pixel values are given in formula (3). The steps to normalize its histogram are as follows:

First, count the number of pixels at each gray level. The array x holds the gray levels and the array y holds the number of pixels at each level: 3 pixels with gray level 1, 1 pixel with gray level 2, 2 pixels with gray level 3, 1 pixel with gray level 4, and 2 pixels with gray level 5.

$$x = [1, 2, 3, 4, 5], \quad y = [3, 1, 2, 1, 2] \tag{4}$$

Then count the total number of pixels, as shown in formula (5).

$$n = 3 + 1 + 2 + 1 + 2 = 9 \tag{5}$$

Finally, compute the probability of each gray level with formula (6). The results are as follows:

$$p = \frac{y}{n} = \left[\frac{3}{9}, \frac{1}{9}, \frac{2}{9}, \frac{1}{9}, \frac{2}{9}\right] \tag{6}$$

The resulting normalized histogram is shown in Figure 3, where the abscissa is the gray level of the pixels in the image and the ordinate is the probability of that gray level.
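The three steps of the 3×3 example can be reproduced with the Python standard library. This is a sketch under one assumption: the arrangement of the pixels in the matrix below is hypothetical, and only the per-level counts are taken from the text.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical 3x3 arrangement; only the per-level counts come from the text
image = [
    [1, 2, 3],
    [1, 3, 4],
    [1, 5, 5],
]

pixels = [value for row in image for value in row]
counts = Counter(pixels)

x = sorted(counts)                    # gray levels
y = [counts[level] for level in x]    # number of pixels at each gray level
n = len(pixels)                       # total number of pixels
p = [Fraction(c, n) for c in y]       # probability of each gray level

print(x)                     # [1, 2, 3, 4, 5]
print(y)                     # [3, 1, 2, 1, 2]
print(n)                     # 9
print([str(f) for f in p])   # ['1/3', '1/9', '2/9', '1/9', '2/9']
```

Exact fractions are used so that the probabilities visibly sum to 1; the first entry prints in lowest terms, 1/3 rather than 3/9.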

Histograms are widely used in computer vision. When edges and colors are used to determine object boundaries, the histogram helps choose a suitable threshold for thresholding. Histograms are also particularly useful for segmenting scenes with strong contrast between objects and background, and can be applied to detecting scene changes in video and points of interest in images.
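As one illustration of choosing a threshold from a histogram, the sketch below applies Otsu's method, which picks the gray level that maximizes the between-class variance. The article does not name a specific technique, so Otsu's method, the helper name otsu_threshold, and the synthetic image are all assumptions made for this example.

```python
import numpy as np

def otsu_threshold(img):
    """Return the gray level t that best separates levels <= t from levels > t."""
    p = np.bincount(img.ravel(), minlength=256) / img.size   # normalized histogram
    levels = np.arange(256)
    w0 = np.cumsum(p)              # probability of the class with levels <= t
    mu = np.cumsum(p * levels)     # cumulative mean up to level t
    mu_total = mu[-1]              # mean gray level of the whole image
    w1 = 1.0 - w0                  # probability of the class with levels > t
    # The between-class variance is undefined when one class is empty
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(sigma_b))

# Synthetic high-contrast scene: dark background (50) with a bright square (200)
img = np.full((100, 100), 50, dtype=np.uint8)
img[30:70, 30:70] = 200

t = otsu_threshold(img)
mask = img > t                     # pixels classified as "object"
print(t, int(mask.sum()))          # the threshold and the object pixel count
```

Because the two gray levels are well separated, any threshold between them splits the scene correctly, which is the strong-contrast case described above.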

2. Drawing histograms with OpenCV

First, this section explains how to draw a histogram with the OpenCV library. In OpenCV, the calcHist() function computes the histogram. The result can then be drawn with OpenCV's drawing functions, such as rectangle() for rectangles and line() for line segments. The function prototype of cv2.calcHist() and its six common parameters are as follows:

hist = cv2.calcHist(images, channels, mask, histSize, ranges, accumulate=False)

  • hist is the return value: the histogram, as a two-dimensional array
  • images is the input image, passed in square brackets, such as [src]
  • channels specifies the channel index, in square brackets. For a grayscale image the value is [0]; for a color image it is [0], [1], or [2], which select blue (B), green (G), and red (R) respectively
  • mask is an optional operation mask. To compute the histogram of the whole image, pass None; to compute the histogram of part of the image, pass a mask
  • histSize is the number of bins (gray levels), in square brackets, such as [256]
  • ranges is the range of pixel values. The upper bound is exclusive, so [0, 256] covers all 256 gray levels from 0 to 255
  • accumulate is the accumulation flag and defaults to false. If it is set to true, the histogram is not cleared when it is allocated, so a single histogram can be computed from several images or updated over time

The following code computes the histogram of the image, prints its size, shape, and the count at each gray level, and then calls plot() to draw the histogram curve.

# -*- coding: utf-8 -*-
# By:Eastmount
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read the image as a single-channel grayscale image
src = cv2.imread('lena-hd.png', cv2.IMREAD_GRAYSCALE)
# Compute the 256-level histogram (the upper bound of the range is exclusive)
hist = cv2.calcHist([src], [0], None, [256], [0, 256])
# Print the size, shape, and values of the histogram
print(hist.size)
print(hist.shape)
print(hist)
# Show the original image and the histogram curve
plt.subplot(121)
plt.imshow(src, 'gray')
plt.axis('off')
plt.title("(a) Lena grayscale image")
plt.subplot(122)
plt.plot(hist, color='r')
plt.xlabel("x")
plt.ylabel("y")
plt.title("(b) Histogram curve")
plt.show()

Figure 4 shows the result for the "Lena" grayscale image: Figure 4(a) is the original image and Figure 4(b) is the corresponding grayscale histogram curve.

The code also prints the size, shape, and values of the histogram, as follows (the L suffix in the shape comes from Python 2; Python 3 prints (256, 1)):

256
(256L, 1L)
[[7.000e+00]
 [1.000e+00]
 [0.000e+00]
 [6.000e+00]
 [2.000e+00]
 ....
 [1.000e+00]
 [3.000e+00]
 [2.000e+00]
 [1.000e+00]
 [0.000e+00]]

Drawing the histogram of a color image with OpenCV works the same way as for a grayscale image, except that the B, G, and R channels are computed and drawn separately. The code is as follows.

# -*- coding: utf-8 -*-
# By:Eastmount
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read the image (OpenCV loads it in BGR channel order)
src = cv2.imread('lena.png')
# Convert to RGB for display with Matplotlib
img_rgb = cv2.cvtColor(src, cv2.COLOR_BGR2RGB)
# Compute the histogram of each channel (the upper bound of the range is exclusive)
histb = cv2.calcHist([src], [0], None, [256], [0, 256])
histg = cv2.calcHist([src], [1], None, [256], [0, 256])
histr = cv2.calcHist([src], [2], None, [256], [0, 256])
# Show the original image and the histogram curves
plt.subplot(121)
plt.imshow(img_rgb)
plt.axis('off')
plt.title("(a) Lena original image")
plt.subplot(122)
plt.plot(histb, color='b')
plt.plot(histg, color='g')
plt.plot(histr, color='r')
plt.xlabel("x")
plt.ylabel("y")
plt.title("(b) Histogram curves")
plt.show()

The "Lena" color image and its color histogram curves are shown in Figure 5, where Figure 5(a) is the original image and Figure 5(b) shows the corresponding histogram curves.

3. Drawing histograms with Matplotlib

Matplotlib is a powerful data visualization library for Python, used mainly for drawing 2D graphics. In this section the histogram is drawn by calling the hist() function in matplotlib.pyplot, which builds the histogram from the data source and the number of bins. Its commonly used parameters and return values are as follows:

n, bins, patches = plt.hist(arr, bins=50, density=True, facecolor='green', alpha=0.75)

  • arr is the one-dimensional array whose histogram is computed
  • bins is the number of bars in the histogram; it is optional and defaults to 10
  • density specifies whether to normalize the histogram so that its total area is 1; the default is False (older Matplotlib releases called this parameter normed, which current releases no longer accept)
  • facecolor is the fill color of the bars
  • alpha is the transparency
  • n is a return value: the array of histogram values, one per bin
  • bins is a return value: the edges of the bins, with one more element than n
  • patches is a return value: the container of bar objects that were drawn, one per bin
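The three return values can be inspected directly. The sketch below assumes a recent Matplotlib and uses random integers as stand-in pixel values; the Agg backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")   # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=10_000)   # synthetic "pixel values"

n, bins, patches = plt.hist(arr, bins=50, density=True,
                            facecolor='green', alpha=0.75)

print(len(n), len(bins), len(patches))    # 50 51 50
print(float(np.sum(n * np.diff(bins))))   # total area is 1, up to rounding
```

With density=True the bar heights are densities rather than counts, so it is the sum of height times bin width that equals 1, not the sum of the heights.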

The Python code for the image histogram is shown below. It draws the histogram with the hist() function in matplotlib.pyplot. Note that the pixel data read from "lena-hd.png" is a multi-dimensional array, whereas hist() treats a one-dimensional array as a single data set, so the image is flattened with ravel() first.

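# -*- coding: utf-8 -*-
# By:Eastmount
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read the image as a single-channel grayscale image
src = cv2.imread('lena-hd.png', cv2.IMREAD_GRAYSCALE)
# Draw the histogram of the flattened pixel values with 256 bins
plt.hist(src.ravel(), 256)
plt.xlabel("x")
plt.ylabel("y")
plt.show()
# Show the original image
cv2.imshow("src", src)
cv2.waitKey(0)
cv2.destroyAllWindows()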

The "Lena" grayscale image that is read and displayed is shown in Figure 6.

The resulting grayscale histogram is shown in Figure 7. It plots the 256 gray levels of the Lena image and the frequency of each: the x-axis is the gray level and the y-axis is the frequency of that gray level.
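The flattening step and the placement of the 256 bins can be checked with NumPy alone. The sketch below uses np.histogram as a stand-in for the binning that plt.hist() performs, and the 4×4 array is a made-up example. It shows that the bins span only the range of the data unless a range is given, so passing range=(0, 256) is one way to get exactly one bin per gray level.

```python
import numpy as np

# Synthetic 4x4 "grayscale image" with values between 10 and 200
img = np.array([[10, 10, 50, 50],
                [10, 50, 50, 200],
                [10, 10, 50, 200],
                [10, 50, 200, 200]], dtype=np.uint8)

flat = img.ravel()              # two-dimensional array flattened to one dimension
print(img.shape, flat.shape)    # (4, 4) (16,)

# Without range, the 256 bins are spread over [min, max] = [10, 200]
_, edges_default = np.histogram(flat, bins=256)
print(edges_default[0], edges_default[-1])   # 10.0 200.0

# With range=(0, 256), bin k counts exactly the pixels with gray level k
counts, edges = np.histogram(flat, bins=256, range=(0, 256))
print(counts[10], counts[50], counts[200])   # 6 6 4
```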

Calling the function as follows instead draws a normalized histogram in green with a transparency of 0.75, as shown in Figure 8.

plt.hist(src.ravel(), bins=256, density=1, facecolor='green', alpha=0.75)

A color histogram is a special case of a high-dimensional histogram. It counts the frequency of each RGB component of a color image, that is, the probability distribution of its colors. The histogram of a color image is drawn the same way as a grayscale histogram, except that the histograms of the three channels are drawn separately and then overlaid. The code is as follows. The original color image of Lena is shown in Figure 9.

# -*- coding: utf-8 -*-
# By:Eastmount
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read the image (OpenCV loads it in BGR channel order)
src = cv2.imread('Lena.png')
# Get the pixel values of the B, G, and R channels
b, g, r = cv2.split(src)
# Draw the histograms in one figure
plt.figure("Lena")
# Blue component
plt.hist(b.ravel(), bins=256, density=1, facecolor='b', edgecolor='b', alpha=0.75)
# Green component
plt.hist(g.ravel(), bins=256, density=1, facecolor='g', edgecolor='g', alpha=0.75)
# Red component
plt.hist(r.ravel(), bins=256, density=1, facecolor='r', edgecolor='r', alpha=0.75)
plt.xlabel("x")
plt.ylabel("y")
plt.show()
# Show the original image
cv2.imshow("src", src)
cv2.waitKey(0)
cv2.destroyAllWindows()

The resulting color histogram is shown in Figure 10, with the red, green, and blue components overlaid for comparison.
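The channel order matters when splitting a color image that was read with OpenCV. The following sketch uses a made-up 2×2 image and plain NumPy indexing, which yields the same channel values as cv2.split() for a 3-channel array, to show that index 0 is blue, index 1 is green, and index 2 is red.

```python
import numpy as np

# Synthetic 2x2 color image in OpenCV's B, G, R channel order:
# two pure blue pixels, one pure green pixel, one pure red pixel
img = np.array([[[255, 0, 0], [255, 0, 0]],
                [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# Same channel values as b, g, r = cv2.split(img)
b, g, r = img[:, :, 0], img[:, :, 1], img[:, :, 2]

for name, channel in (("blue", b), ("green", g), ("red", r)):
    counts = np.bincount(channel.ravel(), minlength=256)
    print(name, counts[0], counts[255])
# blue 2 2
# green 3 1
# red 3 1
```

Each channel is a two-dimensional array of its own, so it is flattened and counted exactly like a grayscale image.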

To draw and compare the histograms of the three color components separately, use the following code, which calls plt.figure(figsize=(8, 6)) to create the window and plt.subplot() to draw the four subplots.

# -*- coding: utf-8 -*-
# By:Eastmount
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read the image (OpenCV loads it in BGR channel order)
src = cv2.imread('lena.png')
# Convert to RGB for display with Matplotlib
img_rgb = cv2.cvtColor(src, cv2.COLOR_BGR2RGB)
# Get the pixel values of the B, G, and R channels
b, g, r = cv2.split(src)
print(r,g,b)
plt.figure(figsize=(8, 6))
# Original image
plt.subplot(221)
plt.imshow(img_rgb)
plt.axis('off')
plt.title("(a) Original image")
# Histogram of the blue component
plt.subplot(222)
plt.hist(b.ravel(), bins=256, density=1, facecolor='b', edgecolor='b', alpha=0.75)
plt.xlabel("x")
plt.ylabel("y")
plt.title("(b) Blue component histogram")
# Histogram of the green component
plt.subplot(223)
plt.hist(g.ravel(), bins=256, density=1, facecolor='g', edgecolor='g', alpha=0.75)
plt.xlabel("x")
plt.ylabel("y")
plt.title("(c) Green component histogram")
# Histogram of the red component
plt.subplot(224)
plt.hist(r.ravel(), bins=256, density=1, facecolor='r', edgecolor='r', alpha=0.75)
plt.xlabel("x")
plt.ylabel("y")
plt.title("(d) Red component histogram")
plt.show()

The final output is shown in Figure 11: Figure 11(a) is the original image, Figure 11(b) is the blue component histogram, Figure 11(c) is the green component histogram, and Figure 11(d) is the red component histogram.

4. Summary

This article explained the theory of image histograms and how to draw them, covering two ways to compute and plot them: with Matplotlib and with OpenCV. The grayscale histogram is a function of gray level that gives the number of pixels at each gray level and reflects how frequently each level occurs. This material supports later work on image processing and on comparing image operations.

References:

  • [1] Gonzalez. Digital Image Processing (3rd Edition) [M]. Beijing: Electronic Industry Press, 2013.
  • [2] Zhang Hengbo, Ou Zongying. An Image Retrieval Method Based on Color and Gray Histogram[J]. Computer Engineering, 2004.
  • [3] Eastmount. [Digital Image Processing] 4. MFC dialog box draws grayscale histogram [EB/OL]. (2015-05-31). https://blog.csdn.net/eastmount/article/details/46237463.
  • [4] Ruan Qiuqi. Digital Image Processing (3rd Edition) [M]. Beijing: Electronic Industry Press, 2008.
  • [5] Eastmount. [Python image processing] Eleven. The concept of gray histogram and OpenCV drawing histogram [EB/OL]. (2018-11-06). https://blog.csdn.net/Eastmount/article/details/83758402.

 


Origin my.oschina.net/u/4526289/blog/8797794