The main content of this article comes from the image processing section of the OpenCV-Python tutorial. That section covers the following topics:
- Learn to change images between different color spaces. Also learn to track colored objects in videos.
- Geometric transformation of images: learn to apply different geometric transformations to images, such as rotation and translation.
- Learn to convert images into binary images using global thresholding, adaptive thresholding, Otsu's binarization, and more.
- Learn to blur images, filter images with custom kernels, and more.
- Understand morphological transformations such as erosion, dilation, opening, and closing.
- Learn to find image gradients, edges, and more.
- Learn to find edges with Canny edge detection.
- Learn about image pyramids and how to use them for image blending.
- All about contours in OpenCV.
- All about histograms in OpenCV.
- Image transformation in OpenCV: meet the different image transforms available in OpenCV, such as the Fourier transform and the cosine transform.
- Learn to use template matching to search for objects in images.
- Learn to detect lines in an image.
- Learn to detect circles in an image.
- Image segmentation using the watershed algorithm: learn to segment images using the watershed segmentation algorithm.
- Interactive foreground extraction using the GrabCut algorithm: learn to use the GrabCut algorithm to extract the foreground.
Goals
- In this tutorial, we will learn about simple thresholding, adaptive thresholding, and Otsu's thresholding.
- We will learn about the cv.threshold and cv.adaptiveThreshold functions.
Simple thresholding
Here, things are straightforward. The same threshold is applied to every pixel. If the pixel value is less than or equal to the threshold, it is set to 0; otherwise it is set to the maximum value. The function cv.threshold applies the threshold. The first parameter is the source image, which should be a grayscale image. The second parameter is the threshold used to classify pixel values. The third parameter is the maximum value assigned to pixels that exceed the threshold. OpenCV provides several threshold types, selected by the fourth parameter of the function. The basic thresholding described above uses the type cv.THRESH_BINARY. The simple threshold types are as follows:
- cv.THRESH_BINARY
- cv.THRESH_BINARY_INV
- cv.THRESH_TRUNC
- cv.THRESH_TOZERO
- cv.THRESH_TOZERO_INV
Consult the documentation of these types to understand the differences between them.
The function returns two outputs. The first is the threshold that was used, and the second is the thresholded image.
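To make the difference between the five types concrete, here is a minimal NumPy sketch of the per-pixel rules. It is an illustration, not OpenCV's implementation: the helper name simple_threshold and its string-valued kind argument are hypothetical and only mirror the names of the cv.THRESH_* flags.

```python
import numpy as np


def simple_threshold(src, thresh, maxval, kind):
    """Apply one of the five simple threshold rules to a NumPy array.

    Hypothetical helper for illustration; `kind` mirrors the cv.THRESH_* names.
    """
    src = np.asarray(src)
    above = src > thresh  # strictly greater than the threshold
    if kind == "BINARY":
        out = np.where(above, maxval, 0)
    elif kind == "BINARY_INV":
        out = np.where(above, 0, maxval)
    elif kind == "TRUNC":
        out = np.where(above, thresh, src)  # clip values above the threshold
    elif kind == "TOZERO":
        out = np.where(above, src, 0)  # keep only values above the threshold
    elif kind == "TOZERO_INV":
        out = np.where(above, 0, src)  # keep only values at or below it
    else:
        raise ValueError("unknown threshold type: {}".format(kind))
    return out.astype(src.dtype)


# A tiny one-row "image" that straddles the threshold of 127.
row = np.array([0, 100, 127, 128, 255], dtype=np.uint8)
for kind in ["BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"]:
    print(kind, simple_threshold(row, 127, 255, kind).tolist())
```

Note that the pixel with value 127 is treated as not exceeding the threshold, because the comparison is strict.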
The following code compares different simple threshold types:
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt


def simple_thresholding():
    cv.samples.addSamplesDataSearchPath("/media/data/my_multimedia/opencv-4.x/samples/data")
    img = cv.imread(cv.samples.findFile('gradient.png'), 0)
    # img = cv.imread('gradient.png', 0)
    ret, thresh1 = cv.threshold(img, 127, 255, cv.THRESH_BINARY)
    ret, thresh2 = cv.threshold(img, 127, 255, cv.THRESH_BINARY_INV)
    ret, thresh3 = cv.threshold(img, 127, 255, cv.THRESH_TRUNC)
    ret, thresh4 = cv.threshold(img, 127, 255, cv.THRESH_TOZERO)
    ret, thresh5 = cv.threshold(img, 127, 255, cv.THRESH_TOZERO_INV)
    titles = ['Original Image', 'BINARY', 'BINARY_INV', 'TRUNC', 'TOZERO', 'TOZERO_INV']
    images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]
    for i in range(6):
        plt.subplot(2, 3, i + 1), plt.imshow(images[i], 'gray', vmin=0, vmax=255)
        plt.title(titles[i])
        plt.xticks([]), plt.yticks([])
    plt.show()


if __name__ == "__main__":
    simple_thresholding()
Note: To plot multiple images, we use the plt.subplot() function. Please refer to matplotlib's documentation for more details.
The above code generates the following results:
Adaptive thresholding
In the previous section, we used a global value as the threshold. But this may not work in all cases, for example if the image has different lighting conditions in different areas. In this case, adaptive thresholds can help. Here, the algorithm determines the threshold for a pixel based on the small patch around it. Therefore, we obtain different thresholds for different areas of the same image, which gives better results for images with different lighting.
In addition to the parameters described above, the cv.adaptiveThreshold method accepts three input parameters:
adaptiveMethod determines how the threshold is calculated:
- cv.ADAPTIVE_THRESH_MEAN_C : The threshold is the mean of the neighborhood area minus the constant C.
- cv.ADAPTIVE_THRESH_GAUSSIAN_C : The threshold is the Gaussian-weighted sum of neighborhood values minus a constant C.
blockSize determines the size of the neighborhood (it must be an odd number greater than 1, such as 3, 5, or 7), and C is a constant subtracted from the mean or weighted sum of the neighborhood pixels.
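The mean variant can be sketched in a few lines of NumPy, which may help to see why a per-pixel threshold copes with uneven lighting. This is a simplified illustration, not OpenCV's implementation: the function name adaptive_mean_threshold is hypothetical, the border handling (edge replication via np.pad) is an assumption of this sketch, and the synthetic test image is made up for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def adaptive_mean_threshold(img, max_value=255, block_size=5, c=10):
    """Threshold each pixel against (mean of its block_size x block_size patch) - c."""
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number greater than 1")
    pad = block_size // 2
    # Assumption of this sketch: replicate edge pixels so every pixel has a full patch.
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    windows = sliding_window_view(padded, (block_size, block_size))
    local_thresh = windows.mean(axis=(2, 3)) - c
    return np.where(img > local_thresh, max_value, 0).astype(np.uint8)


# Synthetic image: brightness rises from left (60) to right (180),
# with one dark "ink" pixel in the dim half and one in the bright half.
img = np.tile(np.arange(60, 181, 8, dtype=np.uint8), (8, 1))
img[4, 3] -= 40
img[4, 12] -= 40

global_result = np.where(img > 127, 255, 0).astype(np.uint8)
adaptive_result = adaptive_mean_threshold(img, block_size=5, c=10)

print("dark pixels found by adaptive thresholding:",
      np.argwhere(adaptive_result == 0).tolist())
print("global result around the left ink pixel:", global_result[4, 2:5].tolist())
```

With the global threshold, the whole dim half of the image falls below 127, so the left ink pixel cannot be told apart from its background. The local threshold follows the brightness gradient, so both ink pixels stand out.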
The code below compares global and adaptive thresholding for images with different lighting:
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt


def adaptive_thresholding():
    cv.samples.addSamplesDataSearchPath("/media/data/my_multimedia/opencv-4.x/samples/data")
    img = cv.imread(cv.samples.findFile('sudoku.png'), 0)
    img = cv.medianBlur(img, 5)
    ret, th1 = cv.threshold(img, 127, 255, cv.THRESH_BINARY)
    th2 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_MEAN_C, cv.THRESH_BINARY, 11, 2)
    th3 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C, cv.THRESH_BINARY, 11, 2)
    titles = ['Original Image', 'Global Thresholding (v = 127)',
              'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
    images = [img, th1, th2, th3]
    for i in range(4):
        plt.subplot(2, 2, i + 1), plt.imshow(images[i], 'gray')
        plt.title(titles[i])
        plt.xticks([]), plt.yticks([])
    plt.show()


if __name__ == "__main__":
    adaptive_thresholding()
The final result is as follows:
Judging from this result, cv.ADAPTIVE_THRESH_GAUSSIAN_C also appears to suppress some of the image noise.
Otsu's binarization
In global thresholding, we use an arbitrarily chosen value as the threshold. In contrast, Otsu's method avoids having to choose a value and determines it automatically.
Consider an image with only two distinct image values ( bimodal image ), where the histogram contains only two peaks. A good threshold will be in the middle of these two values. Similarly, Otsu's method determines an optimal global threshold from the image histogram.
To do this, use the cv.threshold() function and pass cv.THRESH_OTSU as an extra flag. The threshold value passed in can be arbitrary, because it is ignored when this flag is set. The algorithm then finds the optimal threshold, which is returned as the first output.
Take a look at the example below. The input image is a noisy image. In the first case, a global threshold of 127 is applied. In the second case, Otsu's threshold is applied directly. In the third case, the image is first filtered using a 5x5 Gaussian kernel to remove noise, and then an Otsu threshold is applied. See how noise filtering improves results.
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt


def otsu_thresholding():
    cv.samples.addSamplesDataSearchPath("/media/data/my_multimedia/opencv-4.x/samples/data")
    img = cv.imread(cv.samples.findFile('noisy2.png'), 0)
    # img = cv.imread('noisy2.png', 0)
    # global thresholding
    ret1, th1 = cv.threshold(img, 127, 255, cv.THRESH_BINARY)
    # Otsu's thresholding
    ret2, th2 = cv.threshold(img, 0, 255, cv.THRESH_BINARY + cv.THRESH_OTSU)
    # Otsu's thresholding after Gaussian filtering
    blur = cv.GaussianBlur(img, (5, 5), 0)
    ret3, th3 = cv.threshold(blur, 0, 255, cv.THRESH_BINARY + cv.THRESH_OTSU)
    # plot all the images and their histograms
    images = [img, 0, th1,
              img, 0, th2,
              blur, 0, th3]
    titles = ['Original Noisy Image', 'Histogram', 'Global Thresholding (v=127)',
              'Original Noisy Image', 'Histogram', "Otsu's Thresholding",
              'Gaussian filtered Image', 'Histogram', "Otsu's Thresholding"]
    for i in range(3):
        plt.subplot(3, 3, i * 3 + 1), plt.imshow(images[i * 3], 'gray')
        plt.title(titles[i * 3]), plt.xticks([]), plt.yticks([])
        plt.subplot(3, 3, i * 3 + 2), plt.hist(images[i * 3].ravel(), 256)
        plt.title(titles[i * 3 + 1]), plt.xticks([]), plt.yticks([])
        plt.subplot(3, 3, i * 3 + 3), plt.imshow(images[i * 3 + 2], 'gray')
        plt.title(titles[i * 3 + 2]), plt.xticks([]), plt.yticks([])
    plt.show()


if __name__ == "__main__":
    otsu_thresholding()
The noisy2.png image used in the code is as follows:
The result is as follows:
How does Otsu's binarization work?
This section demonstrates a Python implementation of Otsu's binarization to show how it actually works. If you are not interested, you can skip it.
Since we are using bimodal images, Otsu's algorithm attempts to find a threshold (t) that minimizes the weighted intra-class variance given by the relationship:
$$\sigma_w^2(t) = q_1(t)\sigma_1^2(t) + q_2(t)\sigma_2^2(t)$$
where:
$$q_1(t) = \sum_{i=1}^{t} P(i) \quad \& \quad q_2(t) = \sum_{i=t+1}^{I} P(i)$$
$$\mu_1(t) = \sum_{i=1}^{t} \frac{iP(i)}{q_1(t)} \quad \& \quad \mu_2(t) = \sum_{i=t+1}^{I} \frac{iP(i)}{q_2(t)}$$
$$\sigma_1^2(t) = \sum_{i=1}^{t} [i-\mu_1(t)]^2 \frac{P(i)}{q_1(t)} \quad \& \quad \sigma_2^2(t) = \sum_{i=t+1}^{I} [i-\mu_2(t)]^2 \frac{P(i)}{q_2(t)}$$
It actually finds a t value that lies between two peaks such that the variance of both classes is minimal. It can be implemented simply in Python as follows:
import cv2 as cv
import numpy as np


def otsu_alg():
    img = cv.imread('noisy2.png', 0)
    blur = cv.GaussianBlur(img, (5, 5), 0)
    # find the normalized histogram and its cumulative distribution function
    hist = cv.calcHist([blur], [0], None, [256], [0, 256])
    hist_norm = hist.ravel() / hist.sum()
    Q = hist_norm.cumsum()
    bins = np.arange(256)
    fn_min = np.inf
    thresh = -1
    for i in range(1, 256):
        # class 1 holds bins [0, i), class 2 holds bins [i, 256)
        p1, p2 = np.hsplit(hist_norm, [i])  # probabilities
        # class weights; Q[i - 1] (not Q[i]) matches the split used for p1
        q1, q2 = Q[i - 1], Q[255] - Q[i - 1]
        if q1 < 1.e-6 or q2 < 1.e-6:
            continue
        b1, b2 = np.hsplit(bins, [i])  # gray levels of each class
        # find the class means and variances
        m1, m2 = np.sum(p1 * b1) / q1, np.sum(p2 * b2) / q2
        v1, v2 = np.sum(((b1 - m1) ** 2) * p1) / q1, np.sum(((b2 - m2) ** 2) * p2) / q2
        # weighted within-class variance, the function to minimize
        fn = v1 * q1 + v2 * q2
        if fn < fn_min:
            fn_min = fn
            thresh = i
    # find Otsu's threshold value with the OpenCV function
    ret, otsu = cv.threshold(blur, 0, 255, cv.THRESH_BINARY + cv.THRESH_OTSU)
    # thresh is the first gray level of class 2, so it is expected to be
    # about one gray level above the value OpenCV returns
    print("{} {}".format(thresh, ret))


if __name__ == "__main__":
    otsu_alg()
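The loop above can also be written without an explicit Python loop. The sketch below relies on a standard identity: the total variance equals the within-class variance plus the between-class variance $q_1(t)q_2(t)[\mu_1(t)-\mu_2(t)]^2$, so minimizing the former is equivalent to maximizing the latter. The function name otsu_threshold and the synthetic histograms are assumptions made for illustration, and the function takes a histogram rather than an image so that it does not depend on any image file.

```python
import numpy as np


def otsu_threshold(hist):
    """Return t such that class 1 is bins [0, t] and class 2 is bins [t + 1, end]."""
    p = np.asarray(hist, dtype=np.float64).ravel()
    p = p / p.sum()                     # normalized histogram P(i)
    levels = np.arange(p.size)
    q1 = np.cumsum(p)                   # class 1 weight for every candidate t
    q2 = 1.0 - q1                       # class 2 weight
    cum_mean = np.cumsum(p * levels)    # sum of i * P(i) up to and including t
    total_mean = cum_mean[-1]
    # Between-class variance: (total_mean * q1 - cum_mean)^2 / (q1 * q2).
    valid = (q1 > 1e-12) & (q2 > 1e-12)  # skip splits with an empty class
    sigma_b = np.zeros_like(p)
    numerator = (total_mean * q1 - cum_mean) ** 2
    sigma_b[valid] = numerator[valid] / (q1[valid] * q2[valid])
    return int(np.argmax(sigma_b))


# Synthetic bimodal histogram: two equal bumps centered at gray levels 60 and 190.
x = np.arange(256)
bimodal_hist = np.exp(-(x - 60) ** 2 / 200.0) + np.exp(-(x - 190) ** 2 / 200.0)
t_bimodal = otsu_threshold(bimodal_hist)
print("threshold for the bimodal histogram:", t_bimodal)

# Degenerate case: only two gray levels are present.
two_level_hist = np.zeros(256)
two_level_hist[50] = 10
two_level_hist[200] = 30
t_two_level = otsu_threshold(two_level_hist)
print("threshold for the two-level histogram:", t_two_level)
```

For the symmetric bimodal histogram, the threshold lands near the midpoint between the two peaks, which matches the intuition given earlier. For the two-level histogram, any value between the two gray levels separates them equally well, and np.argmax simply returns the first such value.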
Other resources
- Digital Image Processing, Rafael C. Gonzalez
Exercises
- There are several optimizations available for Otsu's binarization. You can look them up and implement them.