Traditional CV algorithm - edge detection


Chapter 1 Overview

1. Overview of edge detection

1.1 Understanding edges

Definition of an edge

  • An edge is the boundary between different regions: the set of pixels where the surrounding (local) gray values change significantly. An edge has two attributes, magnitude and direction. This is not an absolute definition; the essential points are that an edge is a local feature and that it arises where nearby gray values change significantly.

Relationship between contours and edges

  • A contour represents a global feature, while an edge represents a local feature.

  • A contour is generally regarded as a description of an object's complete boundary; edge points connected one by one form a contour. An edge can be just a segment, whereas a contour is usually complete. By the characteristics of human vision, when we look at an object we generally acquire its contour first and only then the details inside it.

  • Types of edges: step, roof, ramp, and impulse

    • They differ in how quickly the gray value changes.
    • Edge detection is mostly concerned with step and roof edges.
    • (a) and (b) can be regarded as step or ramp edges, (c) is an impulse edge, and (d) is a roof edge. A step rises or falls to some value and stays there, whereas a roof rises first and then falls.

(figure: the four edge profiles (a)-(d))

1.2 The concept of edge detection

  • Edge detection is one of the most important image-analysis methods in image processing and computer vision.
  • Its goal is to find the set of pixels where the image brightness changes sharply; in practice this set usually appears as contours.
  • If the edges in an image can be measured and located accurately, then the actual objects can be located and measured too, including their area, diameter, and shape. When images of the real world are captured, the following 4 situations show up in the image as an edge:
    1. Depth discontinuity (objects lie in different object planes);
    2. Surface-orientation discontinuity (e.g., two different faces of a cube);
    3. Different object materials (which lead to different light-reflection coefficients);
    4. Different illumination in the scene (e.g., ground shaded by a tree).

(figure: gray values of 7 adjacent pixels along one image row)

Example: the figure above shows the gray values of 7 pixels along the horizontal direction of an image. It is easy to judge that there is an edge between the 4th and 5th pixels, because a sharp gray-level jump occurs between them. In real edge detection, edges are far less simple and obvious than in this figure, and we need an appropriate threshold to tell them apart.

1.3 Basic steps of edge detection

  1. Image filtering
    • Traditional edge detection algorithms are mainly based on the first- and second-order derivatives of image intensity, but derivative computation is sensitive to noise, so filters must be used to improve the noise-related performance of edge detectors.
    • Most filters reduce noise at the cost of edge strength, so a compromise between enhancing edges and reducing noise is needed.
  2. Image enhancement
    • The basis of edge enhancement is determining the intensity change in the neighborhood of each image point.
    • Enhancement algorithms highlight points whose neighborhood (local) intensity changes significantly.
    • Edge enhancement is usually done by computing the gradient magnitude.
  3. Image detection
    • Many points in an image have a large gradient magnitude, yet in a given application not all of them are edges, so some detection method is needed to determine which points are edge points.
    • The simplest criterion for edge detection is the gradient magnitude.
  4. Edge localization
    • If an application requires the edge position, it can be estimated at sub-pixel resolution, and the edge orientation can be estimated as well.

1.4 The concept of edge detection operators

(figure: a step edge with its first and second derivatives)

  • In mathematics, the rate of change of a function is described by its derivative.

  • An image can be viewed as a two-dimensional function, so changes in its pixel values can also be described by derivatives; since an image is discrete, we use pixel differences instead.

  • Edge detection operators come in first-order and second-order differential types. For a step edge, the first derivative has a maximum there, and that maximum corresponds to a zero crossing of the second derivative. In other words, the exact edge position corresponds to the maximum of the first derivative, or equivalently to the zero crossing of the second derivative (note: not merely where the second derivative equals 0, but where it crosses zero between positive and negative values).

1.5 Common edge detection operators

(figure: overview of common edge detection operators)

  • Common first-order differential edge operators include Roberts, Prewitt, Sobel, Kirsch, and Nevitia.

  • Common second-order differential edge operators include the Laplace, LoG, DoG, and Canny operators. Among them, the Canny operator is the most widely used and is currently considered the most effective edge detection operator.

  • In addition, there is the SUSAN edge detection method, which does not use the gradient (derivative) of image pixels. This article will also outline some emerging edge detection methods, such as wavelet analysis, fuzzy algorithms, and artificial neural networks.

2. The principle of edge detection with gradient operators

2.1 Understanding gradient operators

  • Edge points: maxima of the first-derivative magnitude, and zero crossings of the second derivative.

  • Definition of the gradient: the steepness of a surface along a given direction. For a single-variable function the gradient is just the derivative; for a linear function it is the slope of the line, a vector with a direction.

  • Gradient operators: the gradient is a first-order differential operator, corresponding to the first derivative. If the image contains little noise and the gray-level transitions at the edges are obvious, gradient operators give good edge detection results. Operators such as Roberts and Sobel, introduced in the next chapter, are all gradient operators.

2.2 Gradient measurement

For a continuous function $f(x,y)$, we compute its gradient at $(x,y)$ and represent it by a vector with two components, one along the $x$ direction and one along the $y$ direction:

$$G(x,y)=\begin{bmatrix} G_x \\ G_y \end{bmatrix}=\begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix}$$

To measure the magnitude of the gradient, the following three norms can be used:

$$\begin{gathered} |G(x,y)|=\sqrt{G_x^2+G_y^2}, \quad \text{2-norm gradient} \\ |G(x,y)|=\left|G_x\right|+\left|G_y\right|, \quad \text{1-norm gradient} \\ |G(x,y)| \approx \max\left(\left|G_x\right|,\left|G_y\right|\right), \quad \infty\text{-norm gradient} \end{gathered}$$

Note: the 2-norm gradient requires a squaring and a square-root operation for every pixel, which is computationally expensive. In practice, the absolute-value (1-norm) or maximum ($\infty$-norm) form is usually taken as an approximation to simplify the computation; compared with squaring and square roots, the loss in detection accuracy and edge precision is very small.
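In OpenCV these three norms are one-liners. A minimal sketch, assuming the gradient components have already been computed as CV_32F images (the function name is illustrative):

#include <opencv2/opencv.hpp>

// Three gradient-magnitude norms for precomputed CV_32F gradients gx, gy.
void gradientNorms(const cv::Mat& gx, const cv::Mat& gy,
                   cv::Mat& norm2, cv::Mat& norm1, cv::Mat& normInf)
{
    cv::magnitude(gx, gy, norm2);                 // 2-norm: sqrt(Gx^2 + Gy^2)
    norm1 = cv::abs(gx) + cv::abs(gy);            // 1-norm: |Gx| + |Gy|
    normInf = cv::max(cv::abs(gx), cv::abs(gy));  // inf-norm: max(|Gx|, |Gy|)
}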

2.3 Edge detection with gradient operators

Principle: most gradient-operator edge detection is based on convolving the image with directional-derivative masks.

Implementation

Take a 3×3 convolution template as an example.

(figure: a 3×3 convolution template with coefficients $W_1,\dots,W_9$)

After the convolution template is chosen, it is moved across the image and convolved with every pixel, giving each pixel a response $R$. $R$ characterizes the rate of change of the gray values in that pixel's neighborhood, i.e., the gray-level gradient, so convolving the grayscale image with the template turns it into a gradient image. The template coefficients $W_i\ (i=1,2,3,\dots,9)$ must sum to zero, to ensure that the template's response is zero in regions of constant gray level. $Z_i$ denotes the pixel gray values:

$$R=W_1 Z_1+W_2 Z_2+\cdots+W_9 Z_9$$

A threshold is then set: if the convolution result $R$ is greater than the threshold, the pixel is an edge point and is output as white; if $R$ is smaller, the pixel is not an edge point and is output as black. In the end we obtain a black-and-white gradient image, which realizes the edge detection.
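The whole template-then-threshold pipeline fits in a few lines of OpenCV. A minimal sketch, assuming a vertical-edge template whose coefficients sum to zero; the file names and the threshold value are illustrative, not values from this article:

#include <opencv2/opencv.hpp>

// Convolve a grayscale image with a 3x3 gradient template, then threshold
// the response R into a black-and-white edge map.
int main()
{
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    // 3x3 template with zero-sum coefficients (responds to vertical edges).
    float w[9] = { -1, 0, 1,
                   -1, 0, 1,
                   -1, 0, 1 };
    cv::Mat mask(3, 3, CV_32F, w);

    cv::Mat response;
    cv::filter2D(gray, response, CV_32F, mask);  // R = W1*Z1 + ... + W9*Z9
    response = cv::abs(response);

    // Pixels whose response exceeds the threshold become white, others black.
    cv::Mat edges;
    cv::threshold(response, edges, 80.0, 255.0, cv::THRESH_BINARY);
    edges.convertTo(edges, CV_8U);
    cv::imwrite("edges.png", edges);
    return 0;
}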

Chapter 2 First-order differential edge operators

1. Basic idea of first-order differential edge operators

First-order differential edge operators, also called gradient edge operators, exploit the step-like behavior of the image at an edge, i.e., the fact that the image gradient attains a maximum there. The gradient is a vector with direction $\theta$ and magnitude $|\Delta I|$:

$$\begin{gathered} \Delta I=\begin{pmatrix} \frac{\partial I}{\partial x} \\ \frac{\partial I}{\partial y} \end{pmatrix} \\ |\Delta I|=\sqrt{\left(\frac{\partial I}{\partial x}\right)^2+\left(\frac{\partial I}{\partial y}\right)^2}=\sqrt{I_x^2+I_y^2} \\ \theta=\arctan\left(I_y / I_x\right) \end{gathered}$$

  • The gradient direction provides information about the edge trend; the gradient direction is always perpendicular to the edge direction. The gradient magnitude provides the edge strength.

  • In practical applications, finite differences are usually used to approximate the gradient. For the formula above, we have the approximations:

$$\begin{aligned} & \frac{\partial I}{\partial x}=\lim_{\Delta x \to 0} \frac{I(x+\Delta x, y)-I(x, y)}{\Delta x} \approx I(x+1, y)-I(x, y), \quad (\Delta x=1) \\ & \frac{\partial I}{\partial y}=\lim_{\Delta y \to 0} \frac{I(x, y+\Delta y)-I(x, y)}{\Delta y} \approx I(x, y+1)-I(x, y), \quad (\Delta y=1) \end{aligned}$$

2. Roberts operator

2.1 Idea of the Roberts operator

  • The 2×2 template uses the difference between two adjacent pixels along the diagonal directions.
  • Judging from actual image-processing results, its edge localization is accurate, but it is sensitive to noise.

The operator proposed by Roberts finds edges with a local difference operator. The sharpness of an edge is determined by the gradient of the image gray level; the gradient is a vector, and $\nabla f$ indicates the direction and rate of the fastest gray-level change.

The simplest edge detection operator therefore approximates the gradient with the vertical and horizontal differences of the image:

$$\nabla f=(f(x, y)-f(x-1, y),\; f(x, y)-f(x, y-1))$$

Compute this vector for every pixel, take its absolute value, and compare it with a threshold. Following this idea, we obtain the Roberts cross operator:

$$g(i, j)=|f(i, j)-f(i+1, j+1)|+|f(i, j+1)-f(i+1, j)|$$

2.2 Steps of the Roberts operator

  • Use the 1-norm gradient to compute the gradient magnitude:

$$|G(x, y)|=\left|G_x\right|+\left|G_y\right|, \quad \text{1-norm gradient}$$

  • The convolution templates are:

(figure: the Roberts 2×2 convolution templates for $G_x$ and $G_y$)

$$\begin{aligned} G_x &= 1 \cdot f(x, y)+0 \cdot f(x+1, y)+0 \cdot f(x, y+1)+(-1) \cdot f(x+1, y+1) \\ &= f(x, y)-f(x+1, y+1) \\ G_y &= 0 \cdot f(x, y)+1 \cdot f(x+1, y)+(-1) \cdot f(x, y+1)+0 \cdot f(x+1, y+1) \\ &= f(x+1, y)-f(x, y+1) \\ G(x, y) &= \left|G_x\right|+\left|G_y\right| = |f(x, y)-f(x+1, y+1)|+|f(x+1, y)-f(x, y+1)| \end{aligned}$$

If $G(x,y)$ is greater than some threshold, we consider $(x,y)$ an edge point.
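A minimal OpenCV sketch of these steps; the 2×2 masks follow the formulas above, while the file names and the threshold are illustrative assumptions:

#include <opencv2/opencv.hpp>

// Roberts cross operator: diagonal differences on a 2x2 neighborhood,
// combined with the 1-norm |Gx| + |Gy| and a fixed threshold.
int main()
{
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    float dx[4] = { 1,  0,
                    0, -1 };   // Gx = f(x,y) - f(x+1,y+1)
    float dy[4] = { 0,  1,
                   -1,  0 };   // Gy = f(x+1,y) - f(x,y+1)
    cv::Mat maskX(2, 2, CV_32F, dx), maskY(2, 2, CV_32F, dy);

    cv::Mat gx, gy;
    cv::filter2D(gray, gx, CV_32F, maskX);
    cv::filter2D(gray, gy, CV_32F, maskY);

    cv::Mat mag = cv::abs(gx) + cv::abs(gy);   // 1-norm gradient magnitude

    cv::Mat edges;
    cv::threshold(mag, edges, 50.0, 255.0, cv::THRESH_BINARY);  // illustrative threshold
    edges.convertTo(edges, CV_8U);
    cv::imwrite("roberts.png", edges);
    return 0;
}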

2.3 Derivation of the Roberts operator

$$\begin{aligned} \frac{\partial f}{\partial x} &\approx f(x+1, y)-f(x, y) \approx f(x+1, y+1)-f(x, y+1) \\ \frac{\partial f}{\partial y} &\approx f(x, y+1)-f(x, y) \approx f(x+1, y+1)-f(x+1, y) \\ \|\nabla f(x, y)\| &= \sqrt{\left(\frac{\partial f}{\partial x}\right)^2+\left(\frac{\partial f}{\partial y}\right)^2} \approx \left|\frac{\partial f}{\partial x}\right|+\left|\frac{\partial f}{\partial y}\right| \\ &\approx \frac{\partial f}{\partial x}+\frac{\partial f}{\partial y} \approx f(x+1, y)-f(x, y)+f(x+1, y+1)-f(x+1, y)=f(x+1, y+1)-f(x, y) \\ &\approx \frac{\partial f}{\partial x}-\frac{\partial f}{\partial y} \approx f(x+1, y)-f(x, y)-(f(x, y+1)-f(x, y))=f(x+1, y)-f(x, y+1) \\ &\approx \frac{\partial f}{\partial y}-\frac{\partial f}{\partial x} \approx f(x, y+1)-f(x, y)-(f(x+1, y)-f(x, y))=f(x, y+1)-f(x+1, y) \\ &\approx -\frac{\partial f}{\partial x}-\frac{\partial f}{\partial y} \approx -(f(x+1, y)-f(x, y))-(f(x+1, y+1)-f(x+1, y))=f(x, y)-f(x+1, y+1) \end{aligned}$$

2.4 Advantages and disadvantages of the Roberts operator

  • It finds edges with a local difference operator and localizes them accurately, but it easily loses part of the edges.
  • Since the image is not smoothed first, it cannot suppress noise.
  • It works well on images with steep edges and little noise.
  • Because the template has no center point (it is a 2×2 template), it is rarely used in practice.

3. Prewitt operator

3.1 Idea of the Prewitt operator

  • Convolution templates of size 3×3 are used. (2×2 templates are conceptually simple, but templates that are symmetric about a center point carry more information about the edge direction, and their minimum size is 3×3; a 3×3 template also properly accounts for the data around the center point.)
  • In terms of operator design, Prewitt approximates the two partial derivatives $G_x$ and $G_y$ with one horizontal and one vertical directed operator, so that the convolution result also reaches its maximum where the gray value changes strongly.

(figure: the Prewitt 3×3 templates for $G_x$ and $G_y$)

3.2 Steps of the Prewitt operator

The Prewitt edge detection operator uses two directed operators (horizontal and vertical), each approximating one partial derivative, similar to estimating a partial differential. The approximate detection operators in the $x$ and $y$ directions are:

$$\begin{aligned} p_x= & \{f(i+1, j-1)+f(i+1, j)+f(i+1, j+1)\} \\ & -\{f(i-1, j-1)+f(i-1, j)+f(i-1, j+1)\} \\ p_y= & \{f(i-1, j+1)+f(i, j+1)+f(i+1, j+1)\} \\ & -\{f(i-1, j-1)+f(i, j-1)+f(i+1, j-1)\} \end{aligned}$$

Written as templates:

$$G_x=\begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix} \qquad G_y=\begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}$$

Let $M$ be the image and $T$ the gradient magnitude, and compare

$$T=\left(M \otimes G_x\right)^2+\left(M \otimes G_y\right)^2 > \text{threshold}$$

If the final $T$ is greater than the threshold, the point is an edge point.
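A minimal sketch of these steps; the squared-magnitude comparison follows the formula above, while the file names and the threshold are illustrative:

#include <opencv2/opencv.hpp>

// Prewitt operator: 3x3 horizontal and vertical masks, with the squared
// gradient magnitude T tested against a threshold.
int main()
{
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    float wx[9] = { -1, 0, 1,
                    -1, 0, 1,
                    -1, 0, 1 };
    float wy[9] = { -1, -1, -1,
                     0,  0,  0,
                     1,  1,  1 };
    cv::Mat maskX(3, 3, CV_32F, wx), maskY(3, 3, CV_32F, wy);

    cv::Mat gx, gy;
    cv::filter2D(gray, gx, CV_32F, maskX);
    cv::filter2D(gray, gy, CV_32F, maskY);

    cv::Mat t = gx.mul(gx) + gy.mul(gy);       // squared gradient magnitude

    cv::Mat edges;
    double thresholdSq = 100.0 * 100.0;        // illustrative threshold, squared
    cv::threshold(t, edges, thresholdSq, 255.0, cv::THRESH_BINARY);
    edges.convertTo(edges, CV_8U);
    cv::imwrite("prewitt.png", edges);
    return 0;
}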

3.3 Advantages and disadvantages of the Prewitt operator

  • The Prewitt operator works well on images with gradual gray-level changes and considerable noise.
  • However, false edges in the detection results cannot be completely ruled out.

4. Sobel operator

4.1 Idea of the Sobel operator

  • Neighboring pixels do not influence the current pixel equally: pixels at different distances carry different weights and affect the operator's result differently. In general, the farther the distance, the smaller the influence.
  • Increasing the central coefficients of the Prewitt templates to a weight of 2 both highlights the central pixel and produces a smoothing effect; the result is the Sobel operator.

4.2 Steps of the Sobel operator

The Sobel operator combines directional difference operations with local averaging. It computes the partial derivatives in the $x$ and $y$ directions over the $3 \times 3$ neighborhood centered at $f(x,y)$:

$$\begin{aligned} p_x= & \{f(i+1, j-1)+2 f(i+1, j)+f(i+1, j+1)\} \\ & -\{f(i-1, j-1)+2 f(i-1, j)+f(i-1, j+1)\} \\ p_y= & \{f(i-1, j+1)+2 f(i, j+1)+f(i+1, j+1)\} \\ & -\{f(i-1, j-1)+2 f(i, j-1)+f(i+1, j-1)\} \end{aligned}$$

Written as templates:

$$G_x=\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \qquad G_y=\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$$

Let $M$ be the image and $T$ the gradient magnitude, and compare

$$T=\left(M \otimes G_x\right)^2+\left(M \otimes G_y\right)^2 > \text{threshold}$$

If the final $T$ is greater than the threshold, the point is an edge point.

4.3 Advantages and disadvantages of the Sobel operator

  • Advantages
    • First, high-frequency pixels are few and low-frequency pixels are many, so the average gray level drops; since noise is generally a high-frequency signal, the operator has some ability to suppress noise.
    • Second, the detected edges are relatively wide, with a width of at least two pixels.
  • Disadvantages
    • False edges in the detection results cannot be completely ruled out.

4.4 A variant of Sobel: the Isotropic Sobel operator

The Sobel operator has a variant, the Isotropic Sobel operator, whose templates are:

(figure: the Isotropic Sobel templates)

  • The weights of the Isotropic Sobel operator are more accurate than those of the ordinary Sobel operator.

  • The farther a template position is from the center, the smaller its influence should be (judging by absolute value). As in the figure above, view the template as 9 small unit squares: the hypotenuse of the dashed triangle has length √2 and the lower leg has length 1, so if the weight at position (0,0) has absolute value 1, then by this distance relationship the weight at position (1,0) should have absolute value √2 to be accurate.

4.5 Implementation of Sobel

#include "imgFeat.h"
double feat::getSobelEdge(const Mat& imgSrc, Mat& imgDst, double thresh, int direction)
{
	cv::Mat gray;
	// 灰度图转换
    if (imgSrc.channels() == 3)
    {
        cv::cvtColor(imgSrc, gray, cv::COLOR_BGR2GRAY);
    }
    else
    {
        gray = imgSrc.clone();
    }
	int kx=0;
	int ky=0;
	if (direction == SOBEL_HORZ){
		kx = 0; ky = 1;
	}
	else if (direction == SOBEL_VERT){
		kx = 1; ky = 0;
	}
	else{
		kx = 1; ky = 1;
	}
	// mask  卷积模板
	float mask[3][3] = { { 1, 2, 1 }, { 0, 0, 0 }, { -1, -2, -1 } };
	cv::Mat y_mask = Mat(3, 3, CV_32F, mask) / 8;
	cv::Mat x_mask = y_mask.t(); //

	//
	cv::Mat sobelX, sobelY;
	filter2D(gray, sobelX, CV_32F, x_mask);
	filter2D(gray, sobelY, CV_32F, y_mask);
	sobelX = abs(sobelX);
	sobelY = abs(sobelY);
	// 计算梯度
	cv::Mat gradient = kx*sobelX.mul(sobelX) + ky*sobelY.mul(sobelY);
	//
	int scale = 4;
	double cutoff;
	// 阈值计算
	if (thresh = -1)
	{
		cutoff = scale*mean(gradient)[0];
		thresh = sqrt(cutoff);
	}
	else
	{
		cutoff = thresh * thresh;
	}

	imgDst.create(gray.size(), gray.type());
	imgDst.setTo(0);
	for (int i = 1; i<gray.rows - 1; i++)
	{
        // 数组指针
		float* sbxPtr = sobelX.ptr<float>(i);
		float* sbyPtr = sobelY.ptr<float>(i);
		float* prePtr = gradient.ptr<float>(i - 1);
		float* curPtr = gradient.ptr<float>(i);
		float* lstPtr = gradient.ptr<float>(i + 1);
		uchar* rstPtr = imgDst.ptr<uchar>(i);
		//
		for (int j = 1; j<gray.cols - 1; j++)
		{
			if (curPtr[j]>cutoff && ((sbxPtr[j]>kx*sbyPtr[j] && curPtr[j]>curPtr[j - 1] && curPtr[j]>curPtr[j + 1]) ||(sbyPtr[j]>ky*sbxPtr[j] && curPtr[j]>prePtr[j] && curPtr[j]>lstPtr[j])))
            {
                rstPtr[j] = 255;
            }
		}
	}
	return thresh;
}

(figure: Sobel edge detection result)

5. Kirsch operator

5.1 Idea of the Kirsch operator

  • The 3×3 convolution template actually covers 8 directions (upper left, upper right, ..., lower right).

  • The image is convolved with eight 3×3 templates representing these 8 directions, and the maximum response is taken as the edge output of the image.

5.2 Steps of the Kirsch operator

The image is convolved with the following 8 templates to estimate the derivative at each pixel:

$$\begin{aligned} & K_N=\begin{bmatrix} 5 & 5 & 5 \\ -3 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix}, \quad K_{NE}=\begin{bmatrix} -3 & 5 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & -3 \end{bmatrix} \\ & K_E=\begin{bmatrix} -3 & -3 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & 5 \end{bmatrix}, \quad K_{SE}=\begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & 5 \\ -3 & 5 & 5 \end{bmatrix} \\ & K_S=\begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & -3 \\ 5 & 5 & 5 \end{bmatrix}, \quad K_{SW}=\begin{bmatrix} -3 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & 5 & -3 \end{bmatrix} \\ & K_W=\begin{bmatrix} 5 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & -3 & -3 \end{bmatrix}, \quad K_{NW}=\begin{bmatrix} 5 & 5 & -3 \\ 5 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix} \end{aligned}$$

Finally, the maximum of the 8 convolution results is taken as the edge output of the image.
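A sketch of the eight-direction convolution: the other seven kernels are generated here by rotating the outer ring of the north template, and the per-pixel maximum is kept (file names and the threshold are illustrative):

#include <opencv2/opencv.hpp>

// Kirsch operator: convolve with 8 directional 3x3 templates and keep the
// per-pixel maximum response as the edge strength.
int main()
{
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    // North template; the other 7 are rotations of its outer ring.
    float kn[9] = {  5,  5,  5,
                    -3,  0, -3,
                    -3, -3, -3 };
    cv::Mat base(3, 3, CV_32F, kn);

    // Outer-ring positions in clockwise order, used to rotate the weights.
    const int ring[8][2] = { {0,0},{0,1},{0,2},{1,2},{2,2},{2,1},{2,0},{1,0} };

    cv::Mat maxResp = cv::Mat::zeros(gray.size(), CV_32F);
    for (int d = 0; d < 8; d++)
    {
        cv::Mat kernel = cv::Mat::zeros(3, 3, CV_32F);
        for (int k = 0; k < 8; k++)
        {
            const int* src = ring[k];
            const int* dst = ring[(k + d) % 8];
            kernel.at<float>(dst[0], dst[1]) = base.at<float>(src[0], src[1]);
        }
        cv::Mat resp;
        cv::filter2D(gray, resp, CV_32F, kernel);
        maxResp = cv::max(maxResp, resp);   // keep the strongest direction
    }

    cv::Mat edges;
    cv::threshold(maxResp, edges, 255.0, 255.0, cv::THRESH_BINARY);  // illustrative threshold
    edges.convertTo(edges, CV_8U);
    cv::imwrite("kirsch.png", edges);
    return 0;
}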

5.3 Computational optimization of the Kirsch operator

Suppose a point A in the image and its surrounding 3×3 region have the gray values shown below, and let $q_i\ (i=0,1,2,\dots,7)$ be the gray value obtained at A after processing with the $(i+1)$-th Kirsch template.

(figure: point A and the gray values $p_0,\dots,p_7$ of its 3×3 neighborhood)

The gray value at A after processing is:

$$q_A=\max\left\{q_i\right\} \quad (i=0,1,2,\dots,7)$$

A matrix transformation reveals a direct relationship among the values produced by the Kirsch operator: in fact only $r_0$ has to be computed directly from the neighborhood pixels, which greatly reduces the amount of computation.

$$\begin{aligned} & Q=\tfrac{1}{8}\left(q_0, q_1, \cdots, q_7\right)^T=\left(r_0, r_1, \cdots, r_7\right)^T \\ & r_0=0.625\left(p_0+p_1+p_2\right)-0.375\left(p_3+p_4+p_5+p_6+p_7\right) \\ & r_1=r_0+p_7-p_2 \quad r_2=r_1+p_6-p_1 \quad r_3=r_2+p_5-p_0 \\ & r_4=r_3+p_4-p_7 \quad r_5=r_4+p_3-p_6 \quad r_6=r_5+p_2-p_5 \\ & r_7=r_6+p_1-p_4 \\ & q_A=\max\left\{q_i\right\}=8 \max\left\{r_i\right\} \quad (i=0,1, \cdots, 7) \end{aligned}$$

5.4 Advantages and disadvantages of the Kirsch operator

  • The Kirsch operator is a template-matching operator; it uses eight templates to detect the edges of an image, so its computational cost is relatively high.

  • It performs well both in preserving detail and in resisting noise.

5.5 *Robinson operator

The Robinson operator is very similar to Kirsch: it also uses 8 templates, only with different contents:

$$\begin{aligned} R_N &=\begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}, \quad R_{NE}=\begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & 1 \\ -2 & -1 & 0 \end{bmatrix} \\ R_E &=\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad R_{SE}=\begin{bmatrix} -2 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & 2 \end{bmatrix} \\ R_S &=\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}, \quad R_{SW}=\begin{bmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \\ 2 & 1 & 0 \end{bmatrix} \\ R_W &=\begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}, \quad R_{NW}=\begin{bmatrix} 2 & 1 & 0 \\ 1 & 0 & -1 \\ 0 & -1 & -2 \end{bmatrix} \end{aligned}$$

6. *Nevitia operator

The Nevitia operator uses 5×5 convolution templates. Like Kirsch, it is a directional operator; it can determine 12 directions in total and therefore has twelve 5×5 templates. The first 6 templates are shown below; the other 6 can be obtained by symmetry. The maximum of the 12 convolution results is taken as the edge output of the image.

(figure: the first 6 of the twelve 5×5 Nevitia templates)

Chapter 3 Second-order differential edge operators

1. Basic idea of second-order differential edge operators

  • An edge is where the first derivative of the image reaches a local maximum, which means the second derivative is zero there. Second-order differential edge operators detect edges by exploiting exactly this property: the step-like behavior of the image at an edge makes the second differential cross zero at the edge.

  • The second differential of an image can be expressed by the Laplacian operator:

$$\nabla^2 I=\frac{\partial^2 I}{\partial x^2}+\frac{\partial^2 I}{\partial y^2}$$

In the $3 \times 3$ neighborhood of pixel $(i,j)$, we have the following approximations:

$$\begin{gathered} \frac{\partial^2 I}{\partial x^2}=I(i, j+1)-2 I(i, j)+I(i, j-1) \\ \frac{\partial^2 I}{\partial y^2}=I(i+1, j)-2 I(i, j)+I(i-1, j) \\ \nabla^2 I=-4 I(i, j)+I(i, j+1)+I(i, j-1)+I(i+1, j)+I(i-1, j) \end{gathered}$$

The corresponding second-order differential convolution kernel is:

$$m=\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$

So second-order differential edge detection takes two steps:

  1. Convolve the image with the Laplace kernel above;
  2. In the convolved image, take the points where the result is 0 (the zero crossings).

2. Laplace operator

2.1 The Laplacian expression

The Laplace operator is the simplest isotropic differential operator; the Laplacian of a two-dimensional image function is its isotropic second derivative.

The Laplacian is:

$$\nabla^2 f(x, y)=\frac{\partial^2 f(x, y)}{\partial x^2}+\frac{\partial^2 f(x, y)}{\partial y^2}$$

We know the gradient operator is $\nabla=\frac{\partial}{\partial x} \vec{i}+\frac{\partial}{\partial y} \vec{j}$, so the derivation of the formula above is:

$$\nabla^2 \triangleq \nabla \cdot \nabla=\left(\frac{\partial}{\partial x} \vec{i}+\frac{\partial}{\partial y} \vec{j}\right) \cdot\left(\frac{\partial}{\partial x} \vec{i}+\frac{\partial}{\partial y} \vec{j}\right)=\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}$$
2.2 The Laplace operator on images

Since an image is a discrete two-dimensional matrix, approximating the derivatives by differences gives:

$$\begin{aligned} \frac{\partial^2 f}{\partial x^2} &=\frac{\partial G}{\partial x}=\frac{\partial[f(i, j)-f(i, j-1)]}{\partial x}=\frac{\partial f(i, j)}{\partial x}-\frac{\partial f(i, j-1)}{\partial x} \\ &=[f(i, j+1)-f(i, j)]-[f(i, j)-f(i, j-1)] \\ &=f(i, j+1)-2 f(i, j)+f(i, j-1) \end{aligned}$$

Similarly,

$$\frac{\partial^2 f}{\partial y^2}=f(i+1, j)-2 f(i, j)+f(i-1, j)$$

Hence:

$$\nabla^2 f=\frac{\partial^2 f}{\partial x^2}+\frac{\partial^2 f}{\partial y^2}=f(i+1, j)+f(i-1, j)+f(i, j+1)+f(i, j-1)-4 f(i, j)$$

Expressed as a template:

$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$

There is also another commonly used convolution template:

$$\begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}$$

Sometimes, to give the center of the neighborhood a larger weight, the following convolution template is used as well:

$$\begin{bmatrix} 1 & 4 & 1 \\ 4 & -20 & 4 \\ 1 & 4 & 1 \end{bmatrix}$$

2.3 Laplace algorithm process

  1. Traverse the image (excluding the borders, to avoid out-of-range access) and apply the Laplacian template convolution to each pixel. Note that only one template convolution is performed, not two.

  2. Copy the result to the target image; done.
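A minimal sketch of this procedure with the 4-neighbor kernel above; for brevity the zero-crossing test only checks horizontally adjacent pairs (file names are illustrative):

#include <opencv2/opencv.hpp>

// Laplace edge detection: one convolution with the 4-neighbor Laplacian
// kernel, then mark points where the response changes sign (zero crossings).
int main()
{
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    float k[9] = { 0,  1, 0,
                   1, -4, 1,
                   0,  1, 0 };
    cv::Mat kernel(3, 3, CV_32F, k);

    cv::Mat lap;
    cv::filter2D(gray, lap, CV_32F, kernel);

    // A simplified zero-crossing test: a full implementation would also
    // check vertical and diagonal neighbor pairs.
    cv::Mat edges = cv::Mat::zeros(gray.size(), CV_8U);
    for (int i = 0; i < lap.rows; i++)
        for (int j = 1; j < lap.cols; j++)
            if (lap.at<float>(i, j - 1) * lap.at<float>(i, j) < 0)
                edges.at<uchar>(i, j) = 255;

    cv::imwrite("laplace.png", edges);
    return 0;
}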

2.4 Proof of the rotation invariance of the Laplace operator

Let the coordinates $(x', y')$ be obtained from $(x, y)$ by a rotation through angle $\theta$, so that $\frac{\partial x}{\partial x'}=\cos\theta$, $\frac{\partial y}{\partial x'}=\sin\theta$, $\frac{\partial x}{\partial y'}=-\sin\theta$, and $\frac{\partial y}{\partial y'}=\cos\theta$. Then:

$$\begin{aligned} \nabla^2 f &=\frac{\partial^2 f}{\partial x'^2}+\frac{\partial^2 f}{\partial y'^2} =\frac{\partial}{\partial x'}\left(\frac{\partial f}{\partial x'}\right)+\frac{\partial}{\partial y'}\left(\frac{\partial f}{\partial y'}\right) \\ &=\frac{\partial}{\partial x'}\left(\frac{\partial f}{\partial x} \frac{\partial x}{\partial x'}+\frac{\partial f}{\partial y} \frac{\partial y}{\partial x'}\right)+\frac{\partial}{\partial y'}\left(\frac{\partial f}{\partial x} \frac{\partial x}{\partial y'}+\frac{\partial f}{\partial y} \frac{\partial y}{\partial y'}\right) \\ &=\frac{\partial}{\partial x'}\left(\frac{\partial f}{\partial x} \cos\theta+\frac{\partial f}{\partial y} \sin\theta\right)+\frac{\partial}{\partial y'}\left(-\sin\theta\, \frac{\partial f}{\partial x}+\cos\theta\, \frac{\partial f}{\partial y}\right) \\ &=\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x} \cos\theta+\frac{\partial f}{\partial y} \sin\theta\right) \cos\theta+\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x} \cos\theta+\frac{\partial f}{\partial y} \sin\theta\right) \sin\theta \\ &\quad+\frac{\partial}{\partial x}\left(-\sin\theta\, \frac{\partial f}{\partial x}+\cos\theta\, \frac{\partial f}{\partial y}\right)(-\sin\theta)+\frac{\partial}{\partial y}\left(-\sin\theta\, \frac{\partial f}{\partial x}+\cos\theta\, \frac{\partial f}{\partial y}\right) \cos\theta \\ &=\frac{\partial}{\partial x} \frac{\partial f}{\partial x}+\frac{\partial}{\partial y} \frac{\partial f}{\partial y} =\frac{\partial^2 f}{\partial x^2}+\frac{\partial^2 f}{\partial y^2} \end{aligned}$$

2.5 Advantages and disadvantages of the Laplace operator

The Laplace operator is rotation invariant and suits situations where only the exact position of edge points matters, with no requirement on the actual gray-level difference near the edge. However, the Laplacian is not commonly used for edge detection itself; it is mainly used to determine whether a pixel lies on the bright side or the dark side of an edge. There are three main reasons:

  • Compared with first-order derivative operators, second-order derivative operators are weaker at removing noise;

  • The operator cannot detect the direction of an edge;

  • Its magnitude produces double edges.

To improve noise resistance, the improved LoG (Laplacian of Gaussian) operator was proposed; it is introduced in the next section.

3. LoG operator

3.1 Overview of the LoG operator

  • First smooth the image with a Gaussian filter to remove noise, then detect the image edges with the Laplace operator.
  • This reduces noise while smoothing and extending the edges. To avoid spurious edges, edge points should be selected as zero crossings whose first-derivative magnitude exceeds a certain threshold.
  • The LoG operator has become the best-regarded operator for detecting step edges via second-derivative zero crossings.

3.2 Problems solved by LoG

Applying the Laplace operator directly gives very weak denoising, and the extracted edge information is easily corrupted by noise, as the following figure shows:

(figure: Laplace edge detection on a noisy image)

The most important improvement of LoG is to pass the original image through a Gaussian filter before extracting edge information, which greatly reduces spurious gradient points in the image and ultimately lets the edge information be extracted more effectively.

3.3 Computation of the LoG operator

In image processing, the commonly used two-dimensional Gaussian function is:

$$G(x, y, \sigma)=\frac{1}{2 \pi \sigma^2} e^{-\left(x^2+y^2\right) / 2 \sigma^2}$$

The Laplacian operator is:

$$\nabla^2 f=\frac{\partial^2 f}{\partial x^2}+\frac{\partial^2 f}{\partial y^2}$$

Applying the Laplacian to the two-dimensional Gaussian gives:

$$\nabla^2 G=\frac{\partial^2 G}{\partial x^2}+\frac{\partial^2 G}{\partial y^2}=\frac{-2 \sigma^2+x^2+y^2}{2 \pi \sigma^6} e^{-\left(x^2+y^2\right) / 2 \sigma^2}$$

The LoG (Laplacian of Gaussian) operator is defined as:

$$LoG=\sigma^2 \nabla^2 G$$

Note that the LoG operator is axially symmetric and its mean over its domain is 0, so convolving it with an image does not change the overall dynamic range of the image; it does blur the image (because the operator is quite smooth), and the degree of blur is proportional to $\sigma$.

  • When $\sigma$ is large, Gaussian filtering smooths well and suppresses noise to a great extent, but some edge details are lost and the edge localization accuracy drops;
  • When $\sigma$ is relatively small, the edge localization accuracy is high, but the signal-to-noise ratio is low.
  • Choosing an appropriate $\sigma$ when applying the LoG operator is therefore very important; it depends on the required edge localization accuracy and the noise conditions.

3.4 LoG convolution template

The convolution template of the LoG operator is usually a 5×5 matrix, for example:

(figure: two 5×5 LoG templates)

The template on the left is an approximation for a Gaussian standard deviation of 0.5; the template on the right is for a standard deviation of 1.

Normalization ensures that the template coefficients sum to 0, so that no edges are detected in areas of uniform brightness. Finally, since integers are easier to compute with than floating-point numbers, the template coefficients are approximated by integers.
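A minimal sketch of the LoG pipeline, implemented as GaussianBlur followed by Laplacian instead of a single precomputed 5×5 template (kernel sizes, sigma, and file names are illustrative):

#include <opencv2/opencv.hpp>

// LoG edge detection: Gaussian smoothing, then the Laplacian, then a
// simplified zero-crossing test along rows.
int main()
{
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    // Step 1: suppress noise with a Gaussian filter.
    cv::Mat smoothed;
    cv::GaussianBlur(gray, smoothed, cv::Size(5, 5), 1.0);

    // Step 2: apply the Laplacian to the smoothed image.
    cv::Mat response;
    cv::Laplacian(smoothed, response, CV_32F, 3);

    // Step 3: mark sign changes as zero crossings (row direction only;
    // a full implementation would also check the other directions).
    cv::Mat edges = cv::Mat::zeros(gray.size(), CV_8U);
    for (int i = 0; i < response.rows; i++)
        for (int j = 1; j < response.cols; j++)
            if (response.at<float>(i, j - 1) * response.at<float>(i, j) < 0)
                edges.at<uchar>(i, j) = 255;

    cv::imwrite("log_edges.png", edges);
    return 0;
}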

3.5 LoG algorithm process

(1) Traverse the image (excluding the borders, to avoid out-of-range access) and apply the Gaussian-Laplacian template convolution to each pixel.

(2) Copy the result to the target image; done.

3.6 DoG and LoG

The DoG operator simplifies the computation of LoG. The DoG (Difference of Gaussians) operator is defined as:

$$DoG=G\left(x, y, \sigma_1\right)-G\left(x, y, \sigma_2\right)$$

Below we show why DoG can approximately replace LoG.

Differentiating the two-dimensional Gaussian with respect to $\sigma$:

$$\frac{\partial G}{\partial \sigma}=\frac{-2 \sigma^2+x^2+y^2}{2 \pi \sigma^5} e^{-\left(x^2+y^2\right) / 2 \sigma^2}$$

It is not difficult to see that:

$$\frac{\partial G}{\partial \sigma}=\sigma \nabla^2 G$$

In the DoG operator, let $\sigma_1=k\sigma$ and $\sigma_2=\sigma$; then

$$DoG=G(x, y, k \sigma)-G(x, y, \sigma)$$

Further,

$$\frac{\partial G}{\partial \sigma}=\lim_{\Delta \sigma \to 0} \frac{G(x, y, \sigma+\Delta \sigma)-G(x, y, \sigma)}{(\sigma+\Delta \sigma)-\sigma} \approx \frac{G(x, y, k \sigma)-G(x, y, \sigma)}{k \sigma-\sigma}$$

Therefore

$$\sigma \nabla^2 G=\frac{\partial G}{\partial \sigma} \approx \frac{G(x, y, k \sigma)-G(x, y, \sigma)}{k \sigma-\sigma}$$

i.e.

$$G(x, y, k \sigma)-G(x, y, \sigma) \approx (k-1)\, \sigma^2 \nabla^2 G$$

This shows that the DoG operator can approximate the LoG operator, which completes the proof.

Thus, with the DoG operator, we only need to smooth the image with two Gaussians and subtract the results to approximate the effect of applying LoG to the image.

Note that when we replace the LoG operator by the DoG operator in the convolution with the image:

$$\operatorname{DoG}\left(x, y, \sigma_1, \sigma_2\right) * I(x, y)=G\left(x, y, \sigma_1\right) * I(x, y)-G\left(x, y, \sigma_2\right) * I(x, y)$$

the $\sigma$ of the approximated LoG operator is chosen as:

$$\sigma^2=\frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2-\sigma_2^2} \ln\left[\frac{\sigma_1^2}{\sigma_2^2}\right]$$

With this value, the zero crossings of LoG and DoG coincide; only the amplitudes differ.
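A sketch of the DoG approximation, smoothing with two Gaussians and subtracting; k = 1.6 is a common choice in the literature, used here as an illustrative assumption:

#include <opencv2/opencv.hpp>

// DoG approximation of LoG: smooth with two Gaussians and subtract.
int main()
{
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    cv::Mat f;
    gray.convertTo(f, CV_32F);

    double sigma = 1.0, k = 1.6;               // sigma1 = k * sigma2
    cv::Mat g1, g2;
    cv::GaussianBlur(f, g1, cv::Size(0, 0), k * sigma);  // kernel size derived from sigma
    cv::GaussianBlur(f, g2, cv::Size(0, 0), sigma);

    cv::Mat dog = g1 - g2;   // approximately (k-1) * sigma^2 * LoG response

    // Visualize the response; zero crossings would be extracted as with LoG.
    cv::Mat vis;
    cv::normalize(dog, vis, 0, 255, cv::NORM_MINMAX);
    vis.convertTo(vis, CV_8U);
    cv::imwrite("dog.png", vis);
    return 0;
}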

3.7 Advantages and disadvantages of the LoG operator

  • Advantages
    • Finding the second derivative with LoG is very stable.
    • Gaussian smoothing effectively suppresses the influence of all pixels more than 3σ away from the current pixel, so the Laplace operator becomes an effective and stable measure of image change.
  • The traditional second-derivative zero-crossing technique also has drawbacks.
    • First, it smooths shapes excessively.
    • Second, it has a tendency to produce ring-shaped (closed-loop) edges.

4. Canny operator

4.1 Overview of the Canny operator

  • An edge detection algorithm based on image gradient computation.
  • Drawing on earlier edge detection operators and their applications, Canny summarized the following three criteria:
    1. Signal-to-noise ratio criterion: avoid losing real edges, and avoid misjudging non-edge points as edge points;
    2. Localization accuracy criterion: the detected edges should be as close to the real edges as possible;
    3. Single-edge response criterion: a single edge should produce one unique response, multiple responses must be avoided, and false responses should be maximally suppressed.

These three criteria were first clearly stated by Canny, who solved the problem completely, although others had made similar requirements before him. More importantly, Canny also gave their mathematical expressions (taking one dimension as an example), which turns edge detection into a functional optimization problem.

4.2 Canny operator detection steps

The classic Canny edge detection algorithm usually starts from Gaussian blurring and ends with dual-threshold edge linking. In actual engineering applications, however, the input images are color images, and the final edge-linked image must be output and displayed in binary form, so the complete Canny implementation has the following steps (see the sketch after this list):

  1. Convert the color image to a grayscale image
  2. Apply Gaussian blur to the image
  3. Compute the image gradient, and from it the edge magnitude and angle
  4. Non-maximum suppression (edge thinning)
  5. Dual-threshold detection
  6. Complete the edge detection by suppressing isolated weak edges
  7. Output the binarized image
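With OpenCV, steps 3 to 6 are performed inside cv::Canny, so the whole pipeline is only a few calls. A minimal sketch; the blur kernel, sigma, and the two thresholds are illustrative (a low:high ratio between 1:2 and 1:3 is commonly recommended):

#include <opencv2/opencv.hpp>

// The Canny pipeline: grayscale conversion, Gaussian blur, then cv::Canny
// (gradient computation, non-maximum suppression, and dual-threshold edge
// linking all happen inside cv::Canny). The output is already binary.
int main()
{
    cv::Mat bgr = cv::imread("input.png", cv::IMREAD_COLOR);
    if (bgr.empty()) return -1;

    cv::Mat gray, blurred, edges;
    cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);            // step 1
    cv::GaussianBlur(gray, blurred, cv::Size(5, 5), 1.4);   // step 2
    cv::Canny(blurred, edges, 50, 150);                     // steps 3-6
    cv::imwrite("canny.png", edges);                        // step 7
    return 0;
}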

4.3 Detailed explanation of each Canny step

  1. Convert the color image to a grayscale image

    According to the formula for converting a color RGB image to grayscale:

    $$gray = 0.299\,R + 0.587\,G + 0.114\,B$$

  2. Apply Gaussian blur to the image

    To reduce the influence of noise on the edge detection result as much as possible, noise must be filtered out to prevent false detections. The image is convolved with a Gaussian filter, which smooths it and reduces the evident effect of noise on the edge detector. The generating equation of a Gaussian filter kernel of size $(2k+1) \times (2k+1)$ is:

    $$H_{ij}=\frac{1}{2 \pi \sigma^2} \exp\left(-\frac{(i-(k+1))^2+(j-(k+1))^2}{2 \sigma^2}\right); \quad 1 \leq i, j \leq (2k+1)$$

    Below is an example of a $3 \times 3$ Gaussian convolution kernel with $\sigma=1.4$ (note the normalization):

    $$H=\begin{bmatrix} 0.0924 & 0.1192 & 0.0924 \\ 0.1192 & 0.1538 & 0.1192 \\ 0.0924 & 0.1192 & 0.0924 \end{bmatrix}$$

    If a $3 \times 3$ window in the image is $A$ and the pixel to be filtered is $e$, then after Gaussian filtering the brightness of pixel $e$ is:

    $$e=H * A=\begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} * \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}=\operatorname{sum}\left(\begin{bmatrix} a \times h_{11} & b \times h_{12} & c \times h_{13} \\ d \times h_{21} & e \times h_{22} & f \times h_{23} \\ g \times h_{31} & h \times h_{32} & i \times h_{33} \end{bmatrix}\right)$$

    where $*$ is the convolution symbol and sum means summing all elements of the matrix.

    It is important to understand that the choice of the Gaussian kernel size affects the performance of the Canny detector:

    • The larger the kernel, the less sensitive the detector is to noise, but the edge localization error also increases slightly.
    • A $5 \times 5$ kernel is generally a relatively good trade-off.
  3. Compute the image gradient, and from it the edge magnitude and angle

    Edges in an image can point in any direction, so the Canny algorithm uses four operators to detect horizontal, vertical, and diagonal edges. Edge detection operators (such as Roberts, Prewitt, Sobel) return the first-derivative values in the horizontal ($G_x$) and vertical ($G_y$) directions, from which the pixel's gradient magnitude $G$ and direction $\theta$ are determined:

    $$G=\sqrt{G_x^2+G_y^2}, \qquad \theta=\arctan\left(G_y / G_x\right)$$

    where $G$ is the gradient strength, $\theta$ the gradient direction, and $\arctan$ the arctangent function.

  4. Non-maximum suppression (edge thinning)

    Non-maximum suppression is an edge-sparsification technique; its function is to "thin" the edges. After the gradient of the image is computed, edges extracted from the gradient magnitude alone are still blurry. By criterion 3, an edge should have one and only one accurate response. Non-maximum suppression helps by suppressing all gradient values except the local maxima to 0. The algorithm, for each pixel of the gradient image, is: 1) compare the gradient strength of the current pixel with the two pixels along the positive and negative gradient directions; 2) if the current pixel's gradient strength is the largest of the three, keep it as an edge point; otherwise, suppress it.

    Usually, to compute more accurately, linear interpolation between the two adjacent pixels straddling the gradient direction is used to obtain the pixel gradients to compare. Here is an example:

    (figure: the gradient divided into 8 directions E, NE, N, NW, W, SW, S, SE, with sectors 0-3)

    As shown in the figure above, the gradient is divided into 8 directions: E, NE, N, NW, W, SW, S, SE, where sector 0 denotes 0°~45°, 1 denotes 45°~90°, 2 denotes -90°~-45°, and 3 denotes -45°~0°. If the gradient direction of pixel P is $\theta$, the gradient linear interpolations at points P1 and P2 are:

    $$\begin{aligned} & \tan(\theta)=G_y / G_x \\ & G_{p1}=(1-\tan(\theta)) \times E+\tan(\theta) \times NE \\ & G_{p2}=(1-\tan(\theta)) \times W+\tan(\theta) \times SW \end{aligned}$$

    The pseudocode of non-maximum suppression is therefore:

    if Gp >= Gp1 and Gp >= Gp2:
        Gp may be an edge
    else:
        Gp should be suppressed
    

    Note: how the directions are labeled does not matter; what matters is that the computation of the gradient direction is consistent with the choice of gradient operator.

  5. Dual-threshold detection

    After non-maximum suppression, the remaining pixels represent the actual edges in the image more accurately. However, some edge pixels caused by noise and color variation still remain. To remove these spurious responses, edge pixels with weak gradient values must be filtered out while edge pixels with high gradient values are kept, which is achieved by choosing a high and a low threshold. If an edge pixel's gradient value is above the high threshold, it is marked as a strong edge pixel; if it lies between the low and the high threshold, it is marked as a weak edge pixel; if it is below the low threshold, it is suppressed. The choice of thresholds depends on the content of the given input image.

    The pseudo-code of double-threshold detection is described as follows:

    (figure: dual-threshold detection pseudocode)

  6. Complete the edge detection by suppressing isolated weak edges

    Pixels classified as strong edges are already confirmed as edges, since they are extracted from real edges in the image. For weak edge pixels there is some debate, because such pixels may come from real edges, or may be caused by noise or color variation. To obtain accurate results, weak edges of the latter kind should be suppressed. Typically, weak edge pixels caused by real edges are connected to strong edge pixels, while noise responses are not. To track the edge connection, examine each weak edge pixel and its 8 neighbors: as long as one of them is a strong edge pixel, the weak edge point is kept as a real edge.

    The pseudocode for suppressing isolated edge points is described as follows:

    (figure: pseudocode for suppressing isolated weak edge points)

  7. Output the binarized image

    Binarize the result and output it: set the gray value of each pixel to either 0 or 255, so the whole image shows a clear black-and-white effect.

4.4 Advantages and disadvantages of the Canny operator

  • In two-dimensional space, the Canny operator outperforms the LoG operator in detection quality, localization, and noise resistance.
  • Its disadvantage is that, for noise-free images, it blurs the image edges.
  • To get better detection results, a slightly larger filter scale is generally chosen with this operator, but doing so easily loses some edge details of the image.

5. Comparison of edge detection operators

(figure: comparison table of edge detection operators)

Chapter 4 SUSAN edge and corner detection method

1 Overview of the SUSAN detection method

The SUSAN detection method is based on a window template: a window (with uniform or Gaussian weighting) is established at the position of each pixel of the image; in general the window radius is 3.4 pixels, so the window always contains 37 pixels. With such a window template placed at each pixel position, the intensity similarity between the center and the other points can be described by the figure below, where the x-axis is the intensity difference between pixels and the y-axis is the degree of similarity, 1 meaning completely similar.

(figures: the 37-pixel circular window template; similarity as a function of intensity difference, with the hard cut-off line a and the smooth curve b)

For simplicity of computation, line a (the hard cut-off above) is generally used. The number of points similar to the center point (i.e., points with similarity 1) is then counted; the region formed by the similar points is called the USAN (Univalue Segment Assimilating Nucleus), and feature points (edges or corners) correspond to local minima of its size. The cut-off in the figure (the pixel difference that decides whether two pixels are similar) actually reflects the minimum contrast of detectable image features and the maximum amount of noise that can be excluded.

  • Edge and corner detection does not use the gradient (derivative) of image pixels, which is why it performs well even in the presence of noise: as long as the noise is small enough not to make all pixels similar, it can be eliminated. In the computation, the uniform treatment of the single-value set removes the influence of noise and gives good robustness to noise.

  • Because it compares the similarity between a pixel's neighborhood and its center, it is also invariant to light intensity (the pixel differences do not change), invariant to rotation (rotation does not change the similarity between local pixels), and scale invariant to a certain degree (the angle of a corner does not change when the scale is enlarged; "to a certain degree" because local curvature gradually smooths out as the scale grows).

  • It uses very few parameters, so its computation and storage requirements are low.

  • It is mainly used for edge detection and corner detection; its high robustness to noise is also used in noise elimination, to select the best locally smooth neighborhood (the place with the most similar points). This chapter focuses on the SUSAN method for edge detection and corner detection, and briefly introduces how it is applied to noise elimination.

2 SUSAN edge detection

SUSAN edge detection consists of 5 parts: computing the edge response; computing the edge direction; non-maximum suppression; sub-pixel accuracy; and the independence of the detected position from the window size. The detection process is as follows.

1. Calculation of edge response

First consider the circular template of the image centered on each pixel (radius 3.4 pixels, 37 pixels in the template in total). For each neighborhood point inside it, compute the following similarity measure (the standard form is line a), where $\vec{r}$ is the position of a neighborhood pixel, $\vec{r}_0$ is the center position, and $t$ is the similarity cut-off (it determines the minimum difference between feature and background, and the maximum amount of noise that can be excluded):

$$c\left(\vec{r}, \vec{r}_0\right)=\begin{cases} 1 & \text{if } \left|I(\vec{r})-I\left(\vec{r}_0\right)\right| \leq t \\ 0 & \text{if } \left|I(\vec{r})-I\left(\vec{r}_0\right)\right| > t \end{cases}$$

Of course, we can also replace this hard cut with a smooth curve (line b in the figure), which gives more stable and sensitive results; although the computation is more complex, it can be made fast with a lookup table. The formula is:

$$c\left(\vec{r}, \vec{r}_0\right)=e^{-\left(\frac{I(\vec{r})-I\left(\vec{r}_0\right)}{t}\right)^6}$$

Then compute the total similarity:

$$n\left(\vec{r}_0\right)=\sum_{\vec{r}} c\left(\vec{r}, \vec{r}_0\right)$$

Next, compare $n$ with a fixed threshold $g$ (generally set to about 0.75 times the maximum number of similar points $n_{max}$); the initial edge response is computed as:

$$R\left(\vec{r}_0\right)=\begin{cases} g-n\left(\vec{r}_0\right) & \text{if } n\left(\vec{r}_0\right)<g \\ 0 & \text{otherwise} \end{cases}$$

Here $g$ serves to eliminate the influence of noise: only when $n$ is smaller than this threshold can the point be considered an edge, and the smaller $n$ is, the larger the edge response. Consider a step edge: if a point lies on the edge, its USAN value should be at most 0.5 times the maximum; for a curved edge the USAN is smaller still (the region inside the curve within the window shrinks as the curvature grows), so introducing $g$ does not affect the detection of genuine edges (i.e., the edge response $R$ obtained for them will not be 0).
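A direct sketch of this edge-response computation with the hard cut-off similarity (line a); the cut-off t and the file names are illustrative assumptions:

#include <opencv2/opencv.hpp>
#include <cstdlib>
#include <vector>

// SUSAN edge response: count the pixels of a 37-pixel circular template
// that are similar to the center (the USAN size n), then R = g - n when n < g.
int main()
{
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    // Offsets of the 37-pixel circular mask of radius ~3.4.
    std::vector<cv::Point> mask;
    for (int dy = -3; dy <= 3; dy++)
        for (int dx = -3; dx <= 3; dx++)
            if (dx * dx + dy * dy <= 3.4 * 3.4)
                mask.push_back(cv::Point(dx, dy));

    const int t = 27;                      // similarity cut-off (illustrative)
    const double g = 0.75 * mask.size();   // geometric threshold, 0.75 * n_max

    cv::Mat response = cv::Mat::zeros(gray.size(), CV_32F);
    for (int y = 3; y < gray.rows - 3; y++)
        for (int x = 3; x < gray.cols - 3; x++)
        {
            int center = gray.at<uchar>(y, x);
            int n = 0;
            for (const cv::Point& p : mask)
                if (std::abs(gray.at<uchar>(y + p.y, x + p.x) - center) <= t)
                    n++;                   // this pixel belongs to the USAN
            if (n < g)
                response.at<float>(y, x) = float(g - n);
        }

    cv::Mat vis;
    cv::normalize(response, vis, 0, 255, cv::NORM_MINMAX);
    vis.convertTo(vis, CV_8U);
    cv::imwrite("susan_edges.png", vis);
    return 0;
}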

2. Calculation of edge direction

  • First, non-maximum suppression needs the edge direction (explained later).

  • Determining sub-pixel accuracy also needs the edge direction (this is easy to understand).

  • Some applications may use the edge direction itself (including position and length). If the edge response of a pixel is not 0, then it has an edge direction. There are generally two situations for edges:

    (figure: edge cases a, b, and c)

One is a standard edge point (like a and b in the figure, lying just on one side of the edge or the other); c is the other situation, lying exactly in the transition zone between the left and right sides, where the intensity of the transition zone is exactly half of the intensities of the pixels on the two sides. A concept is introduced here: the center of gravity of the USAN (the USAN area is the white area in the template, i.e., the region similar to the center), which can be computed by the following formula:

$$\overline{\vec{r}}\left(\vec{r}_0\right)=\frac{\sum_{\vec{r}} \vec{r}\, c\left(\vec{r}, \vec{r}_0\right)}{\sum_{\vec{r}} c\left(\vec{r}, \vec{r}_0\right)}$$

Cases a and b can be regarded as a kind of inter-pixel edge: the vector between the center of gravity and the template center is exactly perpendicular to the local edge direction, and we can find the edge direction this way.

The situation of c in the figure above is completely different: its center of gravity coincides with the template center (this is regarded as an intra-pixel edge). In this case the edge direction must be found as the direction of the longest axis of symmetry of the USAN area, which can be obtained from the following formulas:

$$\begin{gathered} \overline{\left(x-x_0\right)^2}\left(\vec{r}_0\right)=\sum_{\vec{r}}\left(x-x_0\right)^2 c\left(\vec{r}, \vec{r}_0\right) \\ \overline{\left(y-y_0\right)^2}\left(\vec{r}_0\right)=\sum_{\vec{r}}\left(y-y_0\right)^2 c\left(\vec{r}, \vec{r}_0\right) \\ \overline{\left(x-x_0\right)\left(y-y_0\right)}\left(\vec{r}_0\right)=\sum_{\vec{r}}\left(x-x_0\right)\left(y-y_0\right) c\left(\vec{r}, \vec{r}_0\right) \end{gathered}$$

The ratio of the first two terms determines the orientation (in degrees) of the edge, while the sign of the third term determines the sign of the gradient of a sloped edge (which way it slopes).

The next question is how to determine automatically which case each point belongs to. First, if the USAN area (number of pixels) is smaller than the template diameter (number of pixels), the point should be an intra-pixel edge (like c). If it is larger, the center of gravity of the USAN is then found, and the point should be an inter-pixel edge (like a and b). If the distance between the center of gravity and the template center is less than 1 pixel, it is more accurate to treat it as the intra-pixel edge case (like c).

3. Non-maximum suppression

Non-maximum suppression has already been introduced for the Canny operator and is not repeated here.

4. Sub-pixel accuracy

For each edge point, first find the edge direction; then thin the edge perpendicular to that direction; finally, fit a quadratic curve through three points across the remaining edge and take the turning point of the curve (normally less than half a pixel away from the original edge point) as the exact location of the edge.
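The 3-point quadratic fit has a closed form: fitting a parabola through the magnitude samples on either side of the edge point gives the peak offset directly. A minimal sketch (the function name is illustrative):

#include <cmath>

// Peak offset of a parabola fitted through (-1, mPrev), (0, mCur), (+1, mNext).
// For a genuine local maximum the result normally stays within half a pixel
// of the central sample.
double subPixelOffset(double mPrev, double mCur, double mNext)
{
	double denom = mPrev - 2.0 * mCur + mNext; // second difference (< 0 at a maximum)
	if (std::fabs(denom) < 1e-12)
	{
		return 0.0;                            // flat neighbourhood: keep the integer position
	}
	return 0.5 * (mPrev - mNext) / denom;
}

The refined edge position is the original point shifted by this offset along the unit normal of the edge.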

5. The detection position does not depend on the window size

That is, enlarging the window template does not change the position of the detected edge; in this sense SUSAN is scale invariant. Its detection repeatability on images taken from a changed viewpoint is also very high, because a change of viewpoint does not affect the existence of its corners.

3 SUSAN corner detection

1. Eliminate falsely detected corners

First find the center of gravity of the USAN, then compute the distance between the center of gravity and the template center. For a genuine corner the center of gravity does not lie close to the template center, whereas for a thin line it lies very close to it (as in a and b above). A second operation enforces the contiguity of the USAN: real images easily contain small noise points, and these may fall inside the USAN; therefore every pixel of the template lying on the straight line between the template center and the center of gravity is required to belong to the USAN.

2. Non-maximum suppression

One of the great advantages of the SUSAN corner detector over derivative-based detectors is that the corner response stays well localized around the true corner rather than smearing across its neighbourhood, so local non-maximum suppression can remain simple: just keep the local maxima.
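A minimal sketch of this suppression step over a corner response map (assumed to be CV_64F; the function name is illustrative):

#include <opencv2/opencv.hpp>

// Keep only strict 3x3 local maxima of a corner response map.
cv::Mat localMaxima(const cv::Mat& resp)
{
	CV_Assert(resp.type() == CV_64F);
	cv::Mat out = cv::Mat::zeros(resp.size(), CV_8U);
	for (int i = 1; i < resp.rows - 1; i++)
	{
		for (int j = 1; j < resp.cols - 1; j++)
		{
			double v = resp.at<double>(i, j);
			if (v <= 0) continue;            // no corner response at this pixel
			bool isMax = true;
			for (int di = -1; di <= 1 && isMax; di++)
				for (int dj = -1; dj <= 1; dj++)
					if ((di != 0 || dj != 0) && resp.at<double>(i + di, j + dj) >= v)
					{
						isMax = false;
						break;
					}
			if (isMax) out.at<uchar>(i, j) = 255;
		}
	}
	return out;
}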

4 SUSAN noise filtering method

The SUSAN noise filter preserves the structural information of the image by smoothing only over the area that is similar to the central pixel (that is, the USAN): it takes a weighted average of those pixels and does not operate on adjacent, unrelated regions.

However, the similarity function used here is not the original one, but the following:
$$
c\left(\vec{r}, \vec{r}_0\right)=e^{-\left(\frac{I(\vec{r})-I\left(\vec{r}_0\right)}{t}\right)^2}
$$
The final form of the filter, with $r^2 = i^2 + j^2$ and the central pixel excluded from the sums, is:
$$
J(x, y)=\frac{\sum_{(i, j) \neq (0,0)} I(x+i, y+j)\, e^{-\frac{r^2}{2 \sigma^2}}\, e^{-\left(\frac{I(x+i, y+j)-I(x, y)}{t}\right)^2}}{\sum_{(i, j) \neq (0,0)} e^{-\frac{r^2}{2 \sigma^2}}\, e^{-\left(\frac{I(x+i, y+j)-I(x, y)}{t}\right)^2}}
$$
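A minimal sketch of this filter on a CV_8U grayscale image, assuming a square window of radius R for the neighbourhood; the default parameter values are illustrative:

#include <opencv2/opencv.hpp>
#include <cmath>

// SUSAN-style smoothing: each pixel becomes a weighted mean of its neighbours,
// weighted by spatial distance and by intensity similarity, with the central
// pixel itself excluded from the sums.
cv::Mat susanSmooth(const cv::Mat& gray, int R = 2, double sigma = 1.5, double t = 15.0)
{
	CV_Assert(gray.type() == CV_8U);
	cv::Mat out = gray.clone();
	for (int y = R; y < gray.rows - R; y++)
	{
		for (int x = R; x < gray.cols - R; x++)
		{
			double I0 = gray.at<uchar>(y, x);
			double num = 0, den = 0;
			for (int j = -R; j <= R; j++)
			{
				for (int i = -R; i <= R; i++)
				{
					if (i == 0 && j == 0) continue;     // exclude the nucleus
					double I = gray.at<uchar>(y + j, x + i);
					double w = std::exp(-(i * i + j * j) / (2 * sigma * sigma))
					         * std::exp(-std::pow((I - I0) / t, 2.0));
					num += I * w;
					den += w;
				}
			}
			// if no neighbour is similar at all (den ~ 0), keep the original value
			if (den > 1e-12)
				out.at<uchar>(y, x) = cv::saturate_cast<uchar>(num / den);
		}
	}
	return out;
}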

Chapter 5 Emerging Edge Detection Algorithms

1 Wavelet analysis

Building on the work of Y. Meyer, S. Mallat, I. Daubechies and others, wavelet analysis developed rapidly from 1986 onward into a subject of its own and has been widely applied in signal processing. The wavelet transform has its theoretical roots in the Fourier transform and is inherently multi-scale: at large scales it filters out noise very well but cannot locate edges accurately, while at small scales it detects edge information well but denoises poorly. The results obtained at several scales can therefore be combined, exploiting the advantages of each scale to obtain more accurate detection results.
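The multi-scale fusion idea can be sketched without a wavelet library by using Gaussian-smoothed gradient responses at two scales as stand-ins: the coarse scale contributes noise robustness, the fine scale contributes localization, and their product keeps only the edges supported at both scales. This is an illustration of the fusion principle only, not a wavelet transform:

#include <opencv2/opencv.hpp>

// Gradient magnitude at one Gaussian scale.
static cv::Mat gradAtScale(const cv::Mat& img, int ksize, double sigma)
{
	cv::Mat smooth, gx, gy, mag;
	cv::GaussianBlur(img, smooth, cv::Size(ksize, ksize), sigma);
	cv::Sobel(smooth, gx, CV_64F, 1, 0);
	cv::Sobel(smooth, gy, CV_64F, 0, 1);
	cv::magnitude(gx, gy, mag);
	return mag;
}

// Fuse a fine (well-localized) and a coarse (noise-robust) edge response.
cv::Mat multiScaleEdges(const cv::Mat& gray)
{
	cv::Mat img;
	gray.convertTo(img, CV_64F);
	cv::Mat fine = gradAtScale(img, 3, 1.0);
	cv::Mat coarse = gradAtScale(img, 13, 4.0);
	double m = cv::norm(coarse, cv::NORM_INF);   // normalize the coarse response
	cv::Mat coarseNorm = coarse / (m + 1e-12);
	return fine.mul(coarseNorm);                 // fine edges gated by coarse evidence
}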

Currently there are many different wavelet edge detection algorithms; they differ mainly in the wavelet function used. Commonly used wavelet functions include the Morlet wavelet, the Daubechies wavelet, the Haar wavelet, the Mexican-hat wavelet, the Hermite wavelet, the Mallat wavelet, Gaussian-based wavelets and B-spline-based wavelets, etc.

2 Fuzzy algorithm

In the mid-1980s, Pal, King and others first proposed a fuzzy algorithm for detecting image edges, which can effectively extract the target object from the background. Edge detection based on the fuzzy algorithm proceeds as follows (see the sketch below): (1) apply a membership function G to the image, producing a "fuzzy membership matrix"; (2) repeatedly apply a nonlinear transform to the membership matrix, so that real edges become more pronounced while pseudo edges become weaker; (3) apply the inverse of the membership mapping to return to gray levels; (4) extract edges with the "max" and "min" operators. The disadvantages of this algorithm are that the computation is complex and that some edge information at low gray values is lost.
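A minimal sketch of one common reading of these steps; the membership function, its parameters Fd and Fe, and the intensification transform follow the classic Pal-King formulation, and the exact functions in the original paper may differ:

#include <opencv2/opencv.hpp>
#include <cmath>

// (1) membership, (2) repeated contrast intensification, (3) back-transform,
// (4) max-min edge extraction.
cv::Mat fuzzyEdges(const cv::Mat& gray, int iters = 2, double Fd = 60.0, double Fe = 1.0)
{
	CV_Assert(gray.type() == CV_8U);
	double minV, maxV;
	cv::minMaxLoc(gray, &minV, &maxV);

	cv::Mat mu(gray.size(), CV_64F);          // (1) fuzzy membership matrix
	for (int i = 0; i < gray.rows; i++)
		for (int j = 0; j < gray.cols; j++)
			mu.at<double>(i, j) =
				std::pow(1.0 + (maxV - gray.at<uchar>(i, j)) / Fd, -Fe);

	for (int k = 0; k < iters; k++)           // (2) real edges sharpen, pseudo edges fade
		for (int i = 0; i < mu.rows; i++)
			for (int j = 0; j < mu.cols; j++)
			{
				double& m = mu.at<double>(i, j);
				m = (m <= 0.5) ? 2 * m * m : 1 - 2 * (1 - m) * (1 - m);
			}

	cv::Mat g(gray.size(), CV_64F);           // (3) inverse of the membership mapping
	for (int i = 0; i < mu.rows; i++)
		for (int j = 0; j < mu.cols; j++)
			g.at<double>(i, j) = maxV - Fd * (std::pow(mu.at<double>(i, j), -1.0 / Fe) - 1.0);

	cv::Mat dilated, eroded, edges;           // (4) "max" and "min" operators
	cv::dilate(g, dilated, cv::Mat());        // local max
	cv::erode(g, eroded, cv::Mat());          // local min
	edges = dilated - eroded;                 // large where the gray level changes sharply
	edges.convertTo(edges, CV_8U);
	return edges;
}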

3 Artificial Neural Network

Using artificial neural networks [36-38] to extract edges has developed into a new research direction in recent years. Its essence is to treat edge detection as a pattern recognition problem, exploiting the advantages that little prior knowledge is required and that the approach suits parallel implementation. The idea differs greatly from traditional methods: the input image is first mapped onto some neural network, prior knowledge such as an initial edge map is fed in, and training proceeds until the learning process converges or the user is satisfied. Constructing function models with neural networks relies mainly on training. Among the various network models, feedforward networks are the most widely used, and the BP algorithm is the most common way to train them. Edge detection with BP networks has shortcomings, such as slow convergence, poor numerical stability and hard-to-tune parameters, which make it difficult to meet the needs of practical applications. At present, Hopfield network algorithms and fuzzy neural network algorithms are also used in many respects.

Neural networks have the following 5 characteristics:

(1) Self-organization;

(2) Self-learning;

(3) Associative memory;

(4) The ability to find optimal solutions;

(5) Adaptability, etc.

These characteristics of neural networks make them suitable for detecting image edges.

Related code in practice

Chapter 6 Related code

1. Implementation of the functions in imgFeat.h (LoG blob detection)

#include "imgFeat.h"

void feat::extBlobFeat(Mat& imgSrc, vector<SBlob>& blobs)
{
	double dSigmaStart = 2;
	double dSigmaEnd = 15;
	double dSigmaStep = 1;

	Mat image;
	// convert to a single-channel grayscale image: the per-row double* accesses
	// below assume one channel per pixel
	cvtColor(imgSrc, image, cv::COLOR_BGR2GRAY);
	image.convertTo(image, CV_64F);

	vector<double> ivecSigmaArray;
	double dInitSigma = dSigmaStart;
	while (dInitSigma <= dSigmaEnd)
	{
		ivecSigmaArray.push_back(dInitSigma);
		dInitSigma += dSigmaStep;
	}
	size_t iSigmaNb = ivecSigmaArray.size();

	vector<Mat> matVecLOG;
	
	for (size_t i = 0; i != iSigmaNb; i++)
	{
		double iSigma = ivecSigmaArray[i]; 
		
		Size kSize(6 * iSigma + 1, 6 * iSigma + 1);
		Mat HOGKernel = getHOGKernel(kSize, iSigma);
		Mat imgLog;
		
		filter2D(image, imgLog, -1, HOGKernel); // convolve with the LoG kernel
		imgLog = imgLog * iSigma * iSigma;      // scale-normalize the response

		matVecLOG.push_back(imgLog);
	}

	vector<SBlob> allBlobs;
	for (size_t k = 1; k != matVecLOG.size() - 1 ;k++)
	{
		Mat topLev = matVecLOG[k + 1];
		Mat medLev = matVecLOG[k];
		Mat botLev = matVecLOG[k - 1];
		for (int i = 1; i < image.rows - 1; i++)
		{
			double* pTopLevPre = topLev.ptr<double>(i - 1);
			double* pTopLevCur = topLev.ptr<double>(i);
			double* pTopLevAft = topLev.ptr<double>(i + 1);

			double* pMedLevPre = medLev.ptr<double>(i - 1);
			double* pMedLevCur = medLev.ptr<double>(i);
			double* pMedLevAft = medLev.ptr<double>(i + 1);

			double* pBotLevPre = botLev.ptr<double>(i - 1);
			double* pBotLevCur = botLev.ptr<double>(i);
			double* pBotLevAft = botLev.ptr<double>(i + 1);

			for (int j = 1; j < image.cols - 1; j++)
			{
				if ((pMedLevCur[j] >= pMedLevCur[j + 1] && pMedLevCur[j] >= pMedLevCur[j -1] &&
				pMedLevCur[j] >= pMedLevPre[j + 1] && pMedLevCur[j] >= pMedLevPre[j -1] && pMedLevCur[j] >= pMedLevPre[j] && 
				pMedLevCur[j] >= pMedLevAft[j + 1] && pMedLevCur[j] >= pMedLevAft[j -1] && pMedLevCur[j] >= pMedLevAft[j] &&
				pMedLevCur[j] >= pTopLevPre[j + 1] && pMedLevCur[j] >= pTopLevPre[j -1] && pMedLevCur[j] >= pTopLevPre[j] &&
				pMedLevCur[j] >= pTopLevCur[j + 1] && pMedLevCur[j] >= pTopLevCur[j -1] && pMedLevCur[j] >= pTopLevCur[j] &&
				pMedLevCur[j] >= pTopLevAft[j + 1] && pMedLevCur[j] >= pTopLevAft[j -1] && pMedLevCur[j] >= pTopLevAft[j] &&
				pMedLevCur[j] >= pBotLevPre[j + 1] && pMedLevCur[j] >= pBotLevPre[j -1] && pMedLevCur[j] >= pBotLevPre[j] &&
				pMedLevCur[j] >= pBotLevCur[j + 1] && pMedLevCur[j] >= pBotLevCur[j -1] && pMedLevCur[j] >= pBotLevCur[j] &&
				pMedLevCur[j] >= pBotLevAft[j + 1] && pMedLevCur[j] >= pBotLevAft[j -1] && pMedLevCur[j] >= pBotLevAft[j] ) || 
				(pMedLevCur[j] < pMedLevCur[j + 1] && pMedLevCur[j] < pMedLevCur[j -1] &&
				pMedLevCur[j] < pMedLevPre[j + 1] && pMedLevCur[j] < pMedLevPre[j -1] && pMedLevCur[j] < pMedLevPre[j] && 
				pMedLevCur[j] < pMedLevAft[j + 1] && pMedLevCur[j] < pMedLevAft[j -1] && pMedLevCur[j] < pMedLevAft[j] &&
				pMedLevCur[j] < pTopLevPre[j + 1] && pMedLevCur[j] < pTopLevPre[j -1] && pMedLevCur[j] < pTopLevPre[j] &&
				pMedLevCur[j] < pTopLevCur[j + 1] && pMedLevCur[j] < pTopLevCur[j -1] && pMedLevCur[j] < pTopLevCur[j] &&
				pMedLevCur[j] < pTopLevAft[j + 1] && pMedLevCur[j] < pTopLevAft[j -1] && pMedLevCur[j] < pTopLevAft[j] &&
				pMedLevCur[j] < pBotLevPre[j + 1] && pMedLevCur[j] < pBotLevPre[j -1] && pMedLevCur[j] < pBotLevPre[j] &&
				pMedLevCur[j] < pBotLevCur[j + 1] && pMedLevCur[j] < pBotLevCur[j -1] && pMedLevCur[j] < pBotLevCur[j] &&
				pMedLevCur[j] < pBotLevAft[j + 1] && pMedLevCur[j] < pBotLevAft[j -1] && pMedLevCur[j] < pBotLevAft[j] ))
				{
					SBlob blob;
					blob.position = Point(j, i);
					blob.sigma = ivecSigmaArray[k];
					blob.value = pMedLevCur[j];
					allBlobs.push_back(blob);
				}
			}
		}
	}

	

	vector<bool> delFlags(allBlobs.size(), true);
	for (size_t i = 0; i != allBlobs.size(); i++)
	{
		if (delFlags[i] == false)
		{
			continue;
		}
		for (size_t j = i + 1; j != allBlobs.size(); j++) // avoid comparing a blob with itself
		{
			if (delFlags[j] == false)
			{
				continue;
			}
			double distCent = sqrt((allBlobs[i].position.x - allBlobs[j].position.x) * (allBlobs[i].position.x - allBlobs[j].position.x) + 
			(allBlobs[i].position.y - allBlobs[j].position.y) * (allBlobs[i].position.y - allBlobs[j].position.y));
			if (distCent < (allBlobs[i].sigma + allBlobs[j].sigma) / 2) // overlapping blobs: keep the stronger one
			{
				if (allBlobs[i].value >= allBlobs[j].value)
				{
					delFlags[j] = false;
					delFlags[i] = true;
				}
				else
				{
				 	delFlags[i] = false;
				 	delFlags[j] = true;
				}
			}
		}
	}


	for (size_t i = 0; i != allBlobs.size(); i++)
	{
		if (delFlags[i])
		{
			blobs.push_back(allBlobs[i]);
		}
	}

	sort(blobs.begin(), blobs.end(), compareBlob);
	
}
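// NOTE: despite its name, this function builds a Laplacian-of-Gaussian (LoG) kernel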
Mat feat::getHOGKernel(Size& ksize, double sigma)
{
	Mat kernel(ksize, CV_64F);
	Point centPoint = Point((ksize.width -1)/2, ((ksize.height -1)/2));
	// first calculate Gaussian
	for (int i=0; i < kernel.rows; i++)
	{
		double* pData = kernel.ptr<double>(i);
		for (int j = 0; j < kernel.cols; j++)
		{
			double param = -((i - centPoint.y) * (i - centPoint.y) + (j - centPoint.x) * (j - centPoint.x)) / (2*sigma*sigma);
			pData[j] = exp(param);
		}
	}
	double maxValue;
	minMaxLoc(kernel, NULL, &maxValue);
	for (int i=0; i < kernel.rows; i++)
	{
		double* pData = kernel.ptr<double>(i);
		for (int j = 0; j < kernel.cols; j++)
		{
			if (pData[j] < EPS* maxValue)
			{
				pData[j] = 0;
			}
		}
	}

	double sumKernel = sum(kernel)[0];
	if (sumKernel != 0)
	{
		kernel = kernel / sumKernel;
	}
	// now calculate Laplacian
	for (int i=0; i < kernel.rows; i++)
	{
		double* pData = kernel.ptr<double>(i);
		for (int j = 0; j < kernel.cols; j++)
		{
			double addition = ((i - centPoint.y) * (i - centPoint.y) + (j - centPoint.x) * (j - centPoint.x) - 2*sigma*sigma)/(sigma*sigma*sigma*sigma);
			pData[j] *= addition;
		}
	}
	// make the filter sum to zero
	sumKernel = sum(kernel)[0];
	kernel -= (sumKernel/(ksize.width  * ksize.height));	

	return kernel;
}

bool feat::compareBlob(const SBlob& lhs, const SBlob& rhs)
{
	return lhs.value > rhs.value;
}



2. Sobel algorithm implementation

#include "imgFeat.h"

void feat::extBlobFeat(Mat& imgSrc, vector<SBlob>& blobs)
{
	double dSigmaStart = 2;
	double dSigmaEnd = 15;
	double dSigmaStep = 1;

	Mat image;
	cvtColor(imgSrc, image, cv::COLOR_BGR2RGB);
	image.convertTo(image, CV_64F);

	vector<double> ivecSigmaArray;
	double dInitSigma = dSigmaStart;
	while (dInitSigma <= dSigmaEnd)
	{
		ivecSigmaArray.push_back(dInitSigma);
		dInitSigma += dSigmaStep;
	}
	int iSigmaNb = ivecSigmaArray.size();

	vector<Mat> matVecLOG;
	
	for (size_t i = 0; i != iSigmaNb; i++)
	{
		double iSigma = ivecSigmaArray[i]; 
		
		Size kSize(6 * iSigma + 1, 6 * iSigma + 1);
		Mat HOGKernel = getHOGKernel(kSize, iSigma);
		Mat imgLog;
		
		filter2D(image, imgLog, -1, HOGKernel); // why imgLog must be an empty mat ?
		imgLog = imgLog * iSigma *iSigma;

		matVecLOG.push_back(imgLog);
	}

	vector<SBlob> allBlobs;
	for (size_t k = 1; k != matVecLOG.size() - 1 ;k++)
	{
		Mat topLev = matVecLOG[k + 1];
		Mat medLev = matVecLOG[k];
		Mat botLev = matVecLOG[k - 1];
		for (int i = 1; i < image.rows - 1; i++)
		{
			double* pTopLevPre = topLev.ptr<double>(i - 1);
			double* pTopLevCur = topLev.ptr<double>(i);
			double* pTopLevAft = topLev.ptr<double>(i + 1);

			double* pMedLevPre = medLev.ptr<double>(i - 1);
			double* pMedLevCur = medLev.ptr<double>(i);
			double* pMedLevAft = medLev.ptr<double>(i + 1);

			double* pBotLevPre = botLev.ptr<double>(i - 1);
			double* pBotLevCur = botLev.ptr<double>(i);
			double* pBotLevAft = botLev.ptr<double>(i + 1);

			for (int j = 1; j < image.cols - 1; j++)
			{
				if ((pMedLevCur[j] >= pMedLevCur[j + 1] && pMedLevCur[j] >= pMedLevCur[j -1] &&
				pMedLevCur[j] >= pMedLevPre[j + 1] && pMedLevCur[j] >= pMedLevPre[j -1] && pMedLevCur[j] >= pMedLevPre[j] && 
				pMedLevCur[j] >= pMedLevAft[j + 1] && pMedLevCur[j] >= pMedLevAft[j -1] && pMedLevCur[j] >= pMedLevAft[j] &&
				pMedLevCur[j] >= pTopLevPre[j + 1] && pMedLevCur[j] >= pTopLevPre[j -1] && pMedLevCur[j] >= pTopLevPre[j] &&
				pMedLevCur[j] >= pTopLevCur[j + 1] && pMedLevCur[j] >= pTopLevCur[j -1] && pMedLevCur[j] >= pTopLevCur[j] &&
				pMedLevCur[j] >= pTopLevAft[j + 1] && pMedLevCur[j] >= pTopLevAft[j -1] && pMedLevCur[j] >= pTopLevAft[j] &&
				pMedLevCur[j] >= pBotLevPre[j + 1] && pMedLevCur[j] >= pBotLevPre[j -1] && pMedLevCur[j] >= pBotLevPre[j] &&
				pMedLevCur[j] >= pBotLevCur[j + 1] && pMedLevCur[j] >= pBotLevCur[j -1] && pMedLevCur[j] >= pBotLevCur[j] &&
				pMedLevCur[j] >= pBotLevAft[j + 1] && pMedLevCur[j] >= pBotLevAft[j -1] && pMedLevCur[j] >= pBotLevAft[j] ) || 
				(pMedLevCur[j] < pMedLevCur[j + 1] && pMedLevCur[j] < pMedLevCur[j -1] &&
				pMedLevCur[j] < pMedLevPre[j + 1] && pMedLevCur[j] < pMedLevPre[j -1] && pMedLevCur[j] < pMedLevPre[j] && 
				pMedLevCur[j] < pMedLevAft[j + 1] && pMedLevCur[j] < pMedLevAft[j -1] && pMedLevCur[j] < pMedLevAft[j] &&
				pMedLevCur[j] < pTopLevPre[j + 1] && pMedLevCur[j] < pTopLevPre[j -1] && pMedLevCur[j] < pTopLevPre[j] &&
				pMedLevCur[j] < pTopLevCur[j + 1] && pMedLevCur[j] < pTopLevCur[j -1] && pMedLevCur[j] < pTopLevCur[j] &&
				pMedLevCur[j] < pTopLevAft[j + 1] && pMedLevCur[j] < pTopLevAft[j -1] && pMedLevCur[j] < pTopLevAft[j] &&
				pMedLevCur[j] < pBotLevPre[j + 1] && pMedLevCur[j] < pBotLevPre[j -1] && pMedLevCur[j] < pBotLevPre[j] &&
				pMedLevCur[j] < pBotLevCur[j + 1] && pMedLevCur[j] < pBotLevCur[j -1] && pMedLevCur[j] < pBotLevCur[j] &&
				pMedLevCur[j] < pBotLevAft[j + 1] && pMedLevCur[j] < pBotLevAft[j -1] && pMedLevCur[j] < pBotLevAft[j] ))
				{
					SBlob blob;
					blob.position = Point(j, i);
					blob.sigma = ivecSigmaArray[k];
					blob.value = pMedLevCur[j];
					allBlobs.push_back(blob);
				}
			}
		}
	}

	

	vector<bool> delFlags(allBlobs.size(), true);
	for (size_t i = 0; i != allBlobs.size(); i++)
	{
		if (delFlags[i] == false)
		{
			continue;
		}
		for (size_t j = i; j != allBlobs.size(); j++)
		{
			if (delFlags[j] == false)
			{
				continue;
			}
			double distCent = sqrt((allBlobs[i].position.x - allBlobs[j].position.x) * (allBlobs[i].position.x - allBlobs[j].position.x) + 
			(allBlobs[i].position.y - allBlobs[j].position.y) * (allBlobs[i].position.y - allBlobs[j].position.y));
			if ((allBlobs[i].sigma + allBlobs[j].sigma) / distCent > 2)
			{
				if (allBlobs[i].value >= allBlobs[j].value)
				{
					delFlags[j] = false;
					delFlags[i] = true;
				}
				else
				{
				 	delFlags[i] = false;
				 	delFlags[j] = true;
				}
			}
		}
	}


	for (size_t i = 0; i != allBlobs.size(); i++)
	{
		if (delFlags[i])
		{
			blobs.push_back(allBlobs[i]);
		}
	}

	sort(blobs.begin(), blobs.end(), compareBlob);
	
}
Mat feat::getHOGKernel(Size& ksize, double sigma)
{
	Mat kernel(ksize, CV_64F);
	Point centPoint = Point((ksize.width -1)/2, ((ksize.height -1)/2));
	// first calculate Gaussian
	for (int i=0; i < kernel.rows; i++)
	{
		double* pData = kernel.ptr<double>(i);
		for (int j = 0; j < kernel.cols; j++)
		{
			double param = -((i - centPoint.y) * (i - centPoint.y) + (j - centPoint.x) * (j - centPoint.x)) / (2*sigma*sigma);
			pData[j] = exp(param);
		}
	}
	double maxValue;
	minMaxLoc(kernel, NULL, &maxValue);
	for (int i=0; i < kernel.rows; i++)
	{
		double* pData = kernel.ptr<double>(i);
		for (int j = 0; j < kernel.cols; j++)
		{
			if (pData[j] < EPS* maxValue)
			{
				pData[j] = 0;
			}
		}
	}

	double sumKernel = sum(kernel)[0];
	if (sumKernel != 0)
	{
		kernel = kernel / sumKernel;
	}
	// now calculate Laplacian
	for (int i=0; i < kernel.rows; i++)
	{
		double* pData = kernel.ptr<double>(i);
		for (int j = 0; j < kernel.cols; j++)
		{
			double addition = ((i - centPoint.y) * (i - centPoint.y) + (j - centPoint.x) * (j - centPoint.x) - 2*sigma*sigma)/(sigma*sigma*sigma*sigma);
			pData[j] *= addition;
		}
	}
	// make the filter sum to zero
	sumKernel = sum(kernel)[0];
	kernel -= (sumKernel/(ksize.width  * ksize.height));	

	return kernel;
}

bool feat::compareBlob(const SBlob& lhs, const SBlob& rhs)
{
	return lhs.value > rhs.value;
}
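The main function at the end of this chapter calls feat::getSobelEdge(image, cornerMap). Below is a minimal sketch of such a function in the style of the other feat:: routines, assuming the header declares this signature; the fixed-ratio threshold is an assumption, not the original code:

#include "imgFeat.h"

// Sobel edge map: gradient magnitude thresholded into a binary edge image.
void feat::getSobelEdge(const Mat& imgSrc, Mat& imgDst)
{
	Mat gray;
	if (imgSrc.channels() == 3)
	{
		cvtColor(imgSrc, gray, cv::COLOR_BGR2GRAY);
	}
	else
	{
		gray = imgSrc.clone();
	}
	gray.convertTo(gray, CV_64F);

	// horizontal and vertical Sobel kernels
	Mat kx = (Mat_<double>(3, 3) << -1, 0, 1, -2, 0, 2, -1, 0, 1);
	Mat ky = (Mat_<double>(3, 3) << -1, -2, -1, 0, 0, 0, 1, 2, 1);

	Mat gradX, gradY, gradMag;
	filter2D(gray, gradX, -1, kx);
	filter2D(gray, gradY, -1, ky);
	sqrt(gradX.mul(gradX) + gradY.mul(gradY), gradMag);

	double magMax;
	minMaxLoc(gradMag, NULL, &magMax);
	// fixed-ratio threshold (an assumption): keep the strongest gradients as edges
	imgDst = gradMag > 0.2 * magMax; // CV_8U mask with 255 at edge pixels
}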



3. Canny implementation

#include "imgFeat.h"
void feat::getCannyEdge(const Mat& imgSrc, Mat& imgDst, double lowThresh, double highThresh, double sigma)
{
    Mat gray;
    if (imgSrc.channels() == 3)
    {
        cvtColor(imgSrc, gray, cv::COLOR_BGR2GRAY);
    }
    else
    {
        gray = imgSrc.clone();
    }
    gray.convertTo(gray, CV_64F);
    gray = gray / 255;
    
    double gaussianDieOff = .0001;
    double percentOfPixelsNotEdges = .7; // Used for selecting thresholds
    double thresholdRatio = .4;   // Low thresh is this fraction of the high

    int possibleWidth = 30;
    double ssq = sigma * sigma;
    for (int i = 1; i <= possibleWidth; i++)
    {
        if (exp(-(i * i) / (2* ssq)) < gaussianDieOff)
        {
            possibleWidth = i - 1;
            break;
        }
    }

    if (possibleWidth == 30)
    {
        possibleWidth = 1; // the user entered a really small sigma
    }

    // get the 1D gaussian filter
    int winSz = 2 * possibleWidth + 1;
    Mat gaussKernel1D(1, winSz, CV_64F);
    double* kernelPtr = gaussKernel1D.ptr<double>(0);
    for (int i = 0; i < gaussKernel1D.cols; i++)
    {
        kernelPtr[i] = exp(-(i - possibleWidth) * (i - possibleWidth) / (2 * ssq)) / (2 * CV_PI * ssq);
    }

    
    // get the directional derivative of the gaussian kernel
    Mat dGaussKernel(winSz, winSz, CV_64F);
    for (int i = 0; i < dGaussKernel.rows; i++)
    {
        double* linePtr = dGaussKernel.ptr<double>(i);
        for (int j = 0; j< dGaussKernel.cols; j++)
        {
            linePtr[j] = - (j - possibleWidth) * exp(-((i - possibleWidth) * (i - possibleWidth) + (j - possibleWidth) * (j - possibleWidth)) / (2 * ssq)) / (CV_PI * ssq);
        }
    }


    /* smooth the image out*/
    Mat imgSmooth;
    filter2D(gray, imgSmooth, -1, gaussKernel1D);
    filter2D(imgSmooth, imgSmooth, -1, gaussKernel1D.t());
    /*apply directional derivatives*/

    Mat imgX, imgY;
    filter2D(imgSmooth, imgX, -1, dGaussKernel);
    filter2D(imgSmooth, imgY, -1, dGaussKernel.t());

    Mat imgMag;
    sqrt(imgX.mul(imgX) + imgY.mul(imgY), imgMag);
    double magMax;
    minMaxLoc(imgMag, NULL, &magMax, NULL, NULL);

    if (magMax > 0 )
    {
        imgMag = imgMag / magMax;
    }

    
    if (lowThresh == -1 || highThresh == -1)
    {
        highThresh = getCannyThresh(imgMag, percentOfPixelsNotEdges);
        lowThresh = thresholdRatio * highThresh;
    }




    Mat imgStrong = Mat::zeros(imgMag.size(), CV_8U);
    Mat imgWeak = Mat::zeros(imgMag.size(), CV_8U);
    
    
    // non-maximum suppression: in each of four gradient-direction sectors,
    // interpolate the magnitude at the two neighbours along the gradient
    // direction and keep a pixel only if it is a local maximum there
    for (int dir = 1; dir <= 4; dir++)
    {
        Mat gradMag1(imgMag.size(), imgMag.type());
        Mat gradMag2(imgMag.size(), imgMag.type());
        Mat idx = Mat::zeros(imgMag.size(), CV_8U);
        if (dir == 1)
        {
            Mat dCof = abs(imgY / imgX);
            idx = (imgY <= 0 & imgX > -imgY) | (imgY >= 0 & imgX < -imgY);
            idx.row(0).setTo(Scalar(0));
            idx.row(idx.rows - 1).setTo(Scalar(0));
            idx.col(0).setTo(Scalar(0));
            idx.col(idx.cols - 1).setTo(Scalar(0));
            for (int i = 1; i < imgMag.rows - 1; i++)
            {
                for (int j = 1; j < imgMag.cols - 1; j++)
                {
                    gradMag1.at<double>(i,j) = (1 - dCof.at<double>(i,j)) * imgMag.at<double>(i,j + 1) + dCof.at<double>(i,j) * imgMag.at<double>(i - 1,j + 1);
                    gradMag2.at<double>(i,j) = (1 - dCof.at<double>(i,j)) * imgMag.at<double>(i,j - 1) + dCof.at<double>(i,j) * imgMag.at<double>(i + 1,j - 1);
                }
            }
        }
        else if(dir == 2)
        {
            Mat dCof = abs(imgX / imgY);
            idx = (imgX > 0 & -imgY >= imgX) | (imgX < 0 & -imgY <= imgX);
            for (int i = 1; i < imgMag.rows - 1; i++)
            {
                for (int j = 1; j < imgMag.cols - 1; j++)
                {
                    gradMag1.at<double>(i,j) = (1 - dCof.at<double>(i,j)) * imgMag.at<double>(i - 1,j) + dCof.at<double>(i,j) * imgMag.at<double>(i - 1,j + 1);
                    gradMag2.at<double>(i,j) = (1 - dCof.at<double>(i,j)) * imgMag.at<double>(i + 1,j) + dCof.at<double>(i,j) * imgMag.at<double>(i + 1,j - 1);
                }
            }
        }
        else if(dir == 3)
        {
            Mat dCof = abs(imgX / imgY);
            idx = (imgX <= 0 & imgX > imgY) | (imgX >= 0 & imgX < imgY);
            for (int i = 1; i < imgMag.rows - 1; i++)
            {
                for (int j = 1; j < imgMag.cols - 1; j++)
                {
                    gradMag1.at<double>(i,j) = (1 - dCof.at<double>(i,j)) * imgMag.at<double>(i - 1,j) + dCof.at<double>(i,j) * imgMag.at<double>(i - 1,j - 1);
                    gradMag2.at<double>(i,j) = (1 - dCof.at<double>(i,j)) * imgMag.at<double>(i + 1,j) + dCof.at<double>(i,j) * imgMag.at<double>(i + 1,j + 1);
                }
            }
        
        }
        else
        {
            Mat dCof = abs(imgY / imgX);
            idx = (imgY <0 & imgX <= imgY) | (imgY > 0 & imgX >= imgY);
            for (int i = 1; i < imgMag.rows - 1; i++)
            {
                for (int j = 1; j < imgMag.cols - 1; j++)
                {
                    gradMag1.at<double>(i,j) = (1 - dCof.at<double>(i,j)) * imgMag.at<double>(i,j - 1) + dCof.at<double>(i,j) * imgMag.at<double>(i - 1,j - 1);
                    gradMag2.at<double>(i,j) = (1 - dCof.at<double>(i,j)) * imgMag.at<double>(i,j + 1) + dCof.at<double>(i,j) * imgMag.at<double>(i + 1,j + 1);
                }
            }
        }

        Mat idxLocalMax = idx & ((imgMag >= gradMag1) & (imgMag >= gradMag2));


        imgWeak = imgWeak | ((imgMag > lowThresh) & idxLocalMax);
        imgStrong= imgStrong| ((imgMag > highThresh) & imgWeak);

    }

    // hysteresis: a weak edge pixel survives only if it touches a strong one
    imgDst = Mat::zeros(imgWeak.size(),imgWeak.type());
    for (int i = 1; i < imgWeak.rows - 1; i++)
    {
        uchar* pWeak = imgWeak.ptr<uchar>(i);
        uchar* pDst = imgDst.ptr<uchar>(i);
        uchar* pStrPre = imgStrong.ptr<uchar>(i - 1);
        uchar* pStrMid = imgStrong.ptr<uchar>(i);
        uchar* pStrAft = imgStrong.ptr<uchar>(i + 1);
        for (int j = 1; j < imgWeak.cols - 1; j++)
        {
            if (!pWeak[j])
            {
                continue;
            }
            if (pStrMid[j])
            {
                pDst[j] = 255;
            }
            if (pStrMid[j-1] || pStrMid[j+1] || pStrPre[j-1] || pStrPre[j] || pStrPre[j+1] || pStrAft[j-1] || pStrAft[j] ||pStrAft[j+1])
            {
                pDst[j] = 255;
            }
        }
    }
}

double feat::getCannyThresh(const Mat& inputArray, double percentage)
{
    double thresh = -1.0;
    // compute the 64-hist of inputArray
    int nBins = 64;
    double minValue, maxValue;
    minMaxLoc(inputArray, &minValue, &maxValue, NULL, NULL);
    double step = (maxValue - minValue) / nBins;
    if (step <= 0)
    {
        return thresh; // flat image: no meaningful threshold
    }

    vector<unsigned> histBin(nBins,0);
    for (int i = 0; i < inputArray.rows; i++)
    {
        const double* pData = inputArray.ptr<double>(i);
        for(int j = 0; j < inputArray.cols; j++)
        {

            int index = static_cast<int>((pData[j] - minValue) / step);
            if (index > nBins - 1)
            {
                index = nBins - 1; // pData[j] == maxValue would land one past the last bin
            }
            histBin[index]++;
        }
    }
    unsigned cumSum = 0; 
    for (int i = 0; i < nBins; i++)
    {
        cumSum += histBin[i];

        if (cumSum > percentage * inputArray.rows * inputArray.cols)
        {
            thresh = (i + 1) / static_cast<double>(nBins);
            break;
        }
    }
    return thresh;
    
}
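A usage sketch for getCannyEdge (the image file name is illustrative); passing -1 for both thresholds lets getCannyThresh pick them automatically:

#include "imgFeat.h"

int main()
{
    Mat image = imread("test.jpg");
    if (image.empty())
    {
        return -1; // test image not found
    }
    Mat edges;
    // thresholds of -1 trigger automatic selection; sigma controls the smoothing
    feat::getCannyEdge(image, edges, -1, -1, 1.5);
    cv::imshow("canny edges", edges);
    cv::waitKey();
    return 0;
}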


main function

#include "imgFeat.h"
int main(int argc, char** argv)
{
	Mat image = imread("test.jpg");
	if (image.empty())
	{
		return -1; // make sure test.jpg exists next to the executable
	}
	Mat cornerMap;
	feat::getSobelEdge(image ,cornerMap);
	feat::drawCornerOnImage(image, cornerMap);
	cv::namedWindow("corners");
	cv::imshow("corners", image);
	cv::waitKey();
	return 0;
}

Chapter 7 Appendix

1. Acknowledgments and citations

www.gwylab.com

https://github.com/ronnyyoung/ImageFeatures

Origin: blog.csdn.net/weixin_42917352/article/details/128488918