The latest and best image denoising algorithms (usable for mathematical modeling)

Image denoising is a basic and necessary research topic. Denoising is usually performed before higher-level image processing and forms the foundation for it. Unfortunately, no current denoising algorithm offers a fully satisfactory solution. In practice, it is mostly a matter of striking a balance between denoising quality and computational complexity, which once again confirms the words of my teacher:

All engineering problems are ultimately optimization problems.

Well, without further ado, let's take a look at the better denoising algorithms.

Noise model

Noise in images has many sources: it can arise during image acquisition, transmission, compression, and other stages. The types of noise also differ, for example salt-and-pepper noise and Gaussian noise, and different noise types call for different processing algorithms.

For a noisy input image v(x), the additive noise model can be written as:

$$v(x) = u(x) + \eta(x), \quad x \in \Omega$$

where $u(x)$ is the original noise-free image, $x$ is a pixel, $\eta(x)$ is the additive noise term representing the effect of the noise, and $\Omega$ is the set of all pixels, that is, the entire image. The formula shows that the noise is superimposed directly on the original image; it can be salt-and-pepper noise or Gaussian noise. In theory, if the noise could be obtained exactly, the original image could be recovered by subtracting the noise from the input image. Reality is harsher, though: the noise on its own is hard to determine unless you know exactly how it was generated.

In engineering practice, image noise is usually approximated as Gaussian noise, $\eta(x) \sim \mathcal{N}(0, \sigma^2)$, where $\sigma^2$ is the noise variance; the larger $\sigma^2$, the stronger the noise. An effective way to remove Gaussian noise is to average images: averaging N images of the same content whose noise is independent reduces the variance of the Gaussian noise to 1/N of the original. The better denoising algorithms available today are all designed around this idea.
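To make the 1/N claim concrete, here is a small NumPy sketch. The gradient test image, sigma = 0.1 and N = 16 are arbitrary assumptions chosen for illustration; they do not come from any of the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical clean image u(x): a smooth 64x64 gradient with values in [0, 1].
u = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)

sigma = 0.1   # assumed noise standard deviation
N = 16        # assumed number of noisy captures to average

# N observations v_i(x) = u(x) + eta_i(x), with independent eta_i ~ N(0, sigma^2).
noisy_stack = u + rng.normal(0.0, sigma, size=(N, *u.shape))
averaged = noisy_stack.mean(axis=0)

var_single = np.var(noisy_stack[0] - u)   # close to sigma^2
var_averaged = np.var(averaged - u)       # close to sigma^2 / N

print(f"single image noise variance:   {var_single:.5f}")
print(f"averaged image noise variance: {var_averaged:.5f}")
print(f"ratio: {var_single / var_averaged:.1f} (expected to be near {N})")
```

The catch, of course, is that a single photograph gives only one observation, which is why the algorithms below look for similar content inside the same image instead.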

NL-Means algorithm

NL-Means stands for Non-Local Means. It was proposed by Buades et al. in 2005 [1] and exploits the redundant information that is ubiquitous in natural images to remove noise. Unlike common filters such as bilateral filtering or median filtering, which use only local information, it draws on the entire image: it looks for similar regions in units of image patches and then averages over those regions, which removes Gaussian noise well. The NL-Means filtering process can be expressed as:

$$\tilde{u}(x) = \sum_{y \in \Omega_x} w(x, y)\, v(y)$$

In this formula, $w(x, y)$ is a weight that represents the similarity between pixel $x$ and pixel $y$ in the original image. Each weight is greater than 0 and the weights sum to 1:

$$w(x, y) > 0, \qquad \sum_{y \in \Omega_x} w(x, y) = 1, \quad \forall x \in \Omega$$

$\Omega_x$ is the neighborhood of pixel $x$. The formula can be read as follows: for each pixel $x$ in the image, the denoised value is the weighted sum of the pixels $y$ in its neighborhood, and each weight equals the similarity of $x$ and $y$. This neighborhood is also called the search area. The larger the search area, the greater the chance of finding similar pixels, but the amount of computation also grows quadratically with the side length of the area. In the paper that proposed the algorithm, this area is the entire image! As a result, processing a 512x512 image takes at least a few minutes.

There are many ways to measure the similarity of two pixels; the most common is based on the squared difference of their brightness values. Because of noise, however, a single pixel is unreliable. The remedy is to compare neighborhoods: two pixels are considered similar only if the patches around them are similar. The most common way to measure the similarity of two image patches is the Euclidean distance between them, which gives the weight:

$$w(x, y) = \frac{1}{Z(x)} \exp\left(-\frac{\lVert V(x) - V(y) \rVert_{2,a}^{2}}{h^{2}}\right), \qquad Z(x) = \sum_{y} \exp\left(-\frac{\lVert V(x) - V(y) \rVert_{2,a}^{2}}{h^{2}}\right)$$

Here $Z(x)$ is a normalizing factor, the sum of all unnormalized weights over the search area of $x$; dividing each weight by it makes the weights sum to 1. $h$ is the filter coefficient, which controls the decay of the exponential function and therefore how strongly the Euclidean distance affects the weight. $V(x)$ and $V(y)$ are the neighborhoods of pixel $x$ and pixel $y$; this neighborhood is usually called the patch neighborhood and is generally smaller than the search area. $\lVert V(x) - V(y) \rVert_{2,a}^{2}$ is the Gaussian-weighted Euclidean distance between the two neighborhoods, where $a > 0$ is the standard deviation of the Gaussian kernel. When the distance is computed, pixels at different positions carry different weights: the closer a pixel is to the center of the patch, the larger its weight, and the weights follow a Gaussian distribution. In actual implementations, uniform weights are often used instead to reduce computation.
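The weighting scheme described above can be sketched in a few lines of NumPy. This is a deliberately naive, slow version for a grayscale float image, assuming uniform patch weights (the simplification mentioned above) rather than the Gaussian kernel; the function name `nl_means`, its parameters, and the step-edge test image are all made up for this example.

```python
import numpy as np

def nl_means(v, patch_radius=1, search_radius=4, h=0.1):
    """Naive NL-Means for a 2-D grayscale float image (uniform patch weights)."""
    rows, cols = v.shape
    pr, sr = patch_radius, search_radius
    # Reflect-pad so that every pixel has a full patch around it.
    padded = np.pad(v, pr, mode="reflect")
    out = np.zeros(v.shape, dtype=np.float64)

    def patch(i, j):
        # Patch V(x) centred on pixel (i, j) of the unpadded image.
        return padded[i:i + 2 * pr + 1, j:j + 2 * pr + 1]

    for i in range(rows):
        for j in range(cols):
            ref = patch(i, j)
            weight_sum = 0.0   # normalising factor Z(x)
            acc = 0.0
            # Search area: a window around (i, j), clipped at the image border.
            for y in range(max(0, i - sr), min(rows, i + sr + 1)):
                for x in range(max(0, j - sr), min(cols, j + sr + 1)):
                    # Mean squared difference between the two patches.
                    d2 = np.mean((ref - patch(y, x)) ** 2)
                    w = np.exp(-d2 / h ** 2)
                    weight_sum += w
                    acc += w * v[y, x]
            out[i, j] = acc / weight_sum
    return out

# Hypothetical test image: a vertical step edge plus Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((24, 24))
clean[:, 12:] = 1.0
noisy = clean + rng.normal(0.0, 0.1, clean.shape)
denoised = nl_means(noisy, patch_radius=1, search_radius=4, h=0.1)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(f"MSE before: {mse_noisy:.5f}, after: {mse_denoised:.5f}")
```

The four nested loops make the cost visible: every pixel is compared with every pixel of its search area, and every comparison touches a whole patch.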

Having said so much, it's time to illustrate the problem with a picture:

As shown in the figure above, p is the pixel being denoised. Because the neighborhoods of q1 and q2 are similar to that of p, the weights w(p,q1) and w(p,q2) are relatively large, while the weight w(p,q3) of point q3, whose neighborhood differs considerably, is small. Plotting the weights of all points as an image gives the following weight maps:

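In these 6 pairs of images, the left image is the original, with the white patch at its center marking the patch neighborhood of pixel $x$; the right image is the computed weight map $w(x, y)$, with weights ranging from 0 (black) to 1 (white). The patch slides across the whole image, and its similarity to every other region is computed; the higher the similarity, the larger the weight. Finally, the similar pixel values are summed with their normalized weights, which yields the denoised image.

Choosing the parameters of this algorithm also takes some care. In general, with complexity in mind, the search area is about 21x21 and the patch used for similarity comparison can be 7x7. In practice, the parameters often need to be chosen according to the noise. The larger the standard deviation $\sigma$ of the Gaussian noise, the larger the patch should be to keep the algorithm robust, and a larger patch in turn calls for a larger search area. In addition, the filter coefficient $h$ is positively correlated with $\sigma$: $h = c\sigma$, and the constant $c$ should be reduced appropriately as the patch grows.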

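The complexity of NL-Means depends closely on the image size, the number of color channels, the patch size, and the search window size. For an image of size $N \times N$ with $N_c$ color channels, a patch of size $k \times k$, and a search window of size $n \times n$, the complexity is $O(N^2 N_c k^2 n^2)$. For a 512x512 color image with k=7 and n=21, OpenCV with multithreading enabled needs tens of seconds to process one image. Although people keep improving and accelerating the algorithm, it is still far from real-time processing.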
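As a quick sanity check on that complexity, the product N² · N_c · k² · n² can simply be evaluated for the 512x512 color example. The function name below is made up for this sketch, and the count ignores constant factors such as the exponential and the final weighted sum.

```python
def nl_means_operations(image_side, channels, patch_side, search_side):
    """Approximate number of pixel differences: N^2 * Nc * k^2 * n^2."""
    return image_side ** 2 * channels * patch_side ** 2 * search_side ** 2

ops = nl_means_operations(image_side=512, channels=3, patch_side=7, search_side=21)
print(f"{ops:,} pixel differences")   # on the order of 1.7e10
```

Roughly 1.7 × 10^10 pixel differences for a single image explains why a straightforward implementation takes tens of seconds even with multithreading.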

Finally, here is the denoising result of this algorithm [3]:

   

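BM3D algorithm

BM3D (Block-matching and 3D filtering) is arguably one of the best-performing algorithms today. Its idea is somewhat similar to NL-Means, in that it also filters by finding similar blocks in the image, but it is considerably more complex; understanding NL-Means helps in understanding BM3D. BM3D has two major steps: the basic estimate (Step1) and the final estimate (Step2):

BM3D algorithm flowchart

Each of these two steps in turn consists of three smaller steps: grouping of similar blocks (Grouping), collaborative filtering (Collaborative Filtering), and aggregation (Aggregation). The flowchart above already represents this process fairly well and needs only a little explanation.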

Step1: Basic estimate

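(1) Grouping: With NL-Means as background, the search for similar blocks is easy to understand. First, reference blocks of size $k \times k$ are selected in the noisy image (to limit complexity, not every pixel is used as a reference block; typically a step of 3 pixels is used, which cuts the complexity to 1/9). A region of suitable size ($n \times n$) around each reference block is then searched for the blocks with the smallest dissimilarity, and these blocks are stacked into a 3D matrix; the stacking order has little effect on the result. The reference block itself is also included in the 3D matrix, with a dissimilarity of 0. The search for similar blocks can be expressed as:

$$G(P) = \{ Q : d(P, Q) \le \tau^{step1} \}$$

Here $d(P, Q)$ is the Euclidean distance between the two blocks, and $\tau^{step1}$ is the distance threshold below which a block $Q$ counts as similar to the reference block $P$. The matrix obtained by stacking the similar blocks is the blue R matrix at the lower left of Step1 in the flowchart.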
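The grouping step can be sketched as a plain exhaustive block search. This is only an illustration under my own assumptions: the function name, the parameters `block`, `search`, `tau` and `max_blocks`, and the choice to normalise the squared distance by the block area are all hypothetical and not taken from a reference implementation.

```python
import numpy as np

def group_similar_blocks(image, ref_row, ref_col, block=8, search=16,
                         tau=None, max_blocks=16):
    """Stack the blocks most similar to the reference block into a 3-D array."""
    rows, cols = image.shape
    ref = image[ref_row:ref_row + block, ref_col:ref_col + block]
    candidates = []
    # Scan top-left corners inside the search window, clipped to the image.
    for r in range(max(0, ref_row - search), min(rows - block, ref_row + search) + 1):
        for c in range(max(0, ref_col - search), min(cols - block, ref_col + search) + 1):
            q = image[r:r + block, c:c + block]
            # Squared Euclidean distance, normalised by the block area.
            d = np.sum((ref - q) ** 2) / block ** 2
            if tau is None or d <= tau:
                candidates.append((d, r, c))
    # Most similar first; on ties the reference block itself (d = 0) goes first.
    candidates.sort(key=lambda t: (t[0], (t[1], t[2]) != (ref_row, ref_col)))
    kept = candidates[:max_blocks]
    stack = np.stack([image[r:r + block, c:c + block] for _, r, c in kept])
    coords = [(r, c) for _, r, c in kept]
    return stack, coords

# Hypothetical usage on a random test image.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
stack, coords = group_similar_blocks(img, 8, 8, block=4, search=6, max_blocks=8)
print(stack.shape, coords[0])
```

Capping the group at a power of two (8 or 16 blocks) is convenient because the Hadamard transform used along the third dimension is defined for power-of-two lengths.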


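(2) Collaborative Filtering: Once the 3D matrices are formed, each 2D block in a 3D matrix (that is, a block from the noisy image) first undergoes a 2D transform, such as a wavelet transform or a DCT; the BIOR1.5 wavelet is commonly used. After the 2D transform, a 1D transform is applied along the third dimension of the matrix, usually the Hadamard transform. The transformed 3D matrix is then hard-thresholded: coefficients whose magnitude is below the threshold are set to 0. Finally, the inverse 1D transform along the third dimension and the inverse 2D transform yield the processed image blocks. This process can also be expressed as a formula:

$$\hat{G}^{basic}(P) = \left(T_{3D}^{hard}\right)^{-1}\Big(\Upsilon\big(T_{3D}^{hard}(G(P))\big)\Big)$$

In this formula, $G(P)$ is the 3D matrix of blocks grouped around reference block $P$, and the 2D and 1D transforms are written together as a single $T_{3D}^{hard}$. $\Upsilon$ is a thresholding operation:

$$\Upsilon(x) = \begin{cases} 0, & |x| \le \lambda_{3D}\,\sigma \\ x, & \text{otherwise} \end{cases}$$

$\sigma$ is the standard deviation of the noise and represents the noise strength; $\lambda_{3D}$ is the threshold parameter.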
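Here is a minimal NumPy sketch of the hard-thresholding step: transform, zero the small coefficients, invert. To stay self-contained it swaps in an orthonormal 2D DCT for the BIOR1.5 wavelet, and the default `lam=2.7` is an assumed threshold multiplier; all function names are made up for this example.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def hadamard_matrix(n):
    """Orthonormal Hadamard matrix; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)

def hard_threshold_group(group, sigma, lam=2.7):
    """Filter a 3-D group of shape (n_blocks, b, b); return (filtered, n_retained)."""
    n_blocks, b, _ = group.shape
    D = dct_matrix(b)
    H = hadamard_matrix(n_blocks)
    # 2-D transform of every block: D @ block @ D.T
    coeffs = np.einsum("ij,njk,lk->nil", D, group, D)
    # 1-D transform along the stacking (third) dimension.
    coeffs = np.einsum("mn,nil->mil", H, coeffs)
    # Hard threshold: zero every coefficient with magnitude <= lam * sigma.
    keep = np.abs(coeffs) > lam * sigma
    coeffs = coeffs * keep
    # Both matrices are orthonormal, so each inverse is just the transpose.
    coeffs = np.einsum("nm,nil->mil", H, coeffs)
    filtered = np.einsum("ji,njk,kl->nil", D, coeffs, D)
    return filtered, int(keep.sum())

# Hypothetical usage: eight nearly identical flat blocks with a little noise.
rng = np.random.default_rng(0)
group = 0.5 + rng.normal(0.0, 0.01, (8, 4, 4))
filtered, n_retained = hard_threshold_group(group, sigma=0.05)
print(n_retained, float(filtered.std()))
```

Because the blocks in a group are similar, most of their energy lands in a few coefficients of the 3D spectrum, while the noise is spread evenly; thresholding therefore removes mostly noise.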

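(3) Aggregation: At this point, every 2D block is an estimate of the denoised image. This step returns the blocks to their original positions; the gray value of each pixel is the weighted average of the values of all blocks covering that position, with weights that depend on the number of coefficients set to 0 and on the noise strength.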
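Aggregation amounts to a weighted average over overlapping blocks, which can be sketched with two accumulator images. The function names are made up; the weight formula 1 / (σ² · N_retained) is how I recall the first-step weight from the BM3D paper [5], so treat it as an assumption rather than a verified detail.

```python
import numpy as np

def aggregate(shape, blocks, coords, weights):
    """Weighted average of overlapping block estimates at their original positions."""
    numerator = np.zeros(shape)
    denominator = np.zeros(shape)
    for block, (r, c), w in zip(blocks, coords, weights):
        b0, b1 = block.shape
        numerator[r:r + b0, c:c + b1] += w * block
        denominator[r:r + b0, c:c + b1] += w
    # Pixels not covered by any block are left at zero in this sketch.
    covered = denominator > 0
    result = np.zeros(shape)
    result[covered] = numerator[covered] / denominator[covered]
    return result

def hard_threshold_weight(n_retained, sigma):
    """Assumed first-step weight: fewer surviving coefficients give a larger weight."""
    return 1.0 / (sigma ** 2 * n_retained) if n_retained >= 1 else 1.0

# Hypothetical usage: two overlapping 2x2 block estimates on a 2x3 image.
blocks = [np.full((2, 2), 1.0), np.full((2, 2), 3.0)]
coords = [(0, 0), (0, 1)]
weights = [1.0, 3.0]
print(aggregate((2, 3), blocks, coords, weights))
```

In the overlapping middle column the estimate is pulled towards the block with the larger weight, which is the whole point of weighting: blocks that came out of a sparser, cleaner group are trusted more.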

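Step2: Final estimate

(1) Grouping: The grouping in the second step is similar to that in the first. The difference is that this time two 3D arrays are obtained: one formed from the noisy image and one formed from the basic estimate.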

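(2) Collaborative Filtering: Both 3D matrices undergo the 2D and 1D transforms; here the 2D transform is usually a DCT, which gives better results. Wiener filtering is then used to scale the coefficients of the 3D matrix formed from the noisy image, with scaling coefficients derived from the values of the basic-estimate 3D matrix and the noise strength. This process can also be expressed as a formula:

$$\hat{G}^{final}(P) = \left(T_{3D}^{wien}\right)^{-1}\Big(w_P \cdot T_{3D}^{wien}\big(G^{noisy}(P)\big)\Big)$$

In this formula, the 2D and 1D transforms are again written together as a single $T_{3D}^{wien}$, $G^{noisy}(P)$ is the 3D matrix formed from the noisy image, and $w_P$ is the Wiener filter coefficient:

$$w_P(\xi) = \frac{\big\lvert T_{3D}^{wien}\big(G^{basic}(P)\big)(\xi) \big\rvert^{2}}{\big\lvert T_{3D}^{wien}\big(G^{basic}(P)\big)(\xi) \big\rvert^{2} + \sigma^{2}}$$

where $G^{basic}(P)$ is the 3D matrix formed from the basic estimate, $\xi$ indexes the transform coefficients, and $\sigma$ is the standard deviation of the noise, representing the noise strength.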
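The Wiener scaling itself is a one-liner once both groups are in the transform domain. The sketch below assumes the two coefficient arrays have already been produced by the same 3D transform and that sigma is greater than 0; the function name and the sample numbers are made up.

```python
import numpy as np

def wiener_shrink(noisy_coeffs, basic_coeffs, sigma):
    """Scale the noisy group's coefficients using the basic estimate's energy."""
    energy = np.abs(basic_coeffs) ** 2
    # Wiener coefficient: near 1 where the basic estimate has strong signal,
    # near 0 where it is weak relative to the noise.
    w = energy / (energy + sigma ** 2)
    return w * noisy_coeffs, w

# Hypothetical coefficients of one group (flattened for readability).
basic = np.array([0.0, 1.0, 10.0])
noisy = np.array([0.5, 1.2, 9.8])
shrunk, w = wiener_shrink(noisy, basic, sigma=1.0)
print(w)
print(shrunk)
```

Unlike the hard threshold of Step1, this is a soft decision: every coefficient is attenuated in proportion to how much signal the basic estimate suggests it contains.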

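(3) Aggregation: As in the first step, the blocks are returned to their original positions, except that the weights now depend on the Wiener filter coefficients and the noise strength.

After the final estimate, BM3D has removed the noise from the original image to a significant degree. Here is a set of results: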

   

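Most of the computation in this algorithm is still spent on searching for and matching similar blocks. With the same block size and search area as NL-Means, the complexity of BM3D is higher than that of NL-Means, probably around 3 times as high. Anyone dreaming of real-time processing can give up on that idea.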


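Algorithm comparison

Comparing algorithms inevitably requires an evaluation framework. Because human judgment is subjective and each person may rate a result differently, objective metrics are needed. The most widely used today are MSE (Mean-Squared Error) and PSNR (Peak Signal-to-Noise Ratio).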

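For two images $I$ and $K$ of size $m \times n$, the MSE is computed as:

$$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \big[ I(i, j) - K(i, j) \big]^{2}$$

This formula does not reflect the influence of the pixel value range: the same mean-squared error on an 8-bit image and on a 12-bit image is clearly not comparable. The peak signal-to-noise ratio is therefore introduced:

$$PSNR = 10 \log_{10}\left(\frac{MAX_I^{2}}{MSE}\right)$$

Here $MAX_I$ is the maximum possible pixel value; for an 8-bit image $MAX_I = 255$. PSNR is measured in decibels (dB). A higher PSNR usually indicates better quality; as a rule of thumb, a PSNR below 30 dB means the degradation is intolerable to the human eye, so most PSNR values should be above 30 dB. A high PSNR does not guarantee good image quality, however, and sometimes visual inspection is still needed to judge quality properly.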
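Both metrics take only a few lines of NumPy. The function names and the `max_value` parameter are my own choices for this sketch; the conversion to float matters because subtracting `uint8` arrays directly would wrap around.

```python
import numpy as np

def mse(reference, test):
    """Mean-squared error between two images of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    return np.mean((reference - test) ** 2)

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, test)
    if err == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / err)

# Hypothetical 8-bit images that differ by 5 gray levels everywhere.
a = np.zeros((4, 4), dtype=np.uint8)
b = np.full((4, 4), 5, dtype=np.uint8)
print(f"MSE = {mse(a, b):.1f}, PSNR = {psnr(a, b):.2f} dB")
```

For a 12-bit image the same function would be called with `max_value=4095.0`, which is exactly the range dependence that MSE alone does not capture.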

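Visual quality at different PSNR values

I computed the PSNR of the results of the two methods above against the original image, with the following results:

PSNR comparison of the two algorithms
             NL-Means    BM3D
PSNR (dB)    32.0913     33.6711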

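NL-Means and BM3D are arguably the best-performing denoising algorithms today, and BM3D even claims the highest PSNR achieved so far. The final results also show that BM3D does outperform NL-Means: it leaves less noise and recovers image detail better. On quality, BM3D wins and lives up to the label state-of-the-art. Of course, the number of samples tested here is small and may not be enough to settle the question.

Final remarks

These two are arguably the most effective image denoising algorithms at present, but both inevitably face a common problem: although computer performance has improved hundreds or thousands of times, it is still far from meeting the real-time requirements of many algorithms, which greatly limits where these algorithms can be used. Users will not accept waiting several minutes to process a single photo, so there is still some distance to go before these algorithms are truly practical. We can only hope that one day computer performance will no longer be an issue, or that the experts will come up with algorithms that are both fast and good.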


References

[1] Buades A, Coll B, Morel J M. A non-local algorithm for image denoising[C]//Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. IEEE, 2005, 2: 60-65.

[2] Buades A, Coll B, Morel J M. Nonlocal image and movie denoising[J]. International journal of computer vision, 2008, 76(2): 123-139.

[3] Antoni Buades, Bartomeu Coll, and Jean-Michel Morel, Non-Local Means Denoising, Image Processing On Line, 1 (2011). http://dx.doi.org/10.5201/ipol.2011.bcm_nlm

[4] Jacques Froment, Parameter-Free Fast Pixelwise Non-Local Means Denoising, Image Processing On Line, 4 (2014), pp. 300–326. http://dx.doi.org/10.5201/ipol.2014.120

[5] Dabov K, Foi A, Katkovnik V, et al. Image denoising by sparse 3-D transform-domain collaborative filtering[J]. Image Processing, IEEE Transactions on, 2007, 16(8): 2080-2095.

[6] http://www.cs.tut.fi/~foi/GCF-BM3D/

[7] Marc Lebrun, An Analysis and Implementation of the BM3D Image Denoising Method, Image Processing On Line, 2 (2012), pp. 175–213. http://dx.doi.org/10.5201/ipol.2012.l-bm3d

[8] Measures of image quality
