Understanding the Discrete Fourier Transform in OpenCV

For background on the Fourier transform and what it means, see: https://blog.csdn.net/guyuealian/article/details/72817527?locationNum=9&fps=1

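After reading the article linked above, you will see that working with an image's frequency content is much simpler in the frequency domain. Each point in the spectrum corresponds to one frequency, and its value is the amplitude of the sinusoidal component at that frequency; the image is the superposition of all these components. So to remove a frequency from the image, it is enough to suppress the amplitude stored at that frequency. Filters work exactly this way: they pick a cutoff frequency and attenuate the components on one side of it.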
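To make the link between a frequency and its amplitude concrete, here is a minimal standard-library sketch of a naive 1D DFT. It is for illustration only and has nothing to do with how OpenCV computes the transform; the names naiveDft and cosineSignal are hypothetical helpers, not OpenCV functions.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(N^2) forward DFT of a real signal:
// X[k] = sum over n of x[n] * exp(-2*pi*i*k*n/N).
std::vector<std::complex<double>> naiveDft(const std::vector<double>& x) {
    const std::size_t N = x.size();
    assert(N > 0);
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> X(N);  // value-initialized to 0
    for (std::size_t k = 0; k < N; ++k) {
        for (std::size_t n = 0; n < N; ++n) {
            // (k * n) % N keeps the angle small without changing the result.
            const double angle =
                -2.0 * pi * static_cast<double>((k * n) % N) / static_cast<double>(N);
            X[k] += x[n] * std::polar(1.0, angle);
        }
    }
    return X;
}

// Hypothetical helper: N samples of a cosine that completes `cycles` periods.
std::vector<double> cosineSignal(std::size_t N, std::size_t cycles) {
    assert(N > 0);
    const double pi = std::acos(-1.0);
    std::vector<double> x(N);
    for (std::size_t n = 0; n < N; ++n) {
        x[n] = std::cos(2.0 * pi * static_cast<double>((cycles * n) % N) /
                        static_cast<double>(N));
    }
    return x;
}
```

For naiveDft(cosineSignal(16, 3)), the only bins with non-negligible magnitude are bin 3 and its mirror, bin 13, each with magnitude N/2 = 8. Zeroing those two bins would remove the cosine from the signal, which is the idea behind frequency-domain filtering.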

The dft function:

void cv::dft(InputArray src,
             OutputArray dst,
             int flags = 0,
             int nonzeroRows = 0)

Performs a forward or inverse Discrete Fourier transform of a 1D or 2D floating-point array.

Parameters

src	input array that could be real or complex.
dst	output array whose size and type depends on the flags.
flags	transformation flags, representing a combination of the cv::DftFlags.
nonzeroRows	when the parameter is not zero, the function assumes that only the first nonzeroRows rows of the input array (DFT_INVERSE is not set) or only the first nonzeroRows of the output array (DFT_INVERSE is set) contain non-zeros, thus, the function can handle the rest of the rows more efficiently and save some time; this technique is very useful for calculating array cross-correlation or convolution using DFT.

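A note on nonzeroRows:

This parameter only pays off when the matrix has the special structure described above, with data in the leading rows and nothing but zeros in the rest, as happens when computing a convolution with the DFT.

So unless your matrix M is special in this way (many consecutive all-zero rows), there is nothing to gain. You can leave the parameter at its default of 0, which processes every row; passing M.rows, as some examples do, asks for the same thing.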

The commonly used flags are:

DFT_INVERSE

Python: cv.DFT_INVERSE

performs an inverse 1D or 2D transform instead of the default forward transform.

DFT_SCALE

Python: cv.DFT_SCALE

scales the result: divide it by the number of array elements (that is, multiply by 1/N). Normally, it is combined with DFT_INVERSE.

DFT_ROWS

Python: cv.DFT_ROWS

performs a forward or inverse transform of every individual row of the input matrix; this flag enables you to transform multiple vectors simultaneously and can be used to decrease the overhead (which is sometimes several times larger than the processing itself) to perform 3D and higher-dimensional transformations and so forth.
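The reason DFT_SCALE usually accompanies DFT_INVERSE (in OpenCV the two are combined as DFT_INVERSE | DFT_SCALE) is that an unscaled forward transform followed by an unscaled inverse returns the input multiplied by N. The following standard-library sketch shows this on a 1D signal. It is a naive O(N^2) illustration, not OpenCV's implementation, and dft1d with its two bool parameters is a hypothetical stand-in for the flags.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive 1D DFT. `inverse` flips the sign of the exponent (like DFT_INVERSE);
// `scale` divides the result by N (like DFT_SCALE).
std::vector<std::complex<double>> dft1d(const std::vector<std::complex<double>>& in,
                                        bool inverse, bool scale) {
    const std::size_t N = in.size();
    assert(N > 0);
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<std::complex<double>> out(N);  // value-initialized to 0
    for (std::size_t k = 0; k < N; ++k) {
        for (std::size_t n = 0; n < N; ++n) {
            const double angle =
                sign * 2.0 * pi * static_cast<double>((k * n) % N) / static_cast<double>(N);
            out[k] += in[n] * std::polar(1.0, angle);
        }
    }
    if (scale) {
        for (std::complex<double>& v : out) v /= static_cast<double>(N);
    }
    return out;
}
```

With the input {1, 2, 3, 4}, the forward transform followed by the scaled inverse gives back {1, 2, 3, 4}, while the unscaled inverse gives {4, 8, 12, 16}.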

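One point deserves special mention: the result of a Fourier transform is complex. Each value of the source image therefore produces two values in the result, a real part and an imaginary part. There is a second issue as well: the range of frequency-domain values is far larger than the range of spatial-domain values, so the spectrum has to be stored in at least float precision. For both reasons, we convert the image to floating point and add a second channel to hold the complex values.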

That covers the Fourier transform function dft().

Next, consider this passage from the OpenCV documentation:

DFT performance is not a monotonic function of a vector size. Therefore, when you calculate convolution of two arrays or perform the spectral analysis of an array, it usually makes sense to pad the input data with zeros to get a bit larger array that can be transformed much faster than the original one. Arrays whose size is a power-of-two (2, 4, 8, 16, 32, ...) are the fastest to process. Though, the arrays whose size is a product of 2's, 3's, and 5's (for example, 300 = 5*5*3*2*2) are also processed quite efficiently.

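The key point of this passage is that DFT performance is not a monotonic function of the vector size. When computing the convolution of two arrays or analyzing the spectrum of an array, it is therefore common to pad the input with zeros: the padded array is slightly larger than the original but can be transformed much faster. The reason is given at the end of the passage: arrays whose size is a power of two (2, 4, 8, 16, 32, ...) are the fastest to process, and arrays whose size is a product of 2's, 3's and 5's (for example, 300 = 5 * 5 * 3 * 2 * 2) are also processed efficiently.

So the input matrix is zero-padded purely for speed, bringing each dimension up to a power of two or a product of 2's, 3's and 5's. OpenCV provides a function that returns a suitable size for a given dimension: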

int cv::getOptimalDFTSize ( int vecsize )

Returns the optimal DFT size for a given vector size.

Note:

The function returns a negative number if vecsize is too large (very close to INT_MAX).

Parameters

vecsize vector size.
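To illustrate what "a product of 2's, 3's and 5's" means in practice, the sketch below searches upward from a given size for the first number with no other prime factors. This is a brute-force illustration of the idea only; it is not how getOptimalDFTSize is implemented, and the names isProductOf235 and nextFastDftSize are hypothetical.

```cpp
#include <cassert>

// True if n has no prime factors other than 2, 3 and 5.
bool isProductOf235(int n) {
    assert(n > 0);
    const int primes[] = {2, 3, 5};
    for (int p : primes) {
        while (n % p == 0) n /= p;
    }
    return n == 1;
}

// Smallest size >= vecsize that is a product of 2's, 3's and 5's.
// Assumption of this sketch: vecsize <= 2^30, so the search cannot overflow
// (2^30 itself qualifies and ends the loop at the latest).
int nextFastDftSize(int vecsize) {
    assert(vecsize > 0 && vecsize <= (1 << 30));
    int n = vecsize;
    while (!isProductOf235(n)) ++n;
    return n;
}
```

For example, nextFastDftSize(300) returns 300 because 300 already qualifies, nextFastDftSize(7) returns 8, and nextFastDftSize(481) returns 486 = 2 * 3^5.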

A second function, copyMakeBorder(), pads a matrix with a border:

copyMakeBorder()

void cv::copyMakeBorder(InputArray src,
                        OutputArray dst,
                        int top,
                        int bottom,
                        int left,
                        int right,
                        int borderType,
                        const Scalar& value = Scalar())

The function copies the source image into the middle of the destination image. The areas to the left, to the right, above and below the copied source image will be filled with extrapolated pixels. When src is already in the middle of dst, the function does not copy src itself but simply constructs the border.

Parameters

src	Source image.
dst	Destination image of the same type as src and the size Size(src.cols+left+right, src.rows+top+bottom).
top	Number of pixel rows to add above the source image.
bottom	Number of pixel rows to add below the source image.
left	Number of pixel columns to add to the left of the source image.
right	Number of pixel columns to add to the right of the source image. Together, top, bottom, left and right specify how many pixels in each direction from the source image rectangle to extrapolate. For example, top=1, bottom=1, left=1, right=1 mean that 1 pixel-wide border needs to be built.
borderType	Border type. See borderInterpolate for details. BORDER_CONSTANT is a common choice.
value	Border value if borderType==BORDER_CONSTANT. The default, Scalar(), is 0.

At this point, getOptimalDFTSize and copyMakeBorder together turn the input matrix into one whose size is suitable for fast processing.
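The padding used for the DFT adds rows only at the bottom and columns only at the right (top = 0 and left = 0 in copyMakeBorder), so the original pixels keep their coordinates. Here is a minimal standard-library sketch of that layout, assuming a rectangular image stored as a vector of rows; the Image alias and padBottomRight are hypothetical names, not OpenCV API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<float>>;  // assumed rectangular

// Copies src into the top-left corner of a rows x cols image filled with 0.
Image padBottomRight(const Image& src, std::size_t rows, std::size_t cols) {
    assert(!src.empty());
    assert(rows >= src.size() && cols >= src[0].size());
    Image dst(rows, std::vector<float>(cols, 0.0f));
    for (std::size_t r = 0; r < src.size(); ++r) {
        for (std::size_t c = 0; c < src[r].size(); ++c) {
            dst[r][c] = src[r][c];
        }
    }
    return dst;
}
```

Padding a 2 x 3 image to 4 x 4 this way leaves the six original values where they were and fills the new last column and the two new bottom rows with zeros.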

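The next step is to compute the value associated with each frequency in the spectrum, that is, the magnitude. OpenCV provides the function magnitude() for computing the magnitude of 2D vectors: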

magnitude()

void cv::magnitude(InputArray x,
                   InputArray y,
                   OutputArray magnitude)

Parameters

x	floating-point array of x-coordinates of the vectors. Here, the real part.
y	floating-point array of y-coordinates of the vectors; it must have the same size as x. Here, the imaginary part.
magnitude	output array of the same size and type as x.

The magnitude is computed element by element as:

magnitude(I) = sqrt(x(I)^2 + y(I)^2)
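As a small standard-library sketch of the same per-element computation, sqrt(re^2 + im^2), the function below takes separate real and imaginary arrays, mirroring the x and y arguments of magnitude(). The name magnitudes is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise magnitude of complex values given as separate real/imaginary arrays.
std::vector<float> magnitudes(const std::vector<float>& re, const std::vector<float>& im) {
    assert(re.size() == im.size());
    std::vector<float> out(re.size());
    for (std::size_t i = 0; i < re.size(); ++i) {
        out[i] = std::hypot(re[i], im[i]);  // sqrt(re^2 + im^2)
    }
    return out;
}
```

For real parts {3, 0, -5} and imaginary parts {4, 0, 12}, the magnitudes are 5, 0 and 13.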

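With the magnitudes computed, we need to display the result of the Fourier transform. This raises a problem: as noted earlier, the transformed values span a very large range, so the magnitudes do too, and they cannot be displayed sensibly on screen as they are. We therefore switch from a linear scale to a logarithmic scale, which compresses the range of the magnitudes and makes the variation between neighboring values visible.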

This uses the array version of the function log(), which computes the natural logarithm of every element:

log()

void cv::log(InputArray src, OutputArray dst)
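The log-scale step can be sketched with the standard library as follows. The value 1 is added before taking the logarithm so that a magnitude of 0 maps to 0 instead of negative infinity. The name logScale is hypothetical, and std::log1p(m) is used here as an equivalent of log(1 + m).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Maps each magnitude m to log(1 + m), compressing a large dynamic range.
std::vector<float> logScale(const std::vector<float>& mag) {
    std::vector<float> out(mag.size());
    for (std::size_t i = 0; i < mag.size(); ++i) {
        out[i] = std::log1p(mag[i]);
    }
    return out;
}
```

Magnitudes of 0 and 1,000,000 differ by six orders of magnitude on a linear scale; after this mapping they become 0 and roughly 13.8.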

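We are almost done, but two details remain. First, the quadrant rearrangement splits the spectrum into four equal parts, so a spectrum with an odd number of rows or columns is cropped to even dimensions beforehand. Second, the quadrants are swapped: top-left with bottom-right, and top-right with bottom-left, so that what used to be the four corners meets at the center. Why do this? Most of an image's energy is concentrated in the low frequencies, which are the components with large magnitude; the high frequencies have small magnitude and carry little energy, so the low-frequency region is the bright part of the spectrum. In the raw dft() output, with the origin (0,0) at the top-left corner, the low frequencies sit at the four corners and the high frequencies at the center. Swapping the quadrants moves the zero-frequency component to the center of the image, which is the conventional and more readable layout. Note that the swap does not remove the zero padding added earlier; the padding only changes the size of the spectrum.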
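The quadrant swap can be sketched with the standard library as below, assuming a rectangular image with an even number of rows and columns stored as a vector of rows. The Image alias and swapQuadrants are hypothetical names; the logic is the same diagonal exchange described above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Image = std::vector<std::vector<float>>;  // assumed rectangular

// Swaps top-left with bottom-right and top-right with bottom-left, in place.
void swapQuadrants(Image& img) {
    assert(!img.empty());
    assert(img.size() % 2 == 0 && img[0].size() % 2 == 0);
    const std::size_t cy = img.size() / 2;
    const std::size_t cx = img[0].size() / 2;
    for (std::size_t y = 0; y < cy; ++y) {
        for (std::size_t x = 0; x < cx; ++x) {
            std::swap(img[y][x], img[y + cy][x + cx]);  // top-left <-> bottom-right
            std::swap(img[y][x + cx], img[y + cy][x]);  // top-right <-> bottom-left
        }
    }
}
```

On a 4 x 4 image, the value at the top-left corner (0, 0) ends up at (2, 2), so the zero-frequency term of a spectrum lands at the center.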

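One last step remains: normalization. The log-scaled magnitudes still fall outside the range [0, 1] that imshow() expects for a floating-point image, so they are rescaled into that range.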

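We use the normalize() function, in its dense-array version:

normalize()

void cv::normalize(InputArray src,
                   InputOutputArray dst,
                   double alpha = 1,
                   double beta = 0,
                   int norm_type = NORM_L2,
                   int dtype = -1,
                   InputArray mask = noArray())

Parameters

src	input array.
dst	output array of the same size as src.
alpha	norm value to normalize to or the lower range boundary in case of the range normalization. Here, 0.
beta	upper range boundary in case of the range normalization; it is not used for the norm normalization. Here, 1.
norm_type	normalization type (see cv::NormTypes). Here, NORM_MINMAX.
dtype	when negative, the output array has the same type as src.
mask	optional operation mask.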
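Range (min-max) normalization maps the smallest input value to the lower bound and the largest to the upper bound, linearly in between. A minimal standard-library sketch follows; normalizeMinMax is a hypothetical name, and the handling of a constant input (everything mapped to the lower bound) is an assumption of this sketch, not a statement about what OpenCV does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Linearly rescales src so that its minimum maps to lo and its maximum to hi.
std::vector<float> normalizeMinMax(const std::vector<float>& src, float lo, float hi) {
    assert(!src.empty());
    const auto mm = std::minmax_element(src.begin(), src.end());
    const float mn = *mm.first;
    const float mx = *mm.second;
    std::vector<float> dst(src.size(), lo);
    if (mx == mn) return dst;  // constant input: this sketch maps everything to lo
    const float scale = (hi - lo) / (mx - mn);
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = lo + (src[i] - mn) * scale;
    }
    return dst;
}
```

With the range [0, 1], the input {2, 4, 6} becomes {0, 0.5, 1}.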

Once all of this is done, display the result.

Code example:

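#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <cstdio>
using namespace cv;

int main()
{
	// Read the image as grayscale and display it
	Mat srcImage = imread("1.jpg", IMREAD_GRAYSCALE);
	if (srcImage.empty())
	{
		printf("error: could not read 1.jpg\n");
		return -1;
	}
	imshow("Original image", srcImage);
	// Find the optimal DFT size for each dimension
	int m = getOptimalDFTSize(srcImage.rows);
	int n = getOptimalDFTSize(srcImage.cols);
	// Expand the image to that size; the added bottom/right border is filled with 0
	Mat padded;
	copyMakeBorder(srcImage, padded, 0, m - srcImage.rows, 0, n - srcImage.cols,
		BORDER_CONSTANT, Scalar::all(0));
	// Allocate storage for the real and imaginary parts of the result:
	// plane 0 holds the image converted to float, plane 1 starts as all zeros
	Mat planes[] = {
		Mat_<float>(padded),
		Mat::zeros(padded.size(), CV_32F)
	};
	// Merge the two planes into one two-channel (complex) array
	Mat complexI;
	merge(planes, 2, complexI);
	// Perform the discrete Fourier transform in place
	dft(complexI, complexI);
	// Convert the complex values to magnitudes: first split the channels
	split(complexI, planes);
	magnitude(planes[0], planes[1], planes[0]);
	Mat magnitudeImage = planes[0];
	// Switch to a logarithmic scale: log(1 + magnitude)
	magnitudeImage += Scalar::all(1);
	log(magnitudeImage, magnitudeImage);
	// Crop the spectrum so that it has an even number of rows and columns;
	// AND-ing a size with -2 clears its lowest bit, giving the largest even number not exceeding it
	magnitudeImage = magnitudeImage(Rect(0, 0, magnitudeImage.cols & -2, magnitudeImage.rows & -2));
	// Rearrange the quadrants so that the four corners meet at the center
	int cx = magnitudeImage.cols / 2;
	int cy = magnitudeImage.rows / 2;
	Mat q0(magnitudeImage, Rect(0, 0, cx, cy));   // top-left
	Mat q1(magnitudeImage, Rect(cx, 0, cx, cy));  // top-right
	Mat q2(magnitudeImage, Rect(0, cy, cx, cy));  // bottom-left
	Mat q3(magnitudeImage, Rect(cx, cy, cx, cy)); // bottom-right
	// Swap top-left with bottom-right, then top-right with bottom-left
	Mat tmp;
	q0.copyTo(tmp);
	q3.copyTo(q0);
	tmp.copyTo(q3);
	q1.copyTo(tmp);
	q2.copyTo(q1);
	tmp.copyTo(q2);
	// Normalize to floating-point values in [0, 1] so the matrix can be displayed
	normalize(magnitudeImage, magnitudeImage, 0, 1, NORM_MINMAX);
	// Display the result
	imshow("Spectrum magnitude", magnitudeImage);
	waitKey();
	destroyAllWindows();
	return 0;
}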

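Finally, if you find any mistakes or inaccuracies in this article, corrections are welcome, so that we can all improve together!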