This article can be seen as an example of applying determinants and eigenvalues, but I will only talk about the parts I am interested in, namely discrete source-channel models and the diagonalization of circulant matrices.
Markov matrix
This matrix arises from the definition of probability in probability theory, so each element is a non-negative probability value. A Markov matrix, also known as a probability matrix or transition probability matrix, is conventionally written in the English mathematics literature using row vectors of probabilities and right stochastic matrices, so we refer to the definition given on Wikipedia:
A stochastic matrix describes a Markov chain $X_t$ over a finite state space $S$. If the probability of moving from state $i$ to state $j$ within one time step is $Pr(j \mid i)=P_{i,j}$, then the element in row $i$, column $j$ of the stochastic matrix $P$ is given by $P_{i,j}$:
$$P=\begin{bmatrix}P_{1,1}&P_{1,2}&\dots &P_{1,j}&\dots &P_{1,S}\\P_{2,1}&P_{2,2}&\dots &P_{2,j}&\dots &P_{2,S}\\\vdots &\vdots &\ddots &\vdots &\ddots &\vdots \\P_{i,1}&P_{i,2}&\dots &P_{i,j}&\dots &P_{i,S}\\\vdots &\vdots &\ddots &\vdots &\ddots &\vdots \\P_{S,1}&P_{S,2}&\dots &P_{S,j}&\dots &P_{S,S}\end{bmatrix}$$
Properties
An obvious conclusion is that the matrix $P$ has eigenvalue $1$ (for a right stochastic matrix, where each row sums to 1; if instead each column sums to 1, just verify with the transpose, by the same reasoning), that is,

$$Px=1\cdot x$$

where $x$ is the all-ones vector. This can be verified from the completeness and additivity of probability; in other words, since a transition must occur, each row of probabilities sums to 1.
When transferring from one probability space to another, the completeness, non-negativity, and countable additivity of probability still hold. Therefore, if $A, B$ are two $n\times n$ transition matrices, their product (a composed transition), powers (self-evolution), and arithmetic mean are still transition matrices, namely

$$AB,\qquad A^n,\qquad \frac{A+B}{2}$$
Now let us think about $A^n$ as $n \rightarrow \infty$. It can be shown (for a regular transition matrix, by the Perron–Frobenius theory) that the largest eigenvalue of $A$ is 1 and the remaining eigenvalues have absolute value less than 1, so $A^n$ will indeed tend to the steady state represented by $\lambda=1$.
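A quick numerical sketch of this convergence; the 2-state matrix here is my own illustrative example, not from the text above:

```python
import numpy as np

# A hypothetical 2-state right stochastic matrix (each row sums to 1)
A = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# The all-ones vector is a right eigenvector with eigenvalue 1
assert np.allclose(A @ np.ones(2), np.ones(2))

# A^n converges to a rank-one matrix: every row becomes the stationary distribution
An = np.linalg.matrix_power(A, 100)
assert np.allclose(An[0], An[1])
assert np.allclose(An[0], [0.8, 0.2])   # the steady state of this particular chain
```

The second eigenvalue of this $A$ is $0.5$, so $A^n$ contracts toward the $\lambda=1$ steady state at rate $0.5^n$.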
discrete source channel model
Friends who have studied "Information Theory" will find this form of transition matrix familiar, so let us take a discrete source-channel model as an example. To be precise, a source can be described by a probability vector: the so-called source distribution assigns each transmitted symbol a certain probability. For example, the row vector $P_X$ describes a ternary, equiprobable source distribution, that is,

$$P_X=\begin{bmatrix} 1/3 & 1/3 & 1/3 \end{bmatrix}$$
Suppose we face such a channel transition model; then the corresponding channel transition matrix is
$$P_{ij}=\begin{bmatrix} 0.8 & 0.2 \\ 0.5 & 0.5 \\ 0.2 & 0.8 \end{bmatrix}\qquad \text{row }i\ (\text{probability space of }X)\ \rightarrow\ \text{column }j\ (\text{probability space of }Y)$$
then the sink (output) probability distribution is

$$P_Y=P_XP_{ij}=\begin{bmatrix} 1/3 & 1/3 & 1/3 \end{bmatrix}\begin{bmatrix} 0.8 & 0.2 \\ 0.5 & 0.5 \\ 0.2 & 0.8 \end{bmatrix}=\begin{bmatrix} 0.5 & 0.5 \end{bmatrix}$$
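This matrix product is easy to check numerically; a minimal sketch with numpy, using the distributions above:

```python
import numpy as np

P_X = np.array([1/3, 1/3, 1/3])          # ternary equiprobable source
P_ch = np.array([[0.8, 0.2],
                 [0.5, 0.5],
                 [0.2, 0.8]])            # channel transition matrix, rows sum to 1

# Sink distribution: a row vector times the transition matrix
P_Y = P_X @ P_ch
assert np.allclose(P_Y, [0.5, 0.5])
```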
But in practice we often face this problem: what is the capacity $C$?

$$C=\max\,[H(Y)-H(Y|X)]$$
and it is known that

$$H(Y) \le H(0.5, 0.5)=1\ \text{bit},\qquad H(Y|X) \ge H(0.2,0.8)=0.7219\ \text{bits}$$
Both bounds are attained simultaneously, and the corresponding source distribution is

$$P_X=\begin{bmatrix} 1/2 & 0 & 1/2 \end{bmatrix}$$
So the channel capacity equals

$$C = H(0.5,0.5)-H(0.2,0.8)=0.2781\ \text{bits}$$
This matches intuition: the further a channel's transition probabilities deviate from 0.5, the better, while transmission with equiprobable errors (the 0.5/0.5 row) conveys no information.
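A small script can confirm the capacity claim under the stated optimal input; the helper `H` below is my own, a plain entropy-in-bits function:

```python
import numpy as np

def H(*p):
    """Entropy in bits of a probability vector (zero terms contribute nothing)."""
    q = np.array(p, dtype=float)
    q = q[q > 0]
    return float(-np.sum(q * np.log2(q)))

P_ch = np.array([[0.8, 0.2],
                 [0.5, 0.5],
                 [0.2, 0.8]])

# With P_X = [1/2, 0, 1/2] the useless equiprobable-error row is never used:
# H(Y) attains its maximum of 1 bit and H(Y|X) its minimum of H(0.2, 0.8)
P_X = np.array([0.5, 0.0, 0.5])
P_Y = P_X @ P_ch
HY = H(*P_Y)                                             # = H(0.5, 0.5) = 1 bit
HYgX = sum(px * H(*row) for px, row in zip(P_X, P_ch))   # = H(0.2, 0.8)
C = HY - HYgX
assert abs(C - 0.2781) < 1e-3
```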
Fourier matrix
Choosing a basis
The Fourier transform is a classic operation in signal processing; compared with the wavelet transform, the difference lies in the choice of basis functions. So which choice of basis functions makes life comfortable? One answer is to choose an orthonormal basis. Then, when we arrange these basis vectors together to form a transformation matrix, the matrix is unitary: its inverse is its conjugate transpose.
The old professor said, "I need infinitely many, because my space is infinite dimensional." That is to say, an infinite-dimensional space must be described by infinitely many basis vectors. For an $n$-dimensional space, accordingly, we use $n$ orthonormal basis vectors, that is,
$$v=x_1q_1+x_2q_2+\cdots+x_iq_i+\cdots+x_nq_n$$
where $x_i$ is the expansion coefficient on the corresponding basis vector $q_i$. How do we obtain this coefficient? Here the ingenuity of orthonormality shows itself: multiply both sides by the conjugate transpose vector,

$$q_i^Hv=x_1q_i^Hq_1+x_2q_i^Hq_2+\cdots+x_iq_i^Hq_i+\cdots+x_nq_i^Hq_n=x_i$$

All cross terms give 0 because the basis vectors are mutually orthogonal, and the remaining term gives 1 because each vector is normalized, so the coefficient $x_i$ is obtained directly.
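A numerical sketch of this coefficient extraction; the basis here is an arbitrary orthonormal one obtained via QR factorization, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# Columns of Q form an orthonormal basis q_1, ..., q_n (via QR factorization)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
assert np.allclose(Q.conj().T @ Q, np.eye(n))

v = rng.standard_normal(n)
x = Q.conj().T @ v          # each coefficient is just an inner product: x_i = q_i^H v

# The coefficients reconstruct v: v = x_1 q_1 + ... + x_n q_n
assert np.allclose(Q @ x, v)
```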
It is worth noting that orthogonality is extremely useful. In signal analysis, the familiar Fourier series uses orthogonal trigonometric functions as its basis, namely $\sin(nx)$ and $\cos(nx)$; one can verify that the products of distinct basis functions integrate to 0 over a single period, i.e., they are orthogonal. This same idea of orthogonality is also the principle behind OFDM subcarrier selection.
Fourier matrix
Refer to Wikipedia for the definition of the Fourier matrix.
The $N$-point discrete Fourier transform can be expressed as an $N\times N$ matrix multiplication, $X = Fx$, where $x$ is the original input signal and $X$ is the output signal obtained through the discrete Fourier transform. The $N\times N$ transformation matrix $F$ can be defined as $F=(\omega ^{ij})_{i,j=0,\ldots ,N-1}/\sqrt{N}$, i.e.
$$F={\frac {1}{\sqrt {N}}}\begin{bmatrix}1&1&1&1&\cdots &1\\1&\omega &\omega ^{2}&\omega ^{3}&\cdots &\omega ^{N-1}\\1&\omega ^{2}&\omega ^{4}&\omega ^{6}&\cdots &\omega ^{2(N-1)}\\1&\omega ^{3}&\omega ^{6}&\omega ^{9}&\cdots &\omega ^{3(N-1)}\\\vdots &\vdots &\vdots &\vdots &&\vdots \\1&\omega ^{N-1}&\omega ^{2(N-1)}&\omega ^{3(N-1)}&\cdots &\omega ^{(N-1)(N-1)}\end{bmatrix}$$
where $\omega$ (written $w$ below) is the principal $N$-th root of unity, that is,

$$w=e^{\frac{-2\pi i}{N}}$$
Note that column $j$ of $F$ is the vector

$$q_j=\begin{bmatrix} w^{\alpha \times (j-1)} \end{bmatrix}^T/\sqrt{N},\qquad \alpha=0,1,2,\cdots,N-1$$
Now take another column $q_{j+k}$ and form the inner product of the two:

$$q_{j+k}^Hq_{j}=\frac{1}{N} \sum_{\alpha = 0}^{N-1} w^{\alpha \times (j-1)}w^{-\alpha \times (j+k-1)}= \frac{1}{N}\sum_{\alpha = 0}^{N-1}w^{-\alpha k}$$
When $k=0$, this is the squared modulus of the column itself; the inner product equals 1, which verifies normalization.
When $k \ne 0$, we know that $w^{-k}$ is an $N$-th root of unity, and the index $\alpha$ acts as a rotation around the unit circle. Take $-k=1$ as an example: as $\alpha$ runs from $0$ to $N-1$, the $N$ terms are exactly the $N$ $N$-th roots of unity on the unit circle. So the question becomes: why do these $N$ roots of unity sum to $0$? Baidu Encyclopedia gives three explanations:
A basic method is geometric-series summation, that is, $\sum_{\alpha = 0}^{N-1} w^{\alpha} = \frac{1-w^N}{1-w}=0$, since $w^N=1$.
The second proof: the roots form the vertices of a regular polygon in the complex plane, and by symmetry the polygon's center of gravity is at the origin. The last proof uses Vieta's formulas relating a polynomial's roots to its coefficients: in the cyclotomic equation $x^N-1=0$ the coefficient of the $x^{N-1}$ term is zero, so the roots sum to zero.
Whichever explanation is adopted, the conclusion is the same: when $k \ne 0$, the inner product is 0, which verifies orthogonality. Therefore the Fourier matrix is a unitary matrix, and an excellent property follows, namely

$$F^{-1}=F^H$$
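This unitarity is easy to verify numerically for a small $N$; a sketch assuming the $w^{jk}/\sqrt{N}$ definition above:

```python
import numpy as np

N = 8
k = np.arange(N)
# Fourier matrix with entries w^(jk) / sqrt(N), w = exp(-2*pi*i/N)
F = np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)

# Columns are orthonormal, so F is unitary: its inverse is its conjugate transpose
assert np.allclose(F.conj().T @ F, np.eye(N))
assert np.allclose(np.linalg.inv(F), F.conj().T)
```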
Explanation of the FFT overhead $N\log_2N/2$
First, from a theoretical point of view. In digital signal processing there are two typical ways of organizing the FFT: decimation in time (DIT) and decimation in frequency (DIF). In both, each reordering stage halves the amount of computation.
In DIT, the input sequence $x(n)$ is sorted by even/odd index, and the output $X(k)$ comes out split into a first half and a second half.
In DIF, the input sequence $x(n)$ is split into a first half and a second half, and the output $X(k)$ comes out split into even and odd indices.
It is also known that the DFT operation is defined as

$$X(k)=\sum_{n=0}^{N-1} x(n)w_{N}^{nk},\qquad k = 0,1,\cdots,N-1$$
Among the operations, multiplication carries the largest cost. For an $N$-point DFT, each output point costs $N$ multiplications and there are $N$ output points, so the cost is estimated as $N^2$. The FFT instead converts the transform into $\log_2N$ levels of butterfly operations, and the multiplication cost of each level is $N/2$ (every two coefficients can share one twiddle factor, saving half the work), so the total cost is $N\log_2N/2$.
Now let us understand the above process from the perspective of matrix factorization; here is the step that converts a 64-point FFT into two 32-point FFTs:
$$F_{64}=\begin{bmatrix} I &D_{32} \\ I &-D_{32} \end{bmatrix}\begin{bmatrix} F_{32} &0 \\ 0 & F_{32} \end{bmatrix} \begin{bmatrix} \colorbox{yellow}1& 0 &0 &0 &\cdots & 0 & 0\\ 0 & 0 & \colorbox{yellow}1 & 0 & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & \colorbox{yellow}1 & 0 \\ 0 & \colorbox{aqua}1 & 0 & 0 &\cdots & 0 & 0 \\ 0 & 0 & 0 & \colorbox{aqua}1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 0 & \colorbox{aqua}1 \end{bmatrix} \quad \colorbox{yellow}{odd rows}\ \ \colorbox{aqua}{even rows}$$
where $D_{32}$ is a diagonal correction matrix whose elements are the familiar twiddle factors $w$; one can recognize it as the correction coefficient in the butterfly operation, that is,

$$D_{32} = \begin{bmatrix} w^0 & 0 & 0 & \cdots & 0 \\ 0 & w^1 & 0 & \cdots & 0 \\ 0 & 0 & w^2 & \cdots & 0 \\ \vdots & \vdots &\vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & w^{31} \end{bmatrix}$$
The rightmost matrix is the permutation matrix $P_{64}$, which performs the parity sorting. Observing this process, we find that the added multiplication overhead sits mainly in $D_{32}$: the computation at this stage becomes two $F_{32}$ multiplications plus one $D_{32}$ multiplication. Continuing in the same way, $F_{32}$ can be further decomposed into $F_{16}, \cdots$, and the resulting basic building block is $F_2$, shown as follows
$$\begin{bmatrix} I &D_{32} \\ I &-D_{32} \end{bmatrix} \begin{bmatrix} \begin{matrix} I & D_{16} \\ I & -D_{16} \end{matrix} & \text{\Large 0} \\ \text{\Large 0} & \begin{matrix} I & D_{16} \\ I & -D_{16} \end{matrix} \end{bmatrix} \cdots \begin{bmatrix} \begin{matrix} I & D_{1} & \\ I & -D_{1} & \\ & & \ddots \end{matrix} &\text{\Large 0} \\ \text{\Large 0} & \begin{matrix} I & D_{1} \\ I & -D_{1} \end{matrix} \end{bmatrix} \begin{bmatrix} \begin{matrix} F_2& \\& \ddots \end{matrix} & \text{\Large 0} \\ \text{\Large 0} & \begin{matrix} F_{2} \end{matrix} \end{bmatrix} \begin{bmatrix} \begin{matrix} P_2& \\& \ddots \end{matrix} & \text{\Large 0} \\ \text{\Large 0} & \begin{matrix} P_{2} \end{matrix} \end{bmatrix} \cdots \begin{bmatrix}P_{32} & 0 \\ 0 & P_{32} \end{bmatrix} \begin{bmatrix}P_{64} \end{bmatrix}$$
It can be seen that the correction matrices come in $\log_2N$ levels, and each level involves $N/2$ multiplications, so the final cost is $N\log_2N/2$. To feel the reduction in computation intuitively, take $N=1024$ as an example:

$$\frac{N^2}{N\log_2N/2}=\frac{1024^2}{1024\times 5} \approx 205$$
It can be seen that the FFT is what makes high-performance Fourier-transform processing possible.
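The single factorization step $F_N = \begin{bmatrix}I&D\\I&-D\end{bmatrix}\begin{bmatrix}F_{N/2}&0\\0&F_{N/2}\end{bmatrix}P_N$ can be checked numerically. Here is a sketch for a small $N=8$ (the 64-point step works the same way), using the unnormalized DFT matrix, i.e., ignoring the $1/\sqrt{N}$ factor:

```python
import numpy as np

def dft(n):
    """Unnormalized DFT matrix with entries w^(jk), w = exp(-2*pi*i/n)."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n)

N = 8                                                 # small stand-in for N = 64
h = N // 2
D = np.diag(np.exp(-2j * np.pi * np.arange(h) / N))   # twiddle factors w^0 .. w^{h-1}
I = np.eye(h)
B = np.block([[I, D], [I, -D]])                       # butterfly / correction stage
FF = np.block([[dft(h), np.zeros((h, h))],
               [np.zeros((h, h)), dft(h)]])           # two half-size DFTs

P = np.zeros((N, N))                                  # even-odd permutation P_N
P[np.arange(h), 2 * np.arange(h)] = 1                 # even-indexed samples first
P[h + np.arange(h), 2 * np.arange(h) + 1] = 1         # odd-indexed samples last

# F_N = [I D; I -D] * diag(F_{N/2}, F_{N/2}) * P_N
assert np.allclose(B @ FF @ P, dft(N))
```

This is exactly the DIT identity $X(k)=E(k)+w^kO(k)$, $X(k+N/2)=E(k)-w^kO(k)$ written in matrix form.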
Diagonalization of circulant matrices
The background for this discussion is the cyclic prefix in OFDM. Referring to Wikipedia, first look at the definition:
A matrix of the form

$$A={\begin{bmatrix}c_{0}&c_{n-1}&c_{n-2}&\dots &c_{1}\\c_{1}&c_{0}&c_{n-1}& \cdots &c_{2}\\c_{2}&c_{1}&c_{0}&\cdots &c_3 \\\vdots &\vdots& \vdots&\ddots &\vdots \\c_{n-1}&c_{n-2}&c_{n-3}&\dots &c_{0}\end{bmatrix}}$$

is a circulant matrix.
We focus on this property: the eigenvector matrix of a circulant matrix is the discrete Fourier transform matrix of the same dimension, so the eigenvalues of a circulant matrix can be computed conveniently by the fast Fourier transform; equivalently, the circulant matrix is diagonalized by the Fourier matrix, that is:
$$AF=F\Lambda \ \Rightarrow\ \Lambda=F^HAF \quad\text{or}\quad A = F\Lambda F^{H}$$
Let us prove this briefly. There are three main prerequisites: the definition of the DFT, its time-domain circular-shift property, and its symmetry property. That is,

$$\text{Condition 1: } X(k)=\sum_{n=0}^{N-1} x(n) w_{N}^{nk},\quad k = 0,1,\cdots,N-1$$
$$\text{Condition 2: } x((n+m))_N R_N(n) \Leftrightarrow w_N^{-km}X(k)$$
$$\text{Condition 3: } x(N-n) \Leftrightarrow X(N-k)$$
Taking the second row as an example, we can see that
$$c_1w^{0 \times 1} +c_0w^{1\times 1}+c_{N-1}w^{2 \times 1}+\cdots+ c_2w^{(N-1)\times1}=\sum_{n=0}^{N-1}c((N-n+1))_NR_{N}(n)\,w^{n \times 1}=w^{1 \times 1} \sum_{n=0}^{N-1}c((N-n))_NR_{N}(n)\,w^{n \times 1}=C(N-1)\, w^{1 \times 1}\qquad(\text{here } k=1)$$
By analogy for the general entry: the row index $i$ plays the role of both the shift $m$ and the frequency index $k$ in the DFT result, while the column index $j$ corresponds to the time index $n$:
$$\sum_{j=0}^{N-1}c((N-j+i))_N R_{N}(n)\, w^{j \times i}=C((N-i))_NR_{N}(n)\, w^{i^2},\qquad i\,(k,m),\ j\,(n)=0,1,2,\cdots,N-1$$
Writing the above in matrix form, namely
$$\begin{bmatrix}c((N-j+i))_N R_{N}(n) \end{bmatrix} \begin{bmatrix}w^{i \times j} \end{bmatrix} = \begin{bmatrix}C((N-i))_NR_{N}(n) \end{bmatrix} \begin{bmatrix}w^{i^2} \end{bmatrix}$$
Dividing both sides by $\sqrt{N}$ to normalize, this can be written in Fourier-matrix form:
$$AF=F\Lambda=F\begin{bmatrix}C(0) & 0 & 0 & \cdots & 0 \\ 0 & C(N-1) &0 & \cdots & 0 \\ 0 & 0 & C(N-2) & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots &C(1) \end{bmatrix}\ \Rightarrow\ \Lambda=F^HAF$$
From this proof process we can also see that for the input signal sequence in row $i$, $c((N-j+i))_NR_{N}(n)$, the corresponding eigenvalue of the output is $C((N-i))_N$. The purpose of all this is to convert the convolution with the channel response into FFT operations, and the FFT computes with high efficiency; in other words, the eigenvalues at this point correspond to the channel's frequency response. It can be seen that advances in digital signal processing have laid the foundation for advances in communication technology, and all of this rests on mathematics.
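The whole diagonalization claim can be checked numerically in a few lines; the vector `c` below is an arbitrary illustrative first column:

```python
import numpy as np

c = np.array([3.0, 1.0, 4.0, 1.0, 5.0])   # illustrative first column of the circulant
N = len(c)

# Circulant matrix in the first form above: A[i, j] = c[(i - j) mod N]
A = np.array([[c[(i - j) % N] for j in range(N)] for i in range(N)])

k = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)   # normalized DFT matrix

Lam = F.conj().T @ A @ F                   # Lambda = F^H A F
off = Lam - np.diag(np.diag(Lam))
assert np.allclose(off, 0, atol=1e-9)      # Lambda is indeed diagonal

# The diagonal is C(0), C(N-1), ..., C(1): the DFT of c with reversed indices
C = np.fft.fft(c)
assert np.allclose(np.diag(Lam), C[(-k) % N])
```

So the eigenvalues really are obtained by one FFT of the first column, which is the point of the cyclic-prefix trick in OFDM.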
Note that if the circulant matrix is written in the following form
$$A={\begin{bmatrix}c_{0}&c_{1}&c_{2}&\dots &c_{n-1}\\c_{n-1}&c_{0}&c_{1}& \cdots &c_{n-2}\\c_{n-2}&c_{n-1}&c_{0}&\cdots &c_{n-3} \\\vdots &\vdots& \vdots&\ddots &\vdots \\c_{1}&c_{2}&c_{3}&\dots &c_{0}\end{bmatrix}}$$
then the eigenvalues obtained are $C(i)$; this is also the example matrix given in the book "Toeplitz and Circulant Matrices: A Review". That book gives a general proof of the diagonalization of circulant matrices (Chapter 3); the difference is that here I argue more simply by combining the properties of the DFT.