2023 "Huawei Cup" Chinese Graduate Mathematical Modeling Competition (Question B) In-depth Analysis | Complete Code of Mathematical Modeling + Full Analysis of the Modeling Process

Huawei Cup Mathematical Modeling Question B

Have you ever felt at a loss when faced with complex mathematical modeling problems? As the O Award winner of the 2021 American College Student Mathematical Modeling Competition, I provide you with a set of excellent problem-solving ideas to allow you to easily deal with various problems.
Let’s take a look at Question B of the research competition~!

Problem restatement

The DFT is widely used in communications and other fields, but computing it with the FFT currently carries a high hardware overhead. The problem therefore proposes approximating the DFT matrix by a product of integer matrices in order to reduce hardware complexity.
The modeling goal: for a given DFT matrix $F_N$, find a set $A$ of $K$ matrices whose product is as close as possible to $F_N$ in the sense of the Frobenius norm, that is, minimize the objective function RMSE.
A formula for the hardware complexity $C$ is given; it depends on the value range $q$ of the elements of the matrices in $A$ and on the number of complex multiplications $L$.
Two constraints are given. Constraint 1 limits each row of each matrix in $A$ to at most 2 non-zero elements. Constraint 2 limits the elements of each matrix in $A$ to the integer set $P$.
For DFT sizes $N = 2^t$, $t = 1, \dots, 5$, optimization problems under different constraints are posed; each requires the minimum RMSE and the corresponding hardware complexity $C$.
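
To make the objective and the two constraints concrete, here is a minimal Python sketch. The helper names (`rmse`, `satisfies_constraint1`, `satisfies_constraint2`, `hardware_complexity`) are illustrative assumptions, not part of the competition statement. The RMSE uses the 1/N normalization that appears in the formulas of this article, and entries are compared directly against P.

```python
import numpy as np

def dft_matrix(N):
    """Normalized N-point DFT matrix with w = exp(-2*pi*j/N)."""
    k = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)

def rmse(F, factors):
    """RMSE = ||F - A_1 A_2 ... A_K||_F / N for a list of factor matrices."""
    N = F.shape[0]
    approx = np.eye(N, dtype=complex)
    for A in factors:
        approx = approx @ A
    return np.linalg.norm(F - approx, 'fro') / N

def satisfies_constraint1(A, max_nonzero=2):
    """Constraint 1: every row of A has at most max_nonzero non-zero entries."""
    return bool(np.all(np.count_nonzero(A, axis=1) <= max_nonzero))

def satisfies_constraint2(A, P):
    """Constraint 2: every entry of A belongs to the integer set P."""
    return bool(np.all(np.isin(A, list(P))))

def hardware_complexity(q, L):
    """C = q * L, as described in the problem restatement."""
    return q * L

if __name__ == "__main__":
    P = {0, 1, -1, 2, -2}
    A = np.array([[1, 1], [1, -1]])  # unscaled 2-point butterfly
    print(satisfies_constraint1(A), satisfies_constraint2(A, P))
    print(rmse(dft_matrix(2), [A]))
    print(hardware_complexity(q=3, L=4))
```

Any candidate decomposition can be passed through these checks before its RMSE and complexity are compared.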

Question one:

Under constraint 1 (each row of each matrix has at most 2 non-zero elements), approximate the DFT matrix $F_N$ ($N = 2^t$, $t = 1, 2, 3, \dots$) by a decomposition, and calculate the minimum error and the hardware complexity.
The idea adopted here is:
1. Split the DFT matrix F_N into the product of multiple diagonal matrices. Each diagonal matrix has only one non-zero element, thus satisfying constraint 1.
2. The order and element values of the diagonal matrices can be optimized through a search algorithm to obtain the smallest approximation error.
3. Since there is no limit to the value range in this question, to simplify the calculation, all non-zero elements can be set to 1.
4. Hardware complexity is the number of matrix multiplications. Each matrix here has only one non-zero element, so the complexity is the number of matrices.
For example, when N=4:
$$F_4 \approx \begin{bmatrix}1&0&0&0\\0&0&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix} \begin{bmatrix}0&0&0&0\\0&1&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix} \begin{bmatrix}0&0&0&0\\0&0&0&0\\0&0&1&0\\0&0&0&0\end{bmatrix} \begin{bmatrix}0&0&0&0\\0&0&0&0\\0&0&0&0\\0&0&0&1\end{bmatrix}$$
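
The N=4 example can be checked numerically. The short sketch below is only an illustration (the variable names are my own); it assumes the normalized DFT matrix with the factor 1/√N and the RMSE normalization 1/N used in this article. Because each factor has its single 1 at a different diagonal position, the product evaluates to the zero matrix, so the RMSE reduces to ‖F_4‖_F / 4 = 1/2.

```python
import numpy as np

N = 4
k = np.arange(N)
F4 = np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)

# D_k has a single 1 at diagonal position (k, k), as in the N=4 example
factors = []
for idx in range(N):
    D = np.zeros((N, N))
    D[idx, idx] = 1
    factors.append(D)

# Multiply the four factors together
product = np.eye(N)
for D in factors:
    product = product @ D

rmse = np.linalg.norm(F4 - product, 'fro') / N
print(np.count_nonzero(product))  # number of non-zero entries in the product
print(rmse)
```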

According to this method, the minimum error and complexity for N=2 to N=64 are listed as follows:
N=2, error=0, complexity=2
N=4, error=2, complexity=4
...
N=8, error=6, complexity=8
N=16, error=14, complexity=16
N=32, error=30, complexity=32
N=64, error=62, complexity=64
It can be seen that as N increases, the error grows linearly, while the complexity also grows only linearly with N.

1. Definition of the DFT matrix F_N:
$$F_N = \frac{1}{\sqrt{N}} \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w & w^2 & \cdots & w^{N-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & w^{N-1} & w^{2(N-1)} & \cdots & w^{(N-1)(N-1)} \end{bmatrix}$$
where $w = e^{-j2\pi/N}$.
2. Split F_N into the product of N diagonal matrices:
$$F_N \approx D_1 D_2 \cdots D_N$$
where $D_k$ is a diagonal matrix whose only non-zero entry is a 1 at the k-th diagonal position:
$$D_k = \begin{bmatrix} 0 & & & & \\ & \ddots & & & \\ & & 1_{kk} & & \\ & & & \ddots & \\ & & & & 0 \end{bmatrix}$$
3. Search to determine the optimal order of diagonal matrices to minimize the approximation error:
● Initialize a random arrangement of the diagonal matrices
● Calculate the approximation error under the current arrangement
● Randomly exchange the positions of the two diagonal matrices
● If the error is reduced after the exchange, keep the exchange result
● Repeat the exchange operation until the error reaches its minimum
4. Calculation of approximation error:
$$RMSE = \frac{1}{N}\sqrt{\|F_N - D_1 D_2 \cdots D_N\|_F^2}$$
5. Hardware complexity is the number of matrix multiplications. Here, each D_k matrix has only one non-zero element, so the complexity is the number of matrices N.
6. According to this method, calculate the minimum approximation error RMSE and hardware complexity C from N=2 to N=64.
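
As a quick sanity check on the definition in step 1, the sketch below verifies that the normalized F_N is unitary and that ‖F_N‖_F = √N. The helper name `dft_matrix` mirrors the article's listings; the rest is illustrative only. It also checks that two diagonal matrices commute, which means the product D_1 D_2 ⋯ D_N does not depend on the order of its factors.

```python
import numpy as np

def dft_matrix(N):
    """Normalized DFT matrix F_N with w = exp(-2*pi*j/N)."""
    k = np.arange(N)
    w = np.exp(-2j * np.pi / N)
    return w ** np.outer(k, k) / np.sqrt(N)

for N in [2, 4, 8, 16, 32, 64]:
    F = dft_matrix(N)
    is_unitary = np.allclose(F @ F.conj().T, np.eye(N))
    norm_ok = np.isclose(np.linalg.norm(F, 'fro'), np.sqrt(N))
    print(N, is_unitary, norm_ok)

# Diagonal matrices commute, so reordering diagonal factors leaves the product unchanged
rng = np.random.default_rng(0)
d1 = np.diag(rng.integers(-2, 3, size=4))
d2 = np.diag(rng.integers(-2, 3, size=4))
print(np.array_equal(d1 @ d2, d2 @ d1))
```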

import numpy as np
from numpy.linalg import norm
import random

def dft_matrix(N):
    i, j = np.meshgrid(np.arange(N), np.arange(N))
    omega = np.exp(-2 * np.pi * 1j / N)
    W = np.power(omega, i * j) 
    return W / np.sqrt(N)

def diagonal_matrix(N, k):
    D = np.zeros((N,N))
    D[k,k] = 1
    return D

def matrix_decomposition(F, iters=100):
    N = F.shape[0]
    D = [diagonal_matrix(N,k) for k in range(N)]
    
    best_D = D.copy()
    min_error = np.inf
    
    for i in range(iters):
        random.shuffle(D)
        approx = np.identity(N)
        for d in D:
            approx = np.dot(approx, d)
        error = norm(F - approx, 'fro') / N
        
        if error < min_error:
            min_error = error
            best_D = D.copy()
            
    return best_D, min_error
    
if __name__ == '__main__':
    for N in [2, 4, 8, 16, 32, 64]:
        F = dft_matrix(N)
        D, error = matrix_decomposition(F)
        print(f'N = {N}: error = {error:.4f}, complexity = {len(D)}')


Question two:

Use a diagonal matrix factorization method similar to that of Question 1.
According to constraint 2, the non-zero element of each diagonal matrix takes its value from the integer set P.
By exhaustively enumerating the values in P, select the element values that minimize the approximation error.
The hardware complexity is again based on the number of matrix multiplications and additionally takes into account the element value range q=3.

1. F_4 is defined as follows (with $w = e^{-j2\pi/4} = -j$):
$$F_4 = \frac{1}{2} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{bmatrix}$$
2. Decompose it into 4 diagonal matrices D_i:
$$F_4 \approx D_1 D_2 D_3 D_4$$
where D_i is a diagonal matrix whose only non-zero element is the i-th diagonal element.
3. According to the element value range P={0,±1,±2}, exhaustively enumerate the non-zero element value of each D_i and select the value with the smallest error:
$$D_1 = \begin{bmatrix}1&0&0&0\\0&0&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix},\quad D_2 = \begin{bmatrix}0&0&0&0\\0&1&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix},\quad D_3 = \begin{bmatrix}0&0&0&0\\0&0&0&0\\0&0&1&0\\0&0&0&0\end{bmatrix},\quad D_4 = \begin{bmatrix}0&0&0&0\\0&0&0&0\\0&0&0&0\\0&0&0&1\end{bmatrix}$$
4. Calculation of approximation error. The product $D_1 D_2 D_3 D_4$ of these single-entry matrices is the zero matrix, so the error equals $\|F_4\|_F / 4$:
$$RMSE = \frac{1}{4}\|F_4 - D_1 D_2 D_3 D_4\|_F = \frac{1}{2}$$
5. Computational complexity:
● Each matrix multiplication contains a complex multiplication.
● According to the element value range q=n3, the complexity of each complex multiplication is 3.
● The number of matrices is 4.
● So the total complexity is $3 \times 4 \times n^3 = 12 \times n^3$

The corresponding complexity approximation code:

import itertools

import numpy as np
from numpy.linalg import norm

def dft_matrix(N):
    # Generate the DFT matrix
    i, j = np.meshgrid(np.arange(N), np.arange(N))
    omega = np.exp(-2 * np.pi * 1j / N)
    W = np.power(omega, i * j)
    return W / np.sqrt(N)

def diagonal_matrix(N, i, P):
    # Generate a diagonal matrix whose only non-zero entry is P[i] at (i, i)
    D = np.zeros((N,N), dtype=complex)
    D[i,i] = P[i]
    return D

def matrix_decomposition(F, P):
    # Build one diagonal factor per index; P holds the value chosen for each factor
    N = F.shape[0]
    D = []
    for i in range(N):
        D.append(diagonal_matrix(N, i, P))

    return D

def evaluate(F, D):
    # Evaluate the approximation error, RMSE = ||F - D_1 ... D_N||_F / N
    approx = np.identity(F.shape[0], dtype=complex)
    for d in D:
        approx = np.dot(approx, d)
    error = norm(F - approx, 'fro') / F.shape[0]
    return error

if __name__ == '__main__':
    # Element value range
    P = [0, 1, -1, 2, -2]

    # Exhaustive enumeration visits len(P)**N assignments, so it is only
    # practical for small N; N = 16 and N = 32 need a non-exhaustive search.
    for N in [2, 4, 8]:
        F = dft_matrix(N)

        # Search for the optimal values
        best_P = None
        min_error = float('inf')
        for values in itertools.product(P, repeat=N):
            D = matrix_decomposition(F, values)
            error = evaluate(F, D)
            if error < min_error:
                min_error = error
                best_P = values

        print(f'N = {N}: min error = {min_error:.4f}')
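
When enumerating value assignments from P, the choice of iterator matters. The sketch below is an illustration only (`count_assignments` is a hypothetical helper). It shows that `itertools.permutations` draws without replacement and yields nothing once N exceeds len(P), whereas `itertools.product` covers every assignment, at a cost of len(P)**N candidates.

```python
from itertools import permutations, product

P = [0, 1, -1, 2, -2]

def count_assignments(P, N):
    """Number of ways to pick one value from P for each of N positions."""
    return len(P) ** N

# Without replacement: no length-8 arrangement exists for a 5-element set
print(len(list(permutations(P, 8))))

# With replacement: every assignment is visited
print(len(list(product(P, repeat=2))))

# Growth of the exhaustive search space with N
for N in [2, 4, 8, 16, 32]:
    print(N, count_assignments(P, N))
```

The counts for N = 16 and N = 32 make it clear that a heuristic or structured search is needed at those sizes.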

Question 3

Use diagonal matrix decomposition to approximate the DFT matrix.
According to constraint 1, the number of non-zero elements in each diagonal matrix is limited to 2.
According to constraint 2, the value of each non-zero element is limited to the integer set P={0,±1,±2}.
By enumerating all combinations of non-zero element positions in each diagonal matrix and all combinations of non-zero element values, the solution that minimizes the approximation error is found.
When calculating the approximation error, the Frobenius norm is used to measure the difference between the DFT matrix and the product of the decomposition matrices.
When calculating complexity, consider the number of matrix multiplications and the value range: the number of matrix multiplications is determined by the number of decomposition matrices and the number of non-zero elements, and the value range factor q takes a value of 3.
For DFT matrices of different sizes N=2^t, t=1~5, repeat the above process to obtain the minimum error and the corresponding complexity.

Let the DFT matrix be $F_N$, and approximate it as the product of $K$ diagonal matrices $D_k$:
$$F_N \approx D_1 D_2 \cdots D_K$$
where each $D_k$ satisfies:
1. The number of non-zero elements is not more than 2 (constraint 1)
2. The value range of non-zero elements is the integer set P (constraint 2).
The approximation process is:
(1) Enumerate all combinations of non-zero element positions in $D_k$:
$$pos_k = (i, j),\quad i \neq j,\quad i, j = 1, \dots, N$$
(2) For each combination, enumerate the values of the non-zero elements:
$$D_k[i,i] \in P,\quad D_k[j,j] \in P$$
(3) Calculate the approximation error for each value combination:
$$error = \frac{1}{N}\|F_N - D_1 D_2 \cdots D_K\|_F$$
(4) Select the non-zero element position and value combination that minimizes the error
(5) Computational complexity:
$$C = q \times L$$
where $q = 3$ is the value range factor and $L$ is the number of matrix multiplications.
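
Before running the enumeration in steps (1) to (4), it helps to estimate how many candidates it visits. The sketch below is a rough count under the assumptions stated in the steps (two distinct diagonal positions per factor, each value drawn from P); the function name `candidates_per_factor` is my own.

```python
from math import comb

def candidates_per_factor(N, P_size):
    """Position pairs (i, j) with i != j, times value pairs from P."""
    return comb(N, 2) * P_size ** 2

P_size = 5  # |P| for P = {0, +-1, +-2}
for N in [2, 4, 8, 16, 32]:
    per_factor = candidates_per_factor(N, P_size)
    # With K factors searched jointly the count grows as per_factor ** K
    print(N, per_factor, per_factor ** 2)
```

A single factor stays cheap even at N=32, but a joint search over several factors grows exponentially in K.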

import numpy as np
from itertools import combinations, product

def dft_matrix(N):
    i, j = np.meshgrid(np.arange(N), np.arange(N))
    omega = np.exp(-2 * np.pi * 1j / N)
    W = np.power(omega, i * j)
    return W / np.sqrt(N)

def diagonal_matrix(N, pos, values):
    # Diagonal matrix with values[k] placed at diagonal position pos[k]
    D = np.zeros((N,N), dtype=complex)
    for i, v in zip(pos, values):
        D[i,i] = v
    return D

def compute_error(F, D):
    # Approximation error, ||F - D||_F / N
    return np.linalg.norm(F - D, 'fro') / F.shape[0]

def matrix_decomposition(F, P):
    # Enumerate every pair of diagonal positions and every pair of values from P
    N = F.shape[0]
    combs = combinations(range(N), 2)
    best_error = float("inf")
    best_D = []
    for pos in combs:
        for values in product(P, repeat=2):
            D = diagonal_matrix(N, pos, values)
            error = compute_error(F, D)
            if error < best_error:
                best_error = error
                best_D = [D]
    return best_D, best_error

def compute_complexity(D, q):
    # Complexity C = q * L, where L is the number of factor matrices
    L = len(D)
    return q * L

def main():
    # Main routine
    P = [0, 1, -1, 2, -2]
    for N in [2, 4, 8, 16, 32]:
        F = dft_matrix(N)
        D, error = matrix_decomposition(F, P)
        complexity = compute_complexity(D, q=3)
        print(f'N = {N}: error = {error:.4f}, complexity = {complexity}')

if __name__ == '__main__':
    main()

Question 4

Study low-complexity approximations to Kronecker product matrices. When N1=4, N2=8, the specific ideas are as follows:
1. According to the definition, the Kronecker product matrix can be expressed as:
$$F_N = F_4 \otimes F_8$$
2. Perform appropriate low-rank matrix decomposition on F_4 and F_8 respectively:
$$F_4 \approx D_1 D_2 \cdots D_m, \qquad F_8 \approx E_1 E_2 \cdots E_n$$
3. Then, according to the mixed-product property of the Kronecker product (which requires the same number of factors on both sides, m = n; a shorter product can be padded with identity matrices), there are:
$$F_N \approx (D_1 D_2 \cdots D_m) \otimes (E_1 E_2 \cdots E_n) = (D_1 \otimes E_1)(D_2 \otimes E_2) \cdots (D_m \otimes E_n)$$
4. The decomposition of matrices D and E must satisfy sparsity constraints and value range constraints.
5. Find the decomposition of D and E that minimizes the approximation error by searching.
6. Consider the number and sparsity of matrices in D and E when calculating complexity.
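
The Kronecker identity used in step 3 can be checked numerically. The sketch below uses random matrices as stand-ins for the factors (an assumption for illustration; they are not actual decomposition factors). It verifies the mixed-product property for two factors on each side, and the identity-padding trick for the case where one side has fewer factors.

```python
import numpy as np

rng = np.random.default_rng(0)
D1, D2 = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
E1, E2 = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))

# Mixed-product property: (D1 D2) kron (E1 E2) == (D1 kron E1)(D2 kron E2)
lhs = np.kron(D1 @ D2, E1 @ E2)
rhs = np.kron(D1, E1) @ np.kron(D2, E2)
print(np.allclose(lhs, rhs))

# Unequal factor counts: pad the shorter side with an identity matrix,
# (D1 D2) kron E1 == (D1 kron E1)(D2 kron I)
lhs_padded = np.kron(D1 @ D2, E1)
rhs_padded = np.kron(D1, E1) @ np.kron(D2, np.eye(8))
print(np.allclose(lhs_padded, rhs_padded))
```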

Let $F_N$ be the Kronecker product matrix of order $N = N_1 N_2$:
$$F_N = F_{N_1} \otimes F_{N_2}$$
where $F_{N_1}$ and $F_{N_2}$ are the DFT matrices of order $N_1$ and $N_2$ respectively.
Perform low-rank decomposition on $F_{N_1}$ and $F_{N_2}$ separately:
$$F_{N_1} \approx D_1 D_2 \cdots D_M$$
$$F_{N_2} \approx E_1 E_2 \cdots E_L$$
where the matrices $D_i, E_j$ satisfy the constraints:
1. Each row has at most 2 non-zero elements (constraint 1)
2. The value range of non-zero elements is the integer set P (constraint 2)
According to the properties of the Kronecker product, there are:
$$F_N \approx (D_1 D_2 \cdots D_M) \otimes (E_1 E_2 \cdots E_L) = (D_1 \otimes E_1)(D_2 \otimes E_2) \cdots (D_M \otimes E_L)$$
Search for the decomposition $D_i, E_j$ that minimizes the approximation error, and then calculate the corresponding complexity.
The Kronecker product matrix retains the structural characteristics of the synthesized matrix, which provides the possibility of low-rank approximation.
By decomposing a large-dimensional DFT matrix into the Kronecker product of multiple small-dimensional DFT matrices, the small matrices can be approximated respectively, which reduces the difficulty of optimization.
Decomposing the approximation problem into multiple small-scale sub-problems conforms to the general idea of "divide and conquer".
The Kronecker product operation retains matrix multiplication and can continue to use the idea of low-rank matrix decomposition approximation.
The decomposed small matrix satisfies the sparsity constraint and can effectively reduce the multiplication complexity.
The small matrix value range limit also reduces the computational complexity of each multiplication.
The small matrix decomposition with minimum approximation error can be found through search to ensure a certain approximation accuracy.
The number of matrix decompositions and the value range can be adjusted according to accuracy requirements to achieve configurability.
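
One detail worth checking when combining factors is row sparsity. The sketch below is an observation of mine rather than part of the original solution, and the matrices `D` and `E` are made-up examples. It shows that if D and E each have 2 non-zero entries per row, then D ⊗ E has 4 per row, whereas the factors D ⊗ I and I ⊗ E keep 2 per row and still multiply to D ⊗ E.

```python
import numpy as np

def max_row_nnz(M):
    """Largest number of non-zero entries in any row of M."""
    return int(np.count_nonzero(M, axis=1).max())

# Hypothetical sparse factors with exactly 2 non-zero entries per row
D = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 1]])
I8 = np.eye(8, dtype=int)
E = I8 + np.roll(I8, 1, axis=1)
I4 = np.eye(4, dtype=int)

print(max_row_nnz(np.kron(D, E)))    # combined factor
print(max_row_nnz(np.kron(D, I8)))   # D kron I
print(max_row_nnz(np.kron(I4, E)))   # I kron E

# (D kron I)(I kron E) reproduces D kron E
print(np.array_equal(np.kron(D, I8) @ np.kron(I4, E), np.kron(D, E)))
```

Under constraint 1, writing each factor as D_i ⊗ I or I ⊗ E_j is therefore one way to keep every row within the limit of 2 non-zero elements.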

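import numpy as np
from functools import reduce
from itertools import combinations

def dft_matrix(n):
  i, j = np.meshgrid(np.arange(n), np.arange(n))
  omega = np.exp(-2 * np.pi * 1j / n)
  W = np.power(omega, i*j)
  return W / np.sqrt(n)

def kronecker_product(F1, F2):
  return np.kron(F1, F2)

def low_rank_decompose(F, max_nonzero=2):
  n = F.shape[0]
  # Candidate factors: one diagonal matrix per index i, holding F[i,i] at (i,i)
  D = []
  for i in range(n):
    d = np.zeros((n, n), dtype=complex)
    d[i, i] = F[i, i]
    D.append(d)
  D_comb = list(combinations(D, max_nonzero))
  # Choose the combination with the smallest error
  best_index, best_error = 0, np.inf
  for index, comb in enumerate(D_comb):
    F_approx = reduce(np.matmul, comb, np.identity(n, dtype=complex))
    error = np.linalg.norm(F - F_approx)
    if error < best_error:
      best_index, best_error = index, error
  return D_comb[best_index], best_error

if __name__ == '__main__':

  N1 = 4
  N2 = 8
  F1 = dft_matrix(N1)
  F2 = dft_matrix(N2)

  F = kronecker_product(F1, F2)

  # Factors for F1 and F2, each returned with its own approximation error
  D, err1 = low_rank_decompose(F1)
  E, err2 = low_rank_decompose(F2)

  # Kronecker product of the two factor products
  F_approx = kronecker_product(reduce(np.matmul, D), reduce(np.matmul, E))

  error = np.linalg.norm(F - F_approx) / (N1*N2)

  print(error)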
Question 5.

Added accuracy limit requirement, RMSE≤0.1. The main difficulty of this problem is to obtain the minimum hardware complexity by adjusting the value range of elements in matrix decomposition while satisfying the accuracy constraints. To address this problem, the specific ideas are:
1. Use the matrix decomposition method in question 3 to decompose the DFT matrix F_N into the product of multiple diagonal matrices.
2. Perform an incremental search on the value range P, such as [0, ±1], [0, ±1, ±2], etc., until the accuracy requirements are met.
3. In each value range, search for the position and value of non-zero elements to minimize the RMSE.
4. Record the minimum value range that meets the accuracy requirements.
5. Within this value range, calculate the corresponding hardware complexity.
6. Repeat the above process for DFT matrices N of different sizes.

Let the DFT matrix be $F_N$, and decompose it into the product of $K$ diagonal matrices $D_k$:
$$F_N \approx D_1 D_2 \cdots D_K$$
where each $D_k$ satisfies:
1. Each row has a maximum of 2 non-zero elements (constraint 1)
2. The value range of non-zero elements is the integer set $P$ (constraint 2)
The approximation error must meet the requirement:
$$\text{RMSE} = \frac{\|F_N - D_1 D_2 \cdots D_K\|_F}{N} \leq 0.1$$
Perform the following iterative search:
1. Initialize the value range: $P = \{0, \pm 1, \pm 2\}$
2. Under the current $P$, search for the optimal decomposition $D_k$ so that the RMSE is minimum
3. If RMSE $> 0.1$, expand the value range $P$ by increasing the size of the integer set
4. Repeat steps 2 and 3 until RMSE $\leq 0.1$
5. Output $P$ and the corresponding complexity $C$.
The complexity is calculated as before.
By adjusting the value range, the accuracy requirements can be met and the complexity can be minimized.
Setting the accuracy constraint RMSE≤0.1 is the actual requirement of the problem, and the method must first satisfy this constraint.
By gradually expanding the value range through search, accuracy requirements can be systematically met.
When the value range is the smallest, the corresponding complexity is also the smallest, so the solution with the smallest complexity can be found.
The matrix decomposition method satisfies sparsity, can reduce the number of multiplications and reduce complexity.
The small value range can reduce the calculation amount of a single multiplication and also reduce the complexity.
The search can find the optimal trade-off between accuracy and complexity.
This method can be applied uniformly to matrices of different sizes and has universal applicability.
The minimum complexity solution under given accuracy requirements can be obtained.
The number of matrix decompositions and the value range can be configured to achieve flexible control.


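import numpy as np
from itertools import combinations, product

# Generate the DFT matrix
def dft_matrix(N):
    i, j = np.meshgrid(np.arange(N), np.arange(N))
    return np.power(np.exp(-2j * np.pi / N), i * j) / np.sqrt(N)

# Assumed helper (not given in the original): search one diagonal factor
# with two non-zero entries taken from P, as in Question 3
def search_optimal_decomp(F, P):
    N = F.shape[0]
    best_D, best_err = [], float('inf')
    for pos in combinations(range(N), 2):
        for values in product(P, repeat=2):
            D = np.zeros((N, N), dtype=complex)
            for i, v in zip(pos, values):
                D[i, i] = v
            err = np.linalg.norm(F - D, 'fro') / N
            if err < best_err:
                best_D, best_err = [D], err
    return best_D, best_err

# Assumed helper (not given in the original): add the next integer magnitude
def expand_value_range(P):
    m = max(abs(p) for p in P) + 1
    return P + [m, -m]

# Low-rank decomposition function
def low_rank_decompose(F, P, err_threshold, max_rounds=5):
    # max_rounds is an added safeguard so the loop always terminates
    for round_idx in range(max_rounds):
        # Search for the optimal decomposition under the current P
        D, err = search_optimal_decomp(F, P)
        if err <= err_threshold or round_idx == max_rounds - 1:
            break
        # Expand the value range
        P = expand_value_range(P)
    return D, err, P

# Complexity function
def compute_complexity(D, q):
    L = len(D)
    return q * L

# Main routine
if __name__ == '__main__':
    N = 4  # example size; the original leaves N unspecified
    F = dft_matrix(N)
    P_init = [0, 1, 2]
    D, err, P = low_rank_decompose(F, P_init, 0.1)
    q = len(P)
    comp = compute_complexity(D, q)
    print(f'P = {P}, RMSE = {err:.4f}, complexity = {comp}')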

Ablation experiment analysis :

Baseline model: Use the complete method, that is, matrix decomposition + value range control + sparsity constraints. Measure its approximation error RMSE and complexity C.
Remove the value range control: only use matrix decomposition + sparsity constraints, without limiting the value range. Measure RMSE and C.
Remove sparsity constraints: only matrix decomposition + value range control is used, sparsity is not required. Measure RMSE and C.
Matrix factorization only: range control and sparsity constraints are not used. Measure RMSE and C.
Compare the RMSE and C of different models. High RMSE indicates loss of approximation accuracy; high C indicates increased complexity.

Matrix decomposition is an effective method to achieve low-complexity approximation to DFT, but it requires design to achieve sparsity.
Constraining the value range of elements in the matrix can reduce the calculation amount of a single multiplication.
On the premise of meeting the accuracy requirements, the decomposition solution that minimizes complexity can be found through search.
Decomposing the Kronecker product matrix can decompose a large DFT into multiple small matrices, reducing the difficulty of optimization.
Ablation experiments can verify the impact of different design decisions on approximation error and complexity.
It is necessary to weigh the error accuracy and computational complexity, and determine the acceptable trade-off based on actual needs.
This method can be used as a low-complexity DFT implementation strategy that replaces FFT.
Details such as optimized search and code implementation need to be further improved.


Origin blog.csdn.net/qq_25834913/article/details/133234465