Matrix Multiplication Complexity Analysis

1. Background

Many machine learning and data mining papers include some form of algorithm complexity analysis. But have you ever wondered how the authors arrive at those results?

For a long time I was puzzled too. Whenever a paper reached this part, I would skip the complexity analysis.

One reason is that it is mentally taxing, even though I knew that complexity analysis summarizes an algorithm as a whole and is used to compare algorithms (which is exactly why it matters).

The other reason is that my foundation in algorithm analysis was fairly weak (and, honestly, I did not feel like working on it).

I did touch on algorithm complexity in my "Data Structures" course, so claiming I knew nothing about it would be deceiving myself. I could still analyze simple examples, but when it came to complex objective functions, especially those involving matrix operations, I did not know how to analyze them. There was no guide, and I did not know where to start. As I read more papers, things gradually clicked, and I finally understood how to do the analysis.

The purpose of this blog post is to help readers understand how to analyze the complexity of matrix multiplication and avoid some detours. I first cover the complexity of multiplying two matrices, then the complexity of multiplying three matrices, and finally show how matrix multiplication complexity is used to analyze the loss equations in papers.

Prerequisites: a basic understanding of the first chapter of "Data Structures". At minimum, you should know what algorithm complexity is and how it is expressed.

2. Multiplying two matrices

Consider matrices A(n*m) and B(m*n), where A(n*m) means that A has n rows and m columns.

Computing A*B then costs O(n*m*n), that is, O(n^2m). Why? The code explains it directly:

    for (i = 0; i < n; i++) {             // rows of A (n)
        for (j = 0; j < n; j++) {         // columns of B (n)
            for (k = 0; k < m; k++) {     // columns of A, same as rows of B (m)
                C[i][j] = C[i][j] + A[i][k] * B[k][j];
            }
        }
    }

Each for loop contributes a factor equal to its number of iterations. Here three loops are nested and their iteration counts multiply to n*m*n, so the complexity is O(n*m*n). (PS: Personally, I find this easiest to understand by reading the code. The three-matrix case in the next section makes it even clearer.)
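To make the count concrete, here is a minimal sketch that multiplies an n*m matrix by an m*n matrix and counts the multiply-add steps as it goes. The function name matmul_count and the flat row-major storage are my own choices for illustration, not something from the papers discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Multiplies A (n x m) by B (m x n) into C (n x n). All matrices are stored
   row-major in flat arrays. Returns the number of multiply-add steps. */
long matmul_count(size_t n, size_t m,
                  const double *A, const double *B, double *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    long ops = 0;
    for (size_t i = 0; i < n; i++) {          /* rows of A */
        for (size_t j = 0; j < n; j++) {      /* columns of B */
            double sum = 0.0;
            for (size_t k = 0; k < m; k++) {  /* shared dimension m */
                sum += A[i * m + k] * B[k * n + j];
                ops++;                        /* one multiply-add */
            }
            C[i * n + j] = sum;
        }
    }
    return ops;
}
```

For n = 2 and m = 3 the innermost statement runs 2*2*3 = 12 times, which is exactly n*m*n.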

3. Multiplying three matrices

Consider matrices A(m*n), B(n*m) and C(m*n), where A(m*n) means that A has m rows and n columns. (PS: The notation here differs from the previous section, mainly so that it matches the notation in the screenshot.)

  • Computing A*B costs O(m*n*m), that is, O(m^2n).
  • The result D(m*m)=A*B is then multiplied by C.
  • Computing D*C costs O(m*m*n), that is, O(m^2n).

This is equivalent to (A*B)*C. The complexity of the whole process is O(m^2n). (At first I thought it was O(m^2n)*O(m^2n) = O(m^4n^2). That understanding is wrong, as explained below.)

This is consistent with this article; the screenshot is as follows:

To make this easier to understand, here is the code directly.

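    int A[m][n];
    int B[n][m];
    int C[m][n];

    int D[m][m];   // D = A*B, assumed zero-initialized
    int E[m][n];   // E = D*C, assumed zero-initialized

    // First compute D = A*B
    for (i = 0; i < m; i++) {             // rows of A (m)
        for (j = 0; j < m; j++) {         // columns of B (m)
            for (k = 0; k < n; k++) {     // columns of A, same as rows of B (n)
                D[i][j] = D[i][j] + A[i][k] * B[k][j];
            }
        }
    }

    // Then compute E = D*C
    for (i = 0; i < m; i++) {             // rows of D (m)
        for (j = 0; j < n; j++) {         // columns of C (n)
            for (k = 0; k < m; k++) {     // columns of D, same as rows of C (m)
                E[i][j] = E[i][j] + D[i][k] * C[k][j];
            }
        }
    }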

 

As before, the first group of three nested for loops performs one matrix multiplication, with complexity O(m*n*m) = O(m^2n).

The second group also costs O(m*m*n) = O(m^2n).

But because the two groups run one after the other rather than one inside the other, the total complexity is their sum, not their product.

So O(m^2n)+O(m^2n)

= O(2*m^2n)

= O(m^2n), because constant coefficients are ignored in complexity analysis.
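The addition can also be checked by counting. The sketch below runs the same two loop nests as (A*B)*C without doing any arithmetic and only counts how many times each innermost statement would execute. The name chain_ops is hypothetical, used here for illustration only.

```c
#include <assert.h>

/* Counts the multiply-add steps of (A*B)*C for A (m x n), B (n x m),
   C (m x n). The two loop nests run sequentially, so their counts add. */
long chain_ops(long m, long n)
{
    assert(m >= 0 && n >= 0);
    long first = 0;   /* steps for D = A*B, where D is m x m */
    long second = 0;  /* steps for E = D*C, where E is m x n */

    for (long i = 0; i < m; i++)
        for (long j = 0; j < m; j++)
            for (long k = 0; k < n; k++)
                first++;

    for (long i = 0; i < m; i++)
        for (long j = 0; j < n; j++)
            for (long k = 0; k < m; k++)
                second++;

    return first + second;  /* 2 * m^2 * n, not (m^2 * n)^2 */
}
```

With m = 10 and n = 20 each nest runs 2000 steps, so the total is 4000 = 2*m^2*n. Multiplying instead of adding would give 4,000,000, which is far more work than the code actually does.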

 

4. Complexity analysis of loss equations

 

Paper 1: Fast Attributed Multiplex Heterogeneous Network Embedding, CIKM, 2020. You can download it yourself if you need it.

The screenshot of the target equation is as follows:

The algorithm complexity is O(n^3*K + n^2*m + n*m*d), with the product evaluated from left to right. Here A is n*n, X is n*m, and R is m*d. K is the number of terms accumulated (equivalent to adding an outer for loop that runs K times).

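O(n^3*K) is the cost of the first part of the calculation [formula image]. Readers may be confused here. Reading the original paper, you will find that \alpha_i is also an n*n square matrix, which is where n^3 comes from. Wrapping the accumulation in an outer for loop gives K*n^3.

O(n^3*K + n^2*m) additionally includes multiplying that n*n result by X(n*m), which costs n^2*m.

O(n^3*K + n^2*m + n*m*d) is the whole process: the final multiplication of the n*m result by R(m*d) adds n*m*d.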
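As a rough sanity check on these three terms, the sketch below adds them up for given sizes. It only mirrors the counting described above, assuming the left-to-right evaluation order and the shapes A(n*n), X(n*m), R(m*d). The function name and the sample sizes are hypothetical, and this is not the paper's implementation.

```c
#include <assert.h>

/* Multiply-add count for the left-to-right evaluation described above:
   K accumulated n x n products, then (n x n)*(n x m), then (n x m)*(m x d). */
long long left_to_right_cost(long long n, long long m, long long d, long long K)
{
    assert(n > 0 && m > 0 && d > 0 && K > 0);
    long long accumulate = K * n * n * n;  /* K products of n x n matrices */
    long long times_X = n * n * m;         /* n x n result times X (n x m) */
    long long times_R = n * m * d;         /* n x m result times R (m x d) */
    return accumulate + times_X + times_R; /* sequential steps, so they add */
}
```

With the made-up sizes n = 100, m = 50, d = 10 and K = 3, the terms are 3,000,000, 500,000 and 50,000. The n^3*K term dominates, which is presumably why the authors targeted it in their optimization.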

The authors later optimized the calculation, reducing the algorithm complexity from O(n^3*K + n^2*m + n*m*d) to O(K*e*n + n*m*d).

O(n*m*d) is the cost of calculating [formula image].

O(K*e*n) is the cost of calculating [formula image].


Origin blog.csdn.net/qq_39463175/article/details/111818717