Benchmark Analysis 1: Cortexsuite.svd3

1. Hotspot Analysis
1.1 Hotspot Functions
(Figure: hotspot function profile; image not available.)
1.2 Hot Loops
Format: hot loop (execution count of the function - execution count of each enclosing nesting level, outermost to innermost, ending with the hot loop itself)
Execution percentage: the hot loop's share of this function's execution

Function: svd
svd.L1
svd.L1.1.1.1.1.1 (1-500-500-500-499-124750-41666500)
Execution percentage: 5.8%
svd.L1.1.1.1.1.2 (1-500-500-500-499-124750-41666500)
Execution percentage: 4.9%
svd.L1.2.1.1.1.1 (1-500-499-499-499-124750-41541750)
Execution percentage: 4.7%
svd.L1.2.1.1.1.2 (1-500-499-499-499-124750-41541750)
Execution percentage: 3.8%

svd.L2
svd.L2.1.1.1.1 (1-500-499-499-124750-41541750)
Execution percentage: 4.7%
svd.L2.1.1.1.2 (1-500-499-499-124750-41541750)
Execution percentage: 6.7%

svd.L3
svd.L3.1.1.1.1 (1-500-500-499-124750-41541750)
Execution percentage: 5%
svd.L3.1.1.1.2 (1-500-500-499-124750-41666500)
Execution percentage: 6.3%

svd.L4
svd.L4.1.1.1 (1-500-1211-176680-88340000)
Execution percentage: 25.8%
svd.L4.1.1.2 (1-500-1211-176680-88340000)
Execution percentage: 30.3%

1.3 Hot Code

/* Function svd (hot loops only; statements between the loops are omitted) */
// svd.L1
for (i = 0; i < n; i++)
{
    if (i < m)
    {
        if (scale)
        {
            if (i != n - 1)
            {
                for (j = l; j < n; j++)
                {
                    // svd.L1.1.1.1.1.1: dot product of columns i and j of a
                    for (s = 0.0, k = i; k < m; k++)
                        s += a(k,i) * a(k,j);
                    // svd.L1.1.1.1.1.2: scaled update of column j of a
                    for (k = i; k < m; k++)
                        a(k,j) += f * a(k,i);
                }
            }
        }
    }
    if (i < m && i != n - 1)
    {
        if (scale)
        {
            if (i != m - 1)
            {
                for (j = l; j < m; j++)
                {
                    // svd.L1.2.1.1.1.1: dot product of rows j and i of a
                    for (s = 0.0, k = l; k < n; k++)
                        s += a(j,k) * a(i,k);
                    // svd.L1.2.1.1.1.2: scaled update of row j of a
                    for (k = l; k < n; k++)
                        a(j,k) += s * rv1[k];
                }
            }
        }
    }
}

// svd.L2
for (i = n - 1; i >= 0; i--)
{
    if (i < n - 1)
    {
        if (g)
        {
            for (j = l; j < n; j++)
            {
                // svd.L2.1.1.1.1: dot product of row i of a and column j of v
                for (s = 0.0, k = l; k < n; k++)
                    s += a(i,k) * v(k,j);
                // svd.L2.1.1.1.2: scaled update of column j of v
                for (k = l; k < n; k++)
                    v(k,j) += s * v(k,i);
            }
        }
    }
}

// svd.L3
for (i = n - 1; i >= 0; i--)
{
    if (g)
    {
        if (i != n - 1)
        {
            for (j = l; j < n; j++)
            {
                // svd.L3.1.1.1.1: dot product of columns i and j of a
                for (s = 0.0, k = l; k < m; k++)
                    s += a(k,i) * a(k,j);
                // svd.L3.1.1.1.2: scaled update of column j of a
                for (k = i; k < m; k++)
                    a(k,j) += f * a(k,i);
            }
        }
    }
}

// svd.L4
for (k = n - 1; k >= 0; k--)
{
    for (its = 0; its < 30; its++)
    {
        for (j = l; j <= nm; j++)
        {
            // svd.L4.1.1.1: update columns j and i of v
            for (jj = 0; jj < n; jj++)
            {
                // P1
                v(jj,j) = (x * c + z * s);
                // P2
                v(jj,i) = (z * c - x * s);
            }

            // svd.L4.1.1.2: update columns j and i of a
            for (jj = 0; jj < m; jj++)
            {
                // P3
                a(jj,j) = (y * c + z * s);
                // P4
                a(jj,i) = (z * c - y * s);
            }
        }
    }
}


2. Functional Analysis
2.1 Data Labels
(Figure: data labels; image not available.)
Legend: s, v, m, t (global data); svd (local data).

2.2 Data-Flow Analysis
Only the two hottest loops, svd.L4.1.1.1 and svd.L4.1.1.2, are analyzed here.

svd.L4.1.1.1 (lines 91-97 of the listing in 1.3)
P1: (svd.m1[][i1], svd.s1), (svd.m1[][i2], svd.s2) -> svd.m1[][i1]
P2: (svd.m1[][i2], svd.s1), (svd.m1[][i1], svd.s2) -> svd.m1[][i2]

svd.L4.1.1.2 (lines 100-106 of the listing in 1.3)
P3: (svd.m1[][i1], svd.s1), (svd.m1[][i2], svd.s2) -> svd.m1[][i1]
P4: (svd.m1[][i2], svd.s1), (svd.m1[][i1], svd.s2) -> svd.m1[][i2]

3. Acceleration Analysis

Origin blog.csdn.net/weixin_42472659/article/details/103926722