1. Hotspot Analysis
1.1 Hotspot Functions
1.2 Hot Loops
Format: hot loop (function execution count - total execution count of each nesting level, from the outermost level down to the hot loop itself)
Execution percentage: the share of the function's execution attributed to the hot loop
Function: SVD
svd.L1
svd.L1.1.1.1.1.1 (1-500-500-500-499-124750-41666500)
Execution percentage: 5.8%
svd.L1.1.1.1.1.2 (1-500-500-500-499-124750-41666500)
Execution percentage: 4.9%
svd.L1.2.1.1.1.1 (1-500-499-499-499-124750-41541750)
Execution percentage: 4.7%
svd.L1.2.1.1.1.2 (1-500-499-499-499-124750-41541750)
Execution percentage: 3.8%
svd.L2
svd.L2.1.1.1.1 (1-500-499-499-124750-41541750)
Execution percentage: 4.7%
svd.L2.1.1.1.2 (1-500-499-499-124750-41541750)
Execution percentage: 6.7%
svd.L3
svd.L3.1.1.1.1 (1-500-500-499-124750-41541750)
Execution percentage: 5%
svd.L3.1.1.1.2 (1-500-500-499-124750-41666500)
Execution percentage: 6.3%
svd.L4
svd.L4.1.1.1 (1-500-1211-176680-88340000)
Execution percentage: 25.8%
svd.L4.1.1.2 (1-500-1211-176680-88340000)
Execution percentage: 30.3%
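The count tuples listed above can be cross-checked by counting iterations of a loop nest with the same shape. The following is a minimal sketch, not the profiler's method. It assumes m = n = 500, l = i + 1, and that the scale test always succeeds; none of these is stated in the report. The names loop_counts, count_column_nest and count_row_nest are illustrative only.

```c
#include <assert.h>

/* Execution counts summed over the whole run, one field per loop level. */
typedef struct {
    long outer;   /* iterations of the i loop (svd.L1) */
    long middle;  /* iterations of the j loop, summed over all i */
    long inner;   /* iterations of the k loop, summed over all i and j */
} loop_counts;

/* Nest shaped like svd.L1.1.1.1.1.1: j runs l..n-1, k runs i..m-1.
 * Assumptions (not in the report): l = i + 1, scale is always nonzero. */
loop_counts count_column_nest(long m, long n)
{
    loop_counts c = {0, 0, 0};
    assert(m > 0 && n > 0);
    for (long i = 0; i < n; i++) {
        c.outer++;
        if (i < m && i != n - 1) {
            long l = i + 1;
            for (long j = l; j < n; j++) {
                c.middle++;
                for (long k = i; k < m; k++)
                    c.inner++;
            }
        }
    }
    return c;
}

/* Nest shaped like svd.L1.2.1.1.1.1: j runs l..m-1, k runs l..n-1.
 * Same assumptions as above. */
loop_counts count_row_nest(long m, long n)
{
    loop_counts c = {0, 0, 0};
    assert(m > 0 && n > 0);
    for (long i = 0; i < n; i++) {
        c.outer++;
        if (i < m && i != n - 1 && i != m - 1) {
            long l = i + 1;
            for (long j = l; j < m; j++) {
                c.middle++;
                for (long k = l; k < n; k++)
                    c.inner++;
            }
        }
    }
    return c;
}
```

Under these assumptions, count_column_nest(500, 500) yields 500, 124750 and 41666500, and count_row_nest(500, 500) yields 500, 124750 and 41541750, matching the last entries of the corresponding tuples.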
1.3 Hot Code
/* Function svd */
// svd.L1
for (i = 0; i < n; i++)
{
    if (i < m)
    {
        if (scale)
        {
            if (i != n - 1)
            {
                for (j = l; j < n; j++)
                {
                    // svd.L1.1.1.1.1.1
                    for (s = 0.0, k = i; k < m; k++)
                        s += a(k,i) * a(k,j);
                    // svd.L1.1.1.1.1.2
                    for (k = i; k < m; k++)
                        a(k,j) += f * a(k,i);
                }
            }
        }
    }
    if (i < m && i != n - 1)
    {
        if (scale)
        {
            if (i != m - 1)
            {
                for (j = l; j < m; j++)
                {
                    // svd.L1.2.1.1.1.1
                    for (s = 0.0, k = l; k < n; k++)
                        s += a(j,k) * a(i,k);
                    // svd.L1.2.1.1.1.2
                    for (k = l; k < n; k++)
                        a(j,k) += s * rv1[k];
                }
            }
        }
    }
}
// svd.L2
for (i = n - 1; i >= 0; i--)
{
    if (i < n - 1)
    {
        if (g)
        {
            for (j = l; j < n; j++)
            {
                // svd.L2.1.1.1.1
                for (s = 0.0, k = l; k < n; k++)
                    s += a(i,k) * v(k,j);
                // svd.L2.1.1.1.2
                for (k = l; k < n; k++)
                    v(k,j) += s * v(k,i);
            }
        }
    }
}
// svd.L3
for (i = n - 1; i >= 0; i--)
{
    if (g)
    {
        if (i != n - 1)
        {
            for (j = l; j < n; j++)
            {
                // svd.L3.1.1.1.1
                for (s = 0.0, k = l; k < m; k++)
                    s += a(k,i) * a(k,j);
                // svd.L3.1.1.1.2
                for (k = i; k < m; k++)
                    a(k,j) += f * a(k,i);
            }
        }
    }
}
// svd.L4
for (k = n - 1; k >= 0; k--)
{
    for (its = 0; its < 30; its++)
    {
        for (j = l; j <= nm; j++)
        {
            // svd.L4.1.1.1
            for (jj = 0; jj < n; jj++)
            {
                // P1
                v(jj,j) = (x * c + z * s);
                // P2
                v(jj,i) = (z * c - x * s);
            }
            // svd.L4.1.1.2
            for (jj = 0; jj < m; jj++)
            {
                // P3
                a(jj,j) = (y * c + z * s);
                // P4
                a(jj,i) = (z * c - y * s);
            }
        }
    }
}
2. Functional Analysis
2.1 Data Labels
Notation: s, v, m, t (global data); svd (local data).
2.2 Data Flow Analysis
Only the two hottest loops, svd.L4.1.1.1 and svd.L4.1.1.2, are analyzed here.
svd.L4.1.1.1 (lines 91-97)
P1: (svd.m1[][i1], svd.s1), (svd.m1[][i2], svd.s2) -> svd.m1[][i1]
P2: (svd.m1[][i2], svd.s1), (svd.m1[][i1], svd.s2) -> svd.m1[][i2]
svd.L4.1.1.2 (lines 100-106)
P3: (svd.m1[][i1], svd.s1), (svd.m1[][i2], svd.s2) -> svd.m1[][i1]
P4: (svd.m1[][i2], svd.s1), (svd.m1[][i1], svd.s2) -> svd.m1[][i2]
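The data flow above describes a plane rotation applied to two columns of a matrix: each row reads one element from column i1 and one from column i2, combines them with two scalars, and writes both results back. The following is a minimal sketch of that pattern. The hot code excerpt does not show where x, y and z are loaded, so the sketch assumes they are read from the two columns before either is overwritten. The function name rotate_columns, the row-major layout, and all parameter names are illustrative, not taken from the original source.

```c
#include <assert.h>
#include <stddef.h>

/* Rotate columns i1 and i2 of a row-major rows x cols matrix in place.
 * c and s stand in for the scalars labelled svd.s1 and svd.s2.
 * Assumption: both old values are loaded before either column is written. */
void rotate_columns(double *mat, size_t rows, size_t cols,
                    size_t i1, size_t i2, double c, double s)
{
    assert(mat != NULL && i1 < cols && i2 < cols && i1 != i2);
    for (size_t r = 0; r < rows; r++) {
        double x = mat[r * cols + i1];       /* old value in column i1 */
        double z = mat[r * cols + i2];       /* old value in column i2 */
        mat[r * cols + i1] = x * c + z * s;  /* P1 / P3 */
        mat[r * cols + i2] = z * c - x * s;  /* P2 / P4 */
    }
}
```

Each row is independent of every other row, and the only reads and writes are to the two selected columns, which is the property the data flow lines for P1 to P4 express. Under the assumptions above, the same routine would cover both loops: svd.L4.1.1.1 over the n rows of v and svd.L4.1.1.2 over the m rows of a.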
3. Acceleration Analysis