Algorithm efficiency metrics

Note: This series of notes uses images from Xiaojiayu's (小甲鱼, of FishC.com) [Data Structures and Algorithms] course.

As mentioned earlier, we design algorithms to maximize efficiency; here, efficiency generally refers to the execution time of the algorithm.

The post-hoc measurement method

Using carefully designed test programs and data, compare the running times of programs written from different algorithms with a computer timer, in order to determine which algorithm is more efficient.
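
As a minimal sketch of this method (an illustration of the idea, not from the original notes; it assumes a standard C environment), we can use the clock() timer from <time.h> to compare the two ways of summing 1 to n that are analyzed later in these notes:

#include <stdio.h>
#include <time.h>

int main(void)
{
    long long n = 100000000LL, sum = 0;  // large n so the timer registers
    clock_t start = clock();
    for (long long i = 1; i <= n; i++)   // algorithm 1: loop n times
        sum = sum + i;
    clock_t mid = clock();
    long long sum2 = (1 + n) * n / 2;    // algorithm 2: one formula
    clock_t end = clock();
    printf("loop:    sum=%lld, %.3f s\n", sum, (double)(mid - start) / CLOCKS_PER_SEC);
    printf("formula: sum=%lld, %.3f s\n", sum2, (double)(end - mid) / CLOCKS_PER_SEC);
    return 0;
}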

Drawbacks:

  • A working test program must first be written based on the algorithm, which usually takes considerable time and effort
  • Differences between test environments (hardware and software) can be very large

The a priori analysis and estimation method

Before the program is written, estimate the algorithm's efficiency using statistical and analytical methods.

In summary, the time consumed by a program written in a high-level language running on a computer depends on the following factors:

  • The strategy and approach the algorithm adopts
  • The quality of the code produced by the compiler
  • The input size of the problem
  • The speed at which the machine executes instructions

Therefore, setting aside the hardware- and software-related factors, the running time of a program depends on the quality of the algorithm and the input size of the problem. (The input size simply refers to how much input the problem has.)

For example, computing the sum of the sequence from 1 to 100:

// First approach
int i, sum = 0, n = 100;    // executes 1 time
for( i = 1; i <= n; i++ )   // executes n+1 times (n passes plus the final failed test)
{
    sum = sum + i;          // executes n times
}

// Second approach
int sum = 0, n = 100;       // executes 1 time
sum = (1+n)*n/2;            // executes 1 time

// The first algorithm executes 1 + (n+1) + n = 2n+2 statements
// The second algorithm executes 1 + 1 = 2 statements

If we treat the loop as a whole and ignore the overhead of the loop-condition checks at its head and tail, then the gap between the two algorithms is really n operations versus 1.

Extending the example:

int i, j, x = 0, sum = 0, n = 100;
for( i=1; i <= n; i++ )
{
    for( j=1; j <= n; j++ )   // inner loop runs n times for every pass of the outer loop
    {
        x++;
        sum = sum + x;
    }
}

In this example, i loops from 1 to 100, and for each value of i, j also loops 100 times. Trying to count the exact total number of executed statements would be exhausting.
Besides, when we study the complexity of an algorithm, the focus is on how the count grows as the input size grows, not on pinning down the exact number of executions; if we went down that road we would also have to worry about compiler optimizations, and we would never be done.
So for this example, we can simply and decisively say it performs n², i.e. 100², operations.

We do not care which language the program is written in, nor what kind of computer it will run on; we care only about the algorithm it implements.
Thus we exclude operations such as incrementing the loop index, testing the loop termination condition, declaring variables, and printing results. In the end, when analyzing the running time of a program, the most important thing is to view the program as an algorithm, or a series of steps, independent of any programming language.
When we analyze the running time of an algorithm, what matters is relating the number of basic operations to the input size.

Asymptotic growth of functions:

Suppose two algorithms both have input size n. Algorithm A performs 2n + 3 operations, which you can picture as: run one loop n times, then run a second loop n times, then perform 3 final operations.

Algorithm B performs 3n + 1 operations, understood the same way. Which one do you think is faster?

n        A1 (2n+3)   A2 (2n)   B1 (3n+1)   B2 (3n)
1        5           2         4           3
2        7           4         7           6
3        9           6         10          9
10       23          20        31          30
100      203         200       301         300

When n = 1, algorithm A1 is not as efficient as algorithm B1; when n = 2, the two are equally efficient; when n > 2, A1 begins to outperform B1, and as n keeps growing, the gap between A1 and B1 keeps widening. So on the whole, algorithm A1 is better than algorithm B1.

(On a chart, the curves of A1 and A2 essentially coincide, as do those of B1 and B2.)

Asymptotic growth of functions: given two functions f(n) and g(n), if there exists an integer N such that for all n > N, f(n) is always greater than g(n), then we say that f(n) grows asymptotically faster than g(n).
From the comparison above we also see that as n increases, the trailing +3 and +1 do not affect the final shape of the curves. So we can ignore these additive constants.

Second test: algorithm C is 4n + 8, algorithm D is 2n² + 1.

n        C1 (4n+8)   C2 (n)    D1 (2n²+1)   D2 (n²)
1        12          1         3            1
2        16          2         9            4
3        20          3         19           9
10       48          10        201          100
100      408         100       20001        10000
1000     4008        1000      2000001      1000000

We observe that even after removing the constant multiplying n, the outcome does not change: as n grows, algorithm C2 remains far smaller than algorithm D2.
That is, the constant multiplying the highest-order term is not important and can be ignored.

Third test: algorithm E is 2n² + 3n + 1, algorithm F is 2n³ + 3n + 1.

n        E1 (2n²+3n+1)   E2 (n²)   F1 (2n³+3n+1)   F2 (n³)
1        6               1         6               1
2        15              4         23              8
3        28              9         64              27
10       231             100       2031            1000
100      20301           10000     2000301         1000000

We also find that when the highest-order term has a larger exponent, the result grows especially fast as n grows. And when comparing algorithms E and F, the constant multiplying the highest-order term and all of the remaining terms can be ignored.

Fourth test: algorithm G is 2n², algorithm H is 3n + 1, algorithm I is 2n² + 3n + 1.

n          G (2n²)          H (3n+1)   I (2n²+3n+1)
1          2                4          6
2          8                7          15
5          50               16         66
10         200              31         231
100        20000            301        20301
1000       2000000          3001       2003001
10000      200000000        30001      200030001
100000     20000000000      300001     20000300001
1000000    2000000000000    3000001    2000003000001

(On the original chart, algorithm H can hardly be seen at this scale; the rows with small n show what happens when the amount of data is small.)

From this set of data we can see clearly that when n becomes extremely large, 3n + 1 is eventually almost negligible compared with 2n². Algorithm G essentially coincides with algorithm I.
So we can draw a conclusion: when judging the efficiency of an algorithm, the constants and other lower-order terms of the function can often be ignored; what we should really care about is the order of the highest-order (dominant) term.
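
To see this conclusion numerically, here is a small sketch (an illustration of the fourth test, not from the original notes) that prints the dominant term 2n² of algorithm I next to its lower-order part 3n + 1:

#include <stdio.h>

int main(void)
{
    long long ns[] = {1, 10, 100, 1000, 10000, 100000};
    for (int k = 0; k < 6; k++) {
        long long n = ns[k];
        // the 2n² term quickly dwarfs the 3n + 1 term
        printf("n=%-7lld 2n^2=%-13lld 3n+1=%lld\n", n, 2 * n * n, 3 * n + 1);
    }
    return 0;
}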

Note: when comparing algorithms this way, test with large amounts of data; the more, the better.

The time complexity of the algorithm

Definition of algorithm time complexity: in algorithm analysis, the total number of statement executions T(n) is a function of the problem size n. We analyze how T(n) varies with n and determine the order of magnitude of T(n). The time complexity of an algorithm, a measure of its running time, is written T(n) = O(f(n)). It says that as the problem size n grows, the algorithm's execution time grows at the same rate as f(n); this is called the algorithm's asymptotic time complexity, or time complexity for short. Here f(n) is some function of the problem size n.

The key thing to know: number of executions == time.

This notation, which uses a capital O() to express an algorithm's time complexity, is called big-O notation.
In general, as the input size n grows, the algorithm whose T(n) grows most slowly is the optimal algorithm.

Steps for analyzing the time complexity of an algorithm (a worked example follows the list):

  • Replace all additive constants in the running time with the constant 1.
  • In the modified count function, keep only the highest-order term.
  • If the highest-order term exists and is not 1, remove the constant multiplying it.
  • The result is the big-O order.
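
For example, applying these steps to algorithm I from the fourth test, T(n) = 2n² + 3n + 1: step 1 replaces the additive constant +1 with 1; step 2 keeps only the highest-order term, 2n²; step 3 removes the constant 2 multiplying it, leaving n². So T(n) = O(n²).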

Constant order:

int sum = 0, n = 100;
printf("constant order\n");
printf("constant order\n");
printf("constant order\n");
printf("constant order\n");
printf("constant order\n");
printf("constant order\n");
sum = (1+n)*n/2;

By the definition "T(n) is a function of the problem size n": here the number of executions is a fixed constant, independent of n, so we write it as O(1).

Linear order:

Linear order generally involves non-nested loops: as the problem size n grows, the number of operations grows in a straight line.

int i, n = 100, sum = 0;
for( i=0; i < n; i++ )
{
    sum = sum + i;
}

For the code above, the loop's time complexity is O(n), because the code in the loop body must execute n times.

Quadratic order:

int i, j, n = 100;
for( i=0; i < n; i++ )
{
    for( j=0; j < n; j++ )
    {
        printf("I love FishC.com\n");
    }
}

With n equal to 100, every time the outer loop executes once, the inner loop executes 100 times. So for the program to get out of both loops, 100*100 executions are needed, i.e. n squared. The time complexity of this code is therefore O(n²).

What if the two loops' iteration counts are not the same:

int i, j, n = 100;
for( i=0; i < n; i++ )
{
    for( j=i; j < n; j++ )
    {
        printf("different\n");
    }
}

When i = 0, the inner loop executes n times; when i = 1, it executes n-1 times; ... when i = n-1, it executes 1 time. So the total number of executions is:

n + (n-1) + (n-2) + … + 1 = n(n+1)/2 = n²/2 + n/2

Ignoring the constant on the highest-order term n² and the remaining terms, we get O(n²).

In summary: the time complexity of a loop equals the complexity of the loop body multiplied by the number of times the loop runs.

Logarithmic order:

int i = 1, n = 100;
while( i < n )
{
    i = i * 2;
}

Each time i is multiplied by 2, it moves one step closer to n. Suppose it takes x multiplications by 2 for the value to become greater than or equal to n; then the loop exits.
From 2^x = n we get x = log₂n, so the time complexity of this loop is O(log₂n).
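
For example, with n = 100, i takes the values 1, 2, 4, 8, 16, 32, 64 and then reaches 128 ≥ 100, so the loop body runs 7 times; and indeed ⌈log₂100⌉ = 7.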

Time complexity analysis of function calls:

int i, j;
for(i=0; i < n; i++) {       // the loop check executes n+1 times
    function(i);
}
void function(int count) {
    printf("%d", count);     // executes n times in total (once per call)
}
// function's own time complexity is O(1), so the overall time complexity
// is the number of loop iterations: O(n)
int i, j;
for(i=0; i < n; i++) {
    function(i);                 // called n times
}
void function(int count) {
    int j;
    for(j=count; j < n; j++) {
        printf("%d", j);         // executes n - count times per call
    }
}
// count takes the values 0, 1, 2, ..., n-1, so the function body runs
// n, n-1, ..., 1 times; the total is n(n+1)/2 => time complexity O(n²)
void function(int count) {
    int j;
    for(j=1; j < n; j++) {
        printf("%d", j);
    }
}
n++;                          // O(1)
function(n);                  // O(n)
for(i=0; i < n; i++) {
    function(i);              // O(n) body executed n times => O(n²)
}
for(i=0; i < n; i++) {
    for(j=i; j < n; j++) {
        printf("%d", j);      // O(n²)
    }
}
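// Taking the fragment above as a whole: O(1) + O(n) + O(n²) + O(n²);
// keeping only the dominant term, the overall time complexity is O(n²)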

Common time complexities:

Example              Time complexity   Fancy name
5201314              O(1)              constant order
3n + 4               O(n)              linear order
3n² + 4n + 5         O(n²)             quadratic order
3log₂n + 4           O(logn)           logarithmic order
2n + 3nlog₂n + 14    O(nlogn)          nlogn order
n³ + 2n² + 4n + 6    O(n³)             cubic order
2ⁿ                   O(2ⁿ)             exponential order

Ranked from least to most time-consuming, the common time complexities are:

O(1) < O(logn) < O(n) < O(nlogn) < O(n²) < O(n³) < O(2ⁿ) < O(n!) < O(nⁿ)
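
To get a feel for these gaps, take n = 10: logn ≈ 3, n = 10, nlogn ≈ 33, n² = 100, n³ = 1000, 2ⁿ = 1024, and n! = 3628800.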

Worst case and average case:

Suppose we look for a particular number in an array of n random numbers. In the best case the first element is the one we want, and the time complexity is O(1); but the number might also sit in the last position, in which case the time complexity is O(n).
The average running time is the expected running time.
The worst-case running time is a guarantee. In applications this is the most important requirement; usually, unless specified otherwise, the running time we refer to is the worst-case running time.
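
A minimal sketch of the search just described (the helper name find is made up for illustration):

int find(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++)     // best case: key at index 0 => O(1)
        if (a[i] == key)
            return i;               // found after i + 1 comparisons
    return -1;                      // worst case: all n elements examined => O(n)
}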

Space complexity of an algorithm

The space complexity of an algorithm is measured by computing the storage space the algorithm requires. It is written S(n) = O(f(n)), where n is the problem size and f(n) is a function of the storage space the statements occupy in terms of n.
Usually we use "time complexity" to refer to running-time requirements, and "space complexity" to refer to space requirements.
When we are simply asked for the "complexity", it normally means the time complexity.
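
As a rough illustration (a made-up sketch, assuming an array of n ints), compare an in-place reversal, which needs only constant extra space, with one that allocates a full copy:

#include <stdlib.h>

void reverse_in_place(int a[], int n)    // S(n) = O(1): a few temporaries, independent of n
{
    for (int i = 0, j = n - 1; i < j; i++, j--) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

int *reverse_copy(const int a[], int n)  // S(n) = O(n): auxiliary buffer grows with n
{
    int *b = malloc(n * sizeof *b);
    if (b == NULL) return NULL;
    for (int i = 0; i < n; i++)
        b[i] = a[n - 1 - i];
    return b;                            // caller frees
}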

Source: www.cnblogs.com/zyyhxbs/p/11443806.html