Python data analysis: basic concepts of algorithms (part one)

1.1 Algorithms

  • What is computer science?

    • First, be clear that computer science is not just the study of computers. Computers have played a major role in the development of the field, but they are only a tool, and a tool has no soul of its own. Computer science is really the study of problems, of how they get solved, and of the solutions that process produces. Given a problem, a computer scientist's goal is to develop an algorithm that handles it and arrives at a solution, ideally the optimal one. So computer science can also be regarded as the study of algorithms, and an algorithm can be understood as a concrete realization of an idea for handling and solving a problem.

  • How can we picture an algorithm?

    • Before a battle, a commander draws up a strategy whose purpose is to win in the shortest time and at the lowest cost. If writing code is the battlefield, then the programmer is the commander: how can your program reach its final result in the shortest time and with the fewest resources? The algorithm is our strategy!

  • Significance

    • Data structures and algorithms are remarkably universal ways of thinking and apply in any programming language. They are the weapon (the right-hand man) that stays with us longest throughout a coding career. In the end, what experienced programmers compete on is algorithms and data structures.

    • Thinking in terms of data structures and algorithms also broadens our thinking and our coding experience, and helps us fit into every corner of the programming world.

  • What is algorithm analysis?

    • Students who are new to programming often compare their own programs with ones written by others, and in doing so find that the programs are similar but not identical. This raises an interesting question: both programs solve the same problem, yet they look different, so which one is better?

    Example: given a + b + c = 1000 and a^2 + b^2 = c^2, where a, b and c are natural numbers, find all possible combinations of a, b and c.

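    # Method 1: try every (a, b, c) triple, about 1001**3 checks
    for a in range(0, 1001):
        for b in range(0, 1001):
            for c in range(0, 1001):
                if a + b + c == 1000 and a**2 + b**2 == c**2:
                    print(a, b, c)

    # Method 2: derive c from a and b, about 1001**2 checks
    for a in range(0, 1001):
        for b in range(0, 1001):
            c = 1000 - a - b
            # a + b + c == 1000 holds by construction; c must stay non-negative
            if c >= 0 and a**2 + b**2 == c**2:
                print(a, b, c)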
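
    The difference between the two methods can be made concrete by counting how many candidate combinations each one checks. The sketch below is only an illustration: the function names and the smaller target of 100 are assumptions chosen so that it runs quickly, and they are not part of the original example.

```python
def method_one(total):
    """Brute force over every (a, b, c) triple; returns (solutions, checks)."""
    solutions, checks = [], 0
    for a in range(total + 1):
        for b in range(total + 1):
            for c in range(total + 1):
                checks += 1
                if a + b + c == total and a**2 + b**2 == c**2:
                    solutions.append((a, b, c))
    return solutions, checks


def method_two(total):
    """Derive c from a and b; returns (solutions, checks)."""
    solutions, checks = [], 0
    for a in range(total + 1):
        for b in range(total + 1):
            checks += 1
            c = total - a - b
            if c >= 0 and a**2 + b**2 == c**2:
                solutions.append((a, b, c))
    return solutions, checks


if __name__ == "__main__":
    s1, n1 = method_one(100)
    s2, n2 = method_two(100)
    print(s1 == s2)  # both methods find the same combinations
    print(n1, n2)    # 101**3 checks versus 101**2 checks
```

    Both methods return the same combinations, but the second one does so with roughly a thousandth of the checks when the target is 1000.
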
  • How do we judge which program is better?

    • By the computer resources it consumes and how efficiently it runs

    • By measuring the average running time of the algorithm

    • By its time complexity (recommended)

  • Time complexity

    • Evaluation rule: quantify the number of steps (operations) the algorithm executes

    • Most significant term: the term of the step-count expression that dominates as n grows

    • Big O notation expresses time complexity as O(the most significant term of the step-count expression)

    • Common time complexities, from lowest to highest:

      • O(1) < O(logn) < O(n) < O(nlogn) < O(n^2) < O(n^3) < O(2^n) < O(n!) < O(n^n)
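
    To get a feel for this ordering, the following sketch evaluates each growth function at a few values of n. The table of functions and the chosen values of n are illustrative assumptions. For very small n the ordering need not hold (for example, n^3 exceeds 2^n at n = 2).

```python
import math

# Growth functions in the order given above, lowest to highest
GROWTH = [
    ("1", lambda n: 1),
    ("log n", lambda n: math.log2(n)),
    ("n", lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2", lambda n: n**2),
    ("n^3", lambda n: n**3),
    ("2^n", lambda n: 2**n),
    ("n!", lambda n: math.factorial(n)),
    ("n^n", lambda n: n**n),
]


def growth_values(n):
    """Return the value of every growth function at n, in list order."""
    return [func(n) for _, func in GROWTH]


if __name__ == "__main__":
    for n in (10, 20):
        values = growth_values(n)
        print(n, [f"{name}={value:.3g}" for (name, _), value in zip(GROWTH, values)])
```
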

    Example one: compute the time complexity of the following algorithm

    def sumOfN(n):
       theSum = 0
       for i in range(1,n+1):
           theSum += i
       
       return theSum
    print(sumOfN(10))
    # 1 assignment + n additions + 1 return = n + 2 steps ==> O(n)
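
    The step count in the comment can be checked by instrumenting the function. In this sketch, count_steps_sum_of_n is a hypothetical variant of sumOfN that tallies one step for the initial assignment, one per addition in the loop, and one for the return. This accounting convention is an assumption; other conventions change the constant but not the O(n) result.

```python
def count_steps_sum_of_n(n):
    """Return (sum of 1..n, number of basic steps executed)."""
    steps = 0
    the_sum = 0
    steps += 1              # the initial assignment
    for i in range(1, n + 1):
        the_sum += i
        steps += 1          # one addition per iteration
    steps += 1              # the return statement
    return the_sum, steps


if __name__ == "__main__":
    for n in (10, 100, 1000):
        total, steps = count_steps_sum_of_n(n)
        print(n, total, steps)  # steps equals n + 2
```
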

    Example two: compute the time complexity of the following algorithm

    n = 100  # input size (assumed here so that the snippet runs)
    a = 5
    b = 6
    c = 10
    for i in range(n):
        for j in range(n):
            x = i * i
            y = j * j
            z = i * j
    for k in range(n):
        w = a * k + 45
        v = b * b
    d = 33

    # 3 assignments + 3*n*n (nested loops) + 2*n (single loop) + 1 assignment
    # = 3n**2 + 2n + 4 ==> 3n**2 ==> n**2 ==> O(n**2)
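
    Dropping the lower-order terms is justified because they stop mattering as n grows. The sketch below uses a hypothetical helper, count_steps, that tallies the assignments of the example with the same accounting as its comment and compares the total with n**2.

```python
def count_steps(n):
    """Count the assignments executed by the example for input size n."""
    steps = 3                  # a, b, c
    for i in range(n):
        for j in range(n):
            steps += 3         # x, y, z
    for k in range(n):
        steps += 2             # w, v
    steps += 1                 # d
    return steps


if __name__ == "__main__":
    for n in (10, 100, 1000):
        steps = count_steps(n)
        # The ratio approaches 3, the coefficient of the n**2 term
        print(n, steps, steps / n**2)
```
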

1.2 Data structures

  • Concept: the way data (values of basic types such as int, float and char) is organized is called a data structure. A data structure settles how a set of data is stored and in what form.

    Case: we need to store information (name, score) about some students. How should this data be organized, and what is the time complexity of looking up a particular student? (Three ways of organizing it)

    # Way one: list + dictionary
    [
        {'name': 'XXX', 'Score': 'XXX'},
        {'name': 'XXX', 'Score': 'XXX'},
        {'name': 'XXX', 'Score': 'XXX'},
    ]

    # Way two: list + tuple
    [
        ('name', 'Score'),
        ('name', 'Score'),
        ('name', 'Score'),
    ]

    # Way three: dictionary + dictionary
    {
        'zhangsan': {'Score': 'XXX'},
        'Lisi': {'Score': 'XXX'},
    }

    Organizing the data in different ways gives different time complexities for the same query. So an algorithm is designed to solve a practical problem, and a data structure is the carrier of the data that the algorithm works on.
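
    As a rough illustration of why the organization matters, the sketch below looks up one student's score in each of the three layouts. The sample names and scores are made up for the example, and the helper names are hypothetical. The two list-based layouts scan the records one by one, which is O(n) in the number of students; the dictionary keyed by name does a hash lookup, which is O(1) on average.

```python
# Made-up sample data in the three layouts
list_of_dicts = [
    {"name": "zhangsan", "Score": 90},
    {"name": "lisi", "Score": 85},
]
list_of_tuples = [("zhangsan", 90), ("lisi", 85)]
dict_of_dicts = {"zhangsan": {"Score": 90}, "lisi": {"Score": 85}}


def score_from_list_of_dicts(records, name):
    """Linear scan over dictionaries: O(n)."""
    for record in records:
        if record["name"] == name:
            return record["Score"]
    return None


def score_from_list_of_tuples(records, name):
    """Linear scan over tuples: O(n)."""
    for student_name, score in records:
        if student_name == name:
            return score
    return None


def score_from_dict(records, name):
    """Hash lookup by name: O(1) on average."""
    entry = records.get(name)
    return entry["Score"] if entry is not None else None


if __name__ == "__main__":
    print(score_from_list_of_dicts(list_of_dicts, "lisi"))
    print(score_from_list_of_tuples(list_of_tuples, "lisi"))
    print(score_from_dict(dict_of_dicts, "lisi"))
```
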

1.3 Performance analysis of Python data structures

timeit module

  • This module measures how long a piece of Python code takes to run

  • Timer class

    • This class of the timeit module is dedicated to timing a piece of code. Its prototype is given below.

    • Usage: timeit.Timer(stmt='pass', setup='pass')

      • stmt: the statement or code block to be timed.

      • setup: the setup code that must run before the timed code can run.

      • timeit method: timeit.Timer.timeit(number=1000000) runs the statement number times and returns the total time taken, in seconds, as a float. It is the total, not the average; divide by number for the time per run.

  • Example: build a list holding the integers from 0 to n-1 (four ways)

    from timeit import Timer
    def text01():
       alist = []
       for i in range(1000):
           alist.append(i)
       return alist

    def text02():
       alist = []
       for i in range(1000):
           alist += [i]
       return alist

    def text03():
       alist = [ i for i in range(1000)]
       return alist

    def text04():
       alist = list(range(1000))
       return alist

    if __name__ == '__main__':
       t1 = Timer('text01()',setup="from __main__ import text01")
       t_1 = t1.timeit(1000)
       print(t_1)
       
       t2 = Timer('text02()',setup="from __main__ import text02")
       t_2 = t2.timeit(1000)
       print(t_2)
       
       t3 = Timer('text03()',setup="from __main__ import text03")
       t_3 = t3.timeit(1000)
       print(t_3)
       
       t4 = Timer('text04()',setup="from __main__ import text04")
       t_4 = t4.timeit(1000)
       print(t_4)
       
    # Sample output: total seconds for 1000 calls of each function (varies by machine)
    # 0.102241317647497
    # 0.09949069216443718
    # 0.0516077816489684
    # 0.019912033670678397
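
    A single timing run can be noisy. One common refinement, sketched below under the assumption that taking the best of several repeats is acceptable for this comparison, is Timer.repeat, which returns a list of total times. Timer also accepts a callable directly, which avoids the setup import string. The builder functions and the helper best_time are hypothetical names used only for this illustration.

```python
from timeit import Timer


def build_append():
    """Fill a list with append, one element at a time."""
    alist = []
    for i in range(1000):
        alist.append(i)
    return alist


def build_comprehension():
    """Fill a list with a list comprehension."""
    return [i for i in range(1000)]


def best_time(func, number=200, repeat=3):
    """Return (best total seconds over the repeats, seconds per call)."""
    totals = Timer(func).repeat(repeat=repeat, number=number)
    best = min(totals)
    return best, best / number


if __name__ == "__main__":
    for func in (build_append, build_comprehension):
        total, per_call = best_time(func)
        print(func.__name__, total, per_call)
```
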

Origin www.cnblogs.com/lilinyuan5474/p/11498036.html