Linux Study Notes (2) - Introduction and Examples of Processes and Threads


Foreword

This article introduces processes and threads in detail and compares the two. Both are ways of running several tasks at once, and both are used to improve a program's execution efficiency.



Introduction to Multitasking

  • Example: Baidu Netdisk downloads several files at the same time. Modern operating systems are multitasking operating systems.

  • Benefit: multitasking makes fuller use of CPU resources and improves program execution efficiency

Multitasking takes two forms:

  1. Concurrency: multiple tasks are executed alternately within a period of time (typically when the number of tasks > the number of CPU cores)
    • Example: on one core, tasks take turns in short time slices: A 0.01s -> B 0.01s -> C 0.01s -> A 0.01s
  2. Parallelism: multiple tasks are executed at the same instant (possible when the number of tasks <= the number of CPU cores)
    • Example: on a multi-core CPU, the operating system assigns a task to each core, and the cores execute their tasks simultaneously
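
To tie the two definitions to a real machine, here is a minimal sketch. The helper name `execution_mode` is made up for illustration; it only applies the rule of thumb from the list above (tasks <= cores can run in parallel, otherwise they take turns). `os.cpu_count()` is the standard-library call that reports the number of logical CPUs.

```python
import os

def execution_mode(num_tasks, num_cores):
    """Hypothetical helper: classify a workload with the rule of thumb above."""
    if num_tasks <= num_cores:
        return "parallel"    # every task can have a core to itself
    return "concurrent"      # tasks have to take turns on the cores

if __name__ == "__main__":
    cores = os.cpu_count() or 1   # cpu_count() can return None
    print("logical CPUs:", cores)
    print("3 tasks ->", execution_mode(3, cores))
```

This is only a classification by counting; the actual scheduling decision is made by the operating system.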

Introduction to Processes

One way to achieve multitasking is to use processes.

Definition: a process is the smallest unit of resource allocation, that is, the basic unit the operating system uses to allocate resources to a running program. Informally, a running program is a process.

  • Example: WeChat and QQ each run as a process (you can see them in Task Manager)

The role of multiple processes (one way to achieve multitasking): improving the efficiency of program execution


Using Multiple Processes for Multitasking

  1. Process creation steps

    1. Import the process package
      • import multiprocessing
    2. Create a process object from the Process class
      • process_object = multiprocessing.Process()
    3. Start the process to execute the task
      • process_object.start()
  2. Creating a process object from the Process class

    process_object = multiprocessing.Process(target=task_name)

    Parameter   Description
    target      The target task to execute, given as a function (or method) name without parentheses
    name        Process name; usually left unset
    group       Process group; currently must be None
  3. Process creation and startup code

    # Create the child processes
    coding_process = multiprocessing.Process(target=coding)
    music_process = multiprocessing.Process(target=music)
    # Start the processes
    coding_process.start()
    music_process.start()
    
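  4. Example

    import time
    import multiprocessing

    # Task 1: simulate writing code
    def coding():
        for i in range(3):
            print("Coding...")
            time.sleep(0.2)

    # Task 2: simulate listening to music
    def music():
        for i in range(3):
            print("music...")
            time.sleep(0.2)

    if __name__ == '__main__':
        # Calling coding() and then music() directly would run them one after the other
        # coding()
        # music()
        # Create the process objects
        coding_process = multiprocessing.Process(target=coding)
        music_process = multiprocessing.Process(target=music)
        # Start the processes
        coding_process.start()
        music_process.start()

    # Output (the interleaving may differ between runs):
    # music...
    # Coding...
    # music...
    # Coding...
    # music...
    # Coding...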
    
  5. Key point

    Remember the order of the steps for creating a process: import the package, create the process object, then start it.

  6. Executing a task with parameters in a process

    args: pass the arguments as a tuple (the order must match the order of the parameters)

    kwargs: pass the arguments as a dictionary (each key must match a formal parameter name)

    import time
    import multiprocessing

    def coding(num, name):
        for i in range(num):
            print(name)
            print("Coding...")
            time.sleep(0.2)

    def music(count):
        for i in range(count):
            print("music...")
            time.sleep(0.2)

    if __name__ == '__main__':
        # Create the process objects
        # args is matched to coding(num, name) by position
        coding_process = multiprocessing.Process(target=coding, args=(3, "lalal"))
        # kwargs is matched to music(count) by parameter name
        music_process = multiprocessing.Process(target=music, kwargs={"count": 2})
        # Start the processes
        coding_process.start()
        music_process.start()
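
The two forms can also be combined. The sketch below is an illustration rather than part of the original example: the extra `lang` parameter and the return value are assumptions added so the effect is easy to see. The child process calls the target as `target(*args, **kwargs)`, so a direct call with the same tuple and dictionary shows which argument ends up in which parameter.

```python
import multiprocessing

def coding(num, name, lang="python"):
    # "lang" is a hypothetical extra parameter, added only for this sketch
    message = f"{name} codes {num} times in {lang}"
    print(message)
    return message

if __name__ == "__main__":
    args = (3,)                                   # fills num by position
    kwargs = {"name": "lalal", "lang": "python"}  # fills name and lang by key
    # The child process calls coding(*args, **kwargs)
    process = multiprocessing.Process(target=coding, args=args, kwargs=kwargs)
    process.start()
    process.join()
    # The same call made directly in the main process
    coding(*args, **kwargs)
```

Note the trailing comma in `(3,)`: without it, `(3)` is just the integer 3 and not a tuple.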
    
  7. Getting the process ID

    1. Get the ID of the current process

      os.getpid()

    2. Get the ID of the current process's parent

      os.getppid()

    3. Example

      import time
      import multiprocessing
      import os

      def work():
          # Get the ID of the current process
          print("work process ID:", os.getpid())
          # Get the ID of the current process's parent
          print("work parent process ID:", os.getppid())

      def coding(num, name):
          print("coding>>%d:" % os.getpid())
          print("coding_father>>%d:" % os.getppid())
          for i in range(num):
              print(name)
              print("Coding...")
              time.sleep(0.2)

      def music(count):
          print("music>>%d:" % os.getpid())
          print("music_father>>%d:" % os.getppid())
          for i in range(count):
              print("music...")
              time.sleep(0.2)

      if __name__ == '__main__':
          # Create the process objects
          coding_process = multiprocessing.Process(target=coding, args=(3, "lalal"))
          music_process = multiprocessing.Process(target=music, kwargs={"count": 2})
          # Start the processes
          coding_process.start()
          music_process.start()
      
    4. The example involves three processes. Running the program creates the main process, which then creates two child processes, coding and music. As the output shows, both child processes report the same parent process ID, namely that of the main process. Lines printed by different processes can interleave, as in "lalalmusic..." below.

      music>>30304:
      coding>>43616:
      coding_father>>32496:
      lalal
      Coding...
      music_father>>32496:
      music...
      lalalmusic...
      
      Coding...
      lalal
      Coding...
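
The parent/child relationship can also be checked in code instead of by reading the output. The following is a sketch under stated assumptions: `report` and `child_ids` are hypothetical helper names, a `Queue` is used only to send the IDs back to the main process, and the "fork" start method is requested explicitly, so it is meant for Linux or another Unix.

```python
import multiprocessing
import os

def report(queue):
    # Runs in the child: send back (own PID, parent PID)
    queue.put((os.getpid(), os.getppid()))

def child_ids():
    """Hypothetical helper: start one child and return the IDs it reported."""
    ctx = multiprocessing.get_context("fork")   # assumption: Unix-like system
    queue = ctx.Queue()
    child = ctx.Process(target=report, args=(queue,))
    child.start()
    ids = queue.get()    # read the result before waiting for the child
    child.join()
    return ids

if __name__ == "__main__":
    child_pid, parent_pid = child_ids()
    print("child PID:", child_pid)
    print("parent PID seen by the child:", parent_pid)
    print("same as os.getpid() in main:", parent_pid == os.getpid())
```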
      
  8. Global variables are not shared between processes

    1. Creating a child process in effect copies the resources of the main process into a new process. After that, the main process and the child process are independent of each other.

    2. Example:

      import multiprocessing
      import time

      # Define a global variable
      my_list = list()

      def write_data():
          for i in range(3):
              my_list.append(i)
              print("add", i)
          print("write_data", my_list)

      def read_data():
          print("read_data", my_list)

      if __name__ == "__main__":
          # Create one process that writes data and one that reads it
          write_process = multiprocessing.Process(target=write_data)
          read_process = multiprocessing.Process(target=read_data)

          # Start the processes
          write_process.start()
          time.sleep(1)
          # Alternative to sleeping: block the main process until the
          # writing process has finished, then continue
          # write_process.join()
          read_process.start()

      # Output:
      # add 0
      # add 1
      # add 2
      # write_data [0, 1, 2]
      # read_data []
      

      The output shows that processes are independent of each other and do not share global variables. It is like the little monkeys that Sun Wukong (the Monkey King) conjures from his hairs: each one is a separate copy. Creating a child process copies the resources of the main process, so the child is a copy of the main process, like a twin. Processes do not share global variables because each process operates on its own copy; the global variables in different processes merely have the same name.
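
To see both copies side by side, the sketch below has the child send its view of the list back to the main process. It is an illustration under assumptions: `compare_views` is a hypothetical helper, the `Queue` exists only to carry the child's result back, and the "fork" start method is requested explicitly (Linux or another Unix). It also uses `join()` instead of `time.sleep(1)` to wait for the writer.

```python
import multiprocessing

my_list = []   # global variable; the child works on its own copy

def write_data(queue):
    for i in range(3):
        my_list.append(i)        # changes only the child's copy
    queue.put(list(my_list))     # send the child's view to the parent

def compare_views():
    """Hypothetical helper: return (list as seen by child, list as seen by parent)."""
    ctx = multiprocessing.get_context("fork")   # assumption: Unix-like system
    queue = ctx.Queue()
    writer = ctx.Process(target=write_data, args=(queue,))
    writer.start()
    child_view = queue.get()
    writer.join()                # wait for the writer instead of sleeping
    return child_view, list(my_list)

if __name__ == "__main__":
    child_view, parent_view = compare_views()
    print("child sees: ", child_view)
    print("parent sees:", parent_view)
```

The child reports [0, 1, 2] while the list in the main process stays empty, which is the behaviour described above.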

  9. The order in which the main process and child processes end

    1. Take WeChat as an example:

      • The main process opens WeChat and a child process opens a chat window. When WeChat is closed, the chat window is closed first and WeChat itself afterwards.
    2. Example:

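      import time
      import multiprocessing

      def work():
          for i in range(10):
              print("working..")
              time.sleep(0.2)


      if __name__ == '__main__':
          # Create the process object
          work_process = multiprocessing.Process(target=work)

          # Start the process
          work_process.start()
          time.sleep(1)
          print("main process finished")

      # The output shows that the main process waits for the child
      # process to finish before the program exits:
      # working..
      # working..
      # working..
      # working..
      # main process finished
      # working..
      # working..
      # working..
      # working..
      # working..
      # working..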
      
    3. Making the child a daemon of the main process, or destroying the child process manually (either way, the child process is destroyed when the main process ends)

      1. Once the child is set as a daemon, it is destroyed as soon as the main process exits, and the remaining code in the child is not executed.

      2. Example:

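        import time
        import multiprocessing

        def work():
            for i in range(10):
                print("working..")
                time.sleep(0.2)


        # Option 1: make the child a daemon of the main process
        if __name__ == '__main__':
            # Create the process object
            work_process = multiprocessing.Process(target=work)
            # Mark the child as a daemon (must be set before start())
            work_process.daemon = True
            # Start the process
            work_process.start()
            time.sleep(1)
            print("main process finished")

        # Option 2: destroy the child process manually
        if __name__ == '__main__':
            # Create the process object
            work_process = multiprocessing.Process(target=work)

            # Start the process
            work_process.start()
            time.sleep(1)
            # Destroy the child process manually
            work_process.terminate()
            print("main process finished")

        # With either option the child ends together with the main process:
        # working..
        # working..
        # working..
        # working..
        # main process finished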
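
A small sketch of how the effect of `terminate()` can be checked. `stop_child_early` is a hypothetical helper, and the "fork" start method is an assumption (Linux or another Unix). `is_alive()`, `join()` and `exitcode` are regular attributes of `multiprocessing.Process`; on Unix a negative exit code means the child was ended by a signal.

```python
import multiprocessing
import time

def work():
    # Long-running task: about 5 seconds if nobody stops it
    for _ in range(50):
        time.sleep(0.1)

def stop_child_early():
    """Hypothetical helper: start a child, terminate it, report what happened."""
    ctx = multiprocessing.get_context("fork")   # assumption: Unix-like system
    child = ctx.Process(target=work)
    child.start()
    was_running = child.is_alive()   # True: the child has just been started
    child.terminate()                # ask the operating system to stop the child
    child.join()                     # wait until the child is really gone
    return was_running, child.is_alive(), child.exitcode

if __name__ == "__main__":
    was_running, still_running, exitcode = stop_child_early()
    print("running before terminate():", was_running)
    print("running after terminate(): ", still_running)
    print("exit code:", exitcode)
```

Calling `join()` after `terminate()` is a design choice here: it makes sure the child has actually exited before its state is inspected.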
        

      Introduction to Threads

      A process is the smallest unit of resource allocation. Every process that is created is allocated its own resources, which is like having to open two copies of the QQ program so that two people can chat.

      1. Why use multithreading:

        A thread is the smallest unit of program execution. The process is only responsible for allocating resources; it is the threads that use those resources to execute the program. In other words, a process is a container for threads, and every process has at least one thread that executes the program. A thread owns no system resources of its own apart from the few that are essential for it to run, but it shares all the resources owned by its process with the other threads of that process. This is like opening several chat windows (multiple threads) in one QQ program (one process): it achieves multitasking while saving resources.
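
The "container" idea can be made visible with a short sketch: every thread reports the ID of the process it runs in, and all of them write into one shared list. The names `record` and `run_threads` and the thread names `worker-0`, `worker-1`, ... are made up for this illustration.

```python
import os
import threading

def record(seen):
    # Every thread appends to the same list object and reports its process ID
    seen.append((threading.current_thread().name, os.getpid()))

def run_threads(count=3):
    """Hypothetical helper: run a few threads and collect what they report."""
    seen = []   # one list, shared by all threads of this process
    threads = [
        threading.Thread(target=record, args=(seen,), name=f"worker-{i}")
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(seen)

if __name__ == "__main__":
    for name, pid in run_threads():
        print(name, "runs in process", pid)
    print("main thread runs in process", os.getpid())
```

All threads print the same process ID as the main thread, because they live inside the same process.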

      2. The role of multithreading: like multiple processes, multiple threads are a way to achieve multitasking and improve the efficiency of program execution


      Using Multiple Threads for Multitasking

      1. Thread creation steps

        1. Import the thread module

          import threading

        2. Create a thread object from the Thread class

          thread_object = threading.Thread(target=task_name)

          Parameter   Description
          target      The target task to execute, given as a function (or method) name without parentheses
          name        Thread name; usually left unset
          group       Thread group; currently must be None
        3. Start the thread to execute the task

          thread_object.start()

      2. Code for thread creation and startup

        # Create the child threads
        coding_thread = threading.Thread(target=coding)
        music_thread = threading.Thread(target=music)
        # Start the threads
        coding_thread.start()
        music_thread.start()
        
      3. Example

        import time
        import threading

        def coding():
            for i in range(3):
                print("Coding...")
                time.sleep(0.2)

        def music():
            for i in range(3):
                print("music...")
                time.sleep(0.2)

        if __name__ == '__main__':
            # coding()
            # music()
            # Create the child threads
            coding_thread = threading.Thread(target=coding)
            music_thread = threading.Thread(target=music)

            # Start the threads
            coding_thread.start()
            music_thread.start()

        # Output:
        # Coding...
        # music...
        # Coding...
        # music...
        # Coding...
        # music...
        
      4. Executing a task with parameters in a thread (args and kwargs work the same way as for processes)
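
For completeness, here is a minimal sketch of the thread version. It mirrors the process example, with one assumption added for illustration: the functions append to a shared `results` list instead of printing, so that the outcome can be inspected afterwards. `run_tasks` is a hypothetical helper name.

```python
import threading

def coding(num, name, results):
    for i in range(num):
        results.append(f"{name} coding {i}")

def music(count, results):
    for i in range(count):
        results.append(f"music {i}")

def run_tasks():
    """Hypothetical helper: run both tasks in threads and return what they produced."""
    results = []
    # args: a tuple, matched to coding(num, name, results) by position
    coding_thread = threading.Thread(target=coding, args=(3, "lalal", results))
    # kwargs: a dictionary, matched to music(count, results) by parameter name
    music_thread = threading.Thread(target=music, kwargs={"count": 2, "results": results})
    coding_thread.start()
    music_thread.start()
    coding_thread.join()
    music_thread.join()
    return results

if __name__ == "__main__":
    for line in run_tasks():
        print(line)
```

The order of the lines can differ between runs, because the two threads run at the same time.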

      5. The order in which the main thread and child threads end

        1. The main thread waits for all child threads to finish before it ends

        2. Example

          import time
          import threading

          def work():
              for i in range(10):
                  print("working..")
                  time.sleep(0.2)

          if __name__ == '__main__':
              # Create the thread object
              work_thread = threading.Thread(target=work)

              # Start the thread
              work_thread.start()
              time.sleep(1)
              print("main thread finished")

          # The result matches the process example: the program keeps
          # running until the child thread is done
          # working..
          # working..
          # working..
          # working..
          # working..
          # main thread finished
          # working..
          # working..
          # working..
          # working..
          # working..
          
      6. Setting a daemon thread so that child threads are destroyed with the main thread (same idea as the process example)

        # Option 1: pass daemon=True when creating the thread
        import time
        import threading

        def work():
            for i in range(10):
                print("work..")
                time.sleep(0.2)

        if __name__ == "__main__":

            work_thread = threading.Thread(target=work, daemon=True)
            work_thread.start()

            time.sleep(1)
            print("main thread finished")

        # Option 2: call setDaemon(True) before start()
        import time
        import threading

        def work():
            for i in range(10):
                print("work..")
                time.sleep(0.2)

        if __name__ == "__main__":

            work_thread = threading.Thread(target=work)
            work_thread.setDaemon(True)  # deprecated since Python 3.10
            work_thread.start()

            time.sleep(1)
            print("main thread finished")
        

        The purpose of a daemon thread is that the child thread is destroyed when the main thread exits, so the main thread does not have to wait for the child thread to finish.

        There are two ways to make a thread a daemon:

        1. threading.Thread(target=name, daemon=True)

        2. thread_object.setDaemon(True) (deprecated since Python 3.10; assigning thread_object.daemon = True is the equivalent attribute form)

      7. Execution order among threads

        1. Threads execute in no fixed order; CPU scheduling determines which thread runs first

        2. Get information about the current thread

          # Get the current thread object with current_thread()
          current_thread = threading.current_thread()
          # Printing it shows information about the current thread,
          # for example the order in which it was created
          print(current_thread)
          
          import time
          import threading


          def get_info():
              time.sleep(0.5)
              # Get information about the current thread
              current_thread = threading.current_thread()
              print(current_thread)

          if __name__ == '__main__':
              # Create the child threads
              for i in range(10):
                  # target is the function name, without parentheses
                  sub_thread = threading.Thread(target=get_info)
                  sub_thread.start()

          # Output (threads 4 and 5, and 6 to 9, finish out of creation order):
          # <Thread(Thread-1, started 20796)>
          # <Thread(Thread-2, started 20740)>
          # <Thread(Thread-3, started 38556)>
          # <Thread(Thread-5, started 27744)>
          # <Thread(Thread-4, started 13812)>
          # <Thread(Thread-9, started 41968)>
          # <Thread(Thread-6, started 26052)>
          # <Thread(Thread-7, started 11520)>
          # <Thread(Thread-8, started 10692)>
          # <Thread(Thread-10, started 36928)>
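
The same experiment can be written so that the finishing order is collected in a list instead of being read off the screen. This is a sketch with assumed names (`finish_order`, thread names `sub-0`, `sub-1`, ...). `join()` is used so the main thread reads the list only after every child thread has finished.

```python
import threading
import time

def get_info(finished):
    time.sleep(0.05)
    # Record the name of the thread that has just finished its work
    finished.append(threading.current_thread().name)

def finish_order(count=5):
    """Hypothetical helper: return thread names in the order they finished."""
    finished = []
    threads = [
        threading.Thread(target=get_info, args=(finished,), name=f"sub-{i}")
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()        # wait until every thread is done
    return finished

if __name__ == "__main__":
    print("creation order:", [f"sub-{i}" for i in range(5)])
    print("finish order:  ", finish_order(5))
```

Every thread always appears exactly once, but the finish order is decided by the scheduler and may differ from the creation order.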
          
      8. Global variables are shared between threads

        import threading
        import time

        # Define a global variable
        my_list = list()


        def write_data():
            for i in range(3):
                my_list.append(i)
                print("add", i)
            print("write_data", my_list)


        def read_data():
            print("read_data", my_list)


        if __name__ == "__main__":
            # Create one thread that writes data and one that reads it
            write_thread = threading.Thread(target=write_data)
            read_thread = threading.Thread(target=read_data)

            # Start the threads
            write_thread.start()
            time.sleep(1)
            # Alternative to sleeping: block the main thread until the
            # writing thread has finished, then continue
            # write_thread.join()
            read_thread.start()

        # The output confirms that threads share global variables:
        # add 0
        # add 1
        # add 2
        # write_data [0, 1, 2]
        # read_data [0, 1, 2]
        
      9. Sharing global variables between threads can produce wrong results

        • Requirements

          1. Define two functions that each loop 1 million times and add 1 to a global variable on every iteration
          2. Create two child threads that run the two functions, and check the computed result

          import threading

          glob_v = 0

          # Add 1 to the global variable
          def sum_num1():
              global glob_v
              for i in range(1000000):  # test with 100,000 and with 1,000,000
                  glob_v += 1
              print("glob_v1", glob_v)

          def sum_num2():
              global glob_v
              for i in range(1000000):
                  glob_v += 1
              print("glob_v2", glob_v)

          if __name__ == "__main__":
              sum1_thread = threading.Thread(target=sum_num1)
              sum2_thread = threading.Thread(target=sum_num2)

              # Start the threads
              sum1_thread.start()
              sum2_thread.start()

          # Output
          # With 100,000 iterations:
          # glob_v1 100000
          # glob_v2 200000
          # With 1,000,000 iterations (wrong result):
          # glob_v1 1166860
          # glob_v2 1260019

          While one thread is updating the global variable, the other thread may update it as well. `glob_v += 1` consists of reading the value, adding 1, and writing the result back. If both threads read the same old value, both write back the same new value, so two additions increase the variable by only 1.
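
The lost update can be replayed by hand, without any threads. The sketch below is a deterministic simulation, not a real race: the names `a_read` and `b_read` stand for the values that two imaginary threads A and B have read, and the statements are written in the unlucky order described above.

```python
def lost_update():
    """Replay the unlucky interleaving of two 'glob_v += 1' operations."""
    glob_v = 0
    a_read = glob_v        # thread A reads 0
    b_read = glob_v        # thread B reads 0 before A has written back
    glob_v = a_read + 1    # thread A writes 1
    glob_v = b_read + 1    # thread B also writes 1 and overwrites A's update
    return glob_v

if __name__ == "__main__":
    print(lost_update())   # prints 1, although two increments were performed
```

In a real program the interleaving is chosen by the scheduler, which is why the error shows up only sometimes and more often with a larger number of iterations.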

        • Solution

          • Thread synchronization: the threads coordinate and run in an agreed order, like walkie-talkies in real life (half-duplex communication, where only one side talks at a time)
          • With thread synchronization, only one thread can operate on the global variable at any given time
      10. Thread synchronization method: the mutex (mutual exclusion lock)

        1. Definition: a lock placed on shared data so that only one thread can operate on the data at a time.

        2. Note: the threads compete for the mutex. The thread that acquires the lock runs first, and the threads that did not get it wait. Once the lock has been released, the waiting threads compete for it again.

        3. Usage steps

          1. Create the mutex

            mutex = threading.Lock()

          2. Lock

            mutex.acquire()

          3. Unlock

            mutex.release()

        4. Example:

          import threading

          # Define a global variable
          glob_v = 0

          # Add 1 to the global variable
          def sum_num1():
              global glob_v
              # Lock
              mutex.acquire()
              for i in range(1000000):  # test with 100,000 and with 1,000,000
                  glob_v += 1
              # Unlock
              mutex.release()
              print("glob_v1", glob_v)

          def sum_num2():
              global glob_v
              # Lock
              mutex.acquire()
              for i in range(1000000):  # test with 100,000 and with 1,000,000
                  glob_v += 1
              # Unlock
              mutex.release()
              print("glob_v2", glob_v)

          if __name__ == "__main__":
              # Create the lock
              mutex = threading.Lock()
              sum1_thread = threading.Thread(target=sum_num1)
              sum2_thread = threading.Thread(target=sum_num2)

              # Start the threads
              sum1_thread.start()
              sum2_thread.start()

          # Output:
          # glob_v1 1000000
          # glob_v2 2000000
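
A variation on the example above, as a sketch: `threading.Lock` can be used in a `with` statement, which acquires the lock on entry and releases it on exit, even if an error occurs inside the block. The names `counter`, `add` and `run` are chosen for this illustration, and the lock is taken once per increment rather than around the whole loop.

```python
import threading

counter = 0
mutex = threading.Lock()

def add(times):
    global counter
    for _ in range(times):
        with mutex:          # acquire() on entry, release() on exit
            counter += 1

def run(times=100_000):
    """Hypothetical helper: run two adding threads and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add, args=(times,)) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter

if __name__ == "__main__":
    print(run())   # always 2 * times, because every increment is protected
```

Holding the lock around the whole loop, as in the example above, makes the two threads run one after the other. Locking each increment lets them take turns, at the cost of many more lock operations. Both variants give the correct total.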
          
      11. Deadlock

        • A deadlock is the situation in which threads wait indefinitely for each other to release a lock
        • Result: the application stops responding and cannot handle any other tasks

        In real work, always make sure every lock that is acquired is released again.
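
The simplest way to get stuck is to forget `release()`. The sketch below shows the effect without hanging the program, by using the `timeout` parameter of `acquire()`, which returns False if the lock could not be obtained in time. The helper names are made up for this illustration.

```python
import threading

def forgotten_release():
    """Hypothetical helper: what happens when a lock is never released."""
    mutex = threading.Lock()
    mutex.acquire()                       # first acquire succeeds
    # release() was forgotten here, so a plain acquire() would wait forever.
    # The timeout is used only so that this demo returns.
    got_lock = mutex.acquire(timeout=0.1)
    mutex.release()                       # clean up the first acquire
    return got_lock

def released_by_with():
    """The with statement releases the lock even if the body raises."""
    mutex = threading.Lock()
    try:
        with mutex:
            raise ValueError("something went wrong while holding the lock")
    except ValueError:
        pass
    return mutex.locked()                 # False: the lock is free again

if __name__ == "__main__":
    print("second acquire succeeded:", forgotten_release())
    print("lock still held after with:", released_by_with())
```

Using `with mutex:` (or `try`/`finally` around `acquire()` and `release()`) is one common way to make sure the lock is always released.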

      12. Process vs thread

        Relationship
          Process: a process has one thread by default, but it can have multiple threads
          Thread:  threads are attached to processes; there are no threads without a process

        Differences
          Process:
            1. Global variables cannot be shared
            2. Creating a process has a high resource overhead
            3. The process is the basic unit of resource allocation
          Thread:
            1. Global variables are shared, so take care to prevent resource contention and deadlocks (available solutions: a mutex or thread synchronization)
            2. Creating a thread has a very low resource overhead
            3. The thread is the basic unit of CPU scheduling
            4. Threads cannot run on their own; they must exist inside a process

        Advantages and disadvantages
          Process: can use multiple cores; high resource overhead
          Thread:  low resource overhead; cannot use multiple cores (in standard CPython, because of the global interpreter lock)

Summary

The author runs Ubuntu 18.04 in a VMware virtual machine on Windows 10. The programming language is Python 3 and the IDE is PyCharm 19.03. This is for reference only. The author is also a beginner, so corrections are welcome if anything here is wrong!


Origin blog.csdn.net/weixin_43357695/article/details/114683183