Python Network Programming: The Global Interpreter Lock (GIL)

The GIL

  1. The GIL

    • Definition: The global interpreter lock is a mutex held by the CPython interpreter. It turns would-be concurrency into serial execution: only one thread at a time may use the interpreter. This sacrifices efficiency, but it keeps the interpreter's internal data safe.

    • How a .py file executes in memory:

      • When a .py file is executed, a process is opened in memory.
      • The process contains not only the .py file but also the Python interpreter; a thread hands the code in the .py file over to the interpreter.
      • The interpreter compiles the Python code into bytecode it can recognize, and the virtual machine then interprets that bytecode into the binary code the CPU finally executes.
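      The bytecode stage described above can be inspected with the standard-library `dis` module (the function `add` below is our own illustrative example, not from the original post):

      ```python
      import dis

      def add(a, b):
          return a + b

      # Disassemble the function to see the bytecode the interpreter
      # thread actually executes (opcode names vary by CPython version).
      dis.dis(add)
      ```

      Each line of the disassembly is one instruction the CPython virtual machine executes while holding the GIL.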

      The original post illustrates this with a diagram: when thread 1 acquires the GIL, threads 2 and 3 can only wait. When thread 1 hits blocking I/O, or its CPU time slice expires, it is suspended and releases the GIL; thread 2 or thread 3 then acquires the lock and enters the interpreter. The same happens again on the next block or expired time slice, until the last thread gets its turn in the interpreter.

      From the above it follows that for a single process containing multiple threads, the presence of the GIL means CPython cannot use multiple cores in parallel; it can only run those threads concurrently on a single core.

      Multiple processes, however, each with their own interpreter and their own GIL, can make use of multiple cores.

    • Role: 1. it keeps the interpreter's internal data safe; 2. it provides a built-in coarse lock, reducing the burden on developers.

    • Drawback: multiple threads within a single process cannot take advantage of multiple cores.

    • How to decide between multi-threading and multi-processing for concurrency

      For computation, more CPU is better; for I/O, more CPU is largely useless.

        Of course, when running any program, more CPU will improve efficiency to some degree (however small the improvement), because virtually no program is purely computational or purely I/O. What matters is whether a program is, on balance, compute-intensive or I/O-intensive. Consider the following:

      # Analysis:
      We have four tasks to handle, and we want them to run concurrently. Two options:
      Option 1: start four processes
      Option 2: start four threads inside one process
      
      # Single-core machine, analysis:
        If the four tasks are compute-intensive, there are no extra cores for parallel computation; option 1 merely adds process-creation overhead, so option 2 wins.
        If the four tasks are I/O-intensive, option 1 pays a high process-creation cost and processes switch far more slowly than threads, so option 2 wins.
      
      # Multi-core machine, analysis:
        If the four tasks are compute-intensive, multiple cores allow true parallel computation. Since in Python only one thread per process executes at any instant, only processes can exploit the cores: option 1 wins.
        If the four tasks are I/O-intensive, no number of cores solves the I/O waits, so option 2 wins.
      
      
      # Conclusion: today's machines are almost all multi-core. For compute-intensive tasks, multi-threading in Python brings little performance gain and can even be slower than serial execution (which avoids all the switching). For I/O-intensive tasks, however, threads still give a significant improvement.

      Summary: on a multi-core machine, use multi-threading for I/O-intensive tasks and multi-processing for compute-intensive tasks.

  2. Verifying CPython's concurrency efficiency

    • Compute-intensive tasks

      from multiprocessing import Process
      from threading import Thread
      import os, time

      def work():
          res = 0
          for i in range(100000000):
              res *= i

      if __name__ == '__main__':
          l = []
          print(os.cpu_count())  # 8 cores on this machine
          start = time.time()
          for i in range(8):
              p = Process(target=work)  # takes a bit over 7 s
              # p = Thread(target=work)  # takes a bit over 15 s
              l.append(p)
              p.start()
          for p in l:
              p.join()
          stop = time.time()
          print('run time is %s' % (stop - start))

      As the timings show, when the task is compute-intensive, multi-processing is more efficient than multi-threading.

    • I/O-intensive tasks

      from multiprocessing import Process
      from threading import Thread
      import os, time

      def work():
          time.sleep(2)
          print('===>')

      if __name__ == '__main__':
          l = []
          print(os.cpu_count())  # 4 cores on this machine
          start = time.time()
          for i in range(40):
              # p = Process(target=work)  # a bit over 5 s, mostly spent creating processes
              p = Thread(target=work)  # a bit over 2 s
              l.append(p)
              p.start()
          for p in l:
              p.join()
          stop = time.time()
          print('run time is %s' % (stop - start))

      As the timings show, when the task is I/O-intensive, multi-threading is more efficient than multi-processing.

  3. The relationship between the GIL and mutex locks

    1. The GIL protects data inside the interpreter; a mutex (Lock) protects shared data in your own code.
    2. The GIL is acquired and released automatically; a mutex must be locked and unlocked manually.

    All threads compute-intensive: when the program starts 100 threads, the first thread must first acquire the GIL, then acquire the Lock, release the Lock, and finally release the GIL.

    All threads I/O-intensive: when the program starts 100 threads, the first thread first acquires the GIL and then the Lock. When it hits I/O, the CPU switches away and the GIL is released. The second thread then acquires the GIL, but because the Lock has not yet been released, it blocks and is suspended; the same happens to the third, and so on.

    Summary: when adding your own mutex, lock only the code that touches the shared data; do not widen the locked region beyond that.
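    A minimal sketch of that rule (the counter and thread counts are our own illustration): only the shared-data update is wrapped in the lock, nothing else. Note that even with the GIL, `counter += 1` is several bytecode steps, so without the Lock the result can come up short.

    ```python
    from threading import Thread, Lock

    counter = 0
    lock = Lock()

    def add_n(n):
        global counter
        for _ in range(n):
            with lock:          # lock only the shared-data update
                counter += 1    # the loop itself runs unlocked

    threads = [Thread(target=add_n, args=(100000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(counter)  # 400000
    ```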

  4. Process pools and thread pools

    Process pool: a container that holds processes.

    Thread pool: a container that holds threads.
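    Before the socket example, here is a minimal sketch of the thread-pool API itself (the `square` task is our own illustration): `submit()` hands a task to the pool and returns a Future; the pool's fixed set of worker threads runs the tasks.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def square(x):
        return x * x

    # A pool of 3 worker threads; submit() returns a Future immediately,
    # and result() blocks until that task has finished.
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(square, i) for i in range(5)]
        results = [f.result() for f in futures]

    print(results)  # [0, 1, 4, 9, 16]
    ```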

    Example: socket communication implemented with multiple threads:

    import socket
    from threading import Thread

    def communication(conn):
        while 1:
            try:
                from_client_data = conn.recv(1024)  # blocks
                print(from_client_data.decode('utf-8'))
                to_client_data = input('>>>').strip()
                conn.send(to_client_data.encode('utf-8'))
            except Exception:
                break
        conn.close()

    def customer_service():
        server = socket.socket()
        server.bind(('127.0.0.1', 8080))
        server.listen()
        while 1:
            conn, addr = server.accept()  # blocks
            print(f'Client {addr}:')
            t = Thread(target=communication, args=(conn,))
            t.start()
        server.close()

    if __name__ == '__main__':
        customer_service()

    Although multi-threading lets the server talk to multiple clients, threads cannot be opened without limit; the number of threads (or processes) must be capped at what the machine can sustain. That is exactly what a thread pool (or process pool) is for. As follows:

    import socket
    from concurrent.futures import ThreadPoolExecutor

    def communication(conn):
        while 1:
            try:
                from_client_data = conn.recv(1024)  # blocks
                print(from_client_data.decode('utf-8'))
                to_client_data = input('>>>').strip()
                conn.send(to_client_data.encode('utf-8'))
            except Exception:
                break
        conn.close()

    def customer_service(t):
        server = socket.socket()
        server.bind(('127.0.0.1', 8080))
        server.listen()
        while 1:
            conn, addr = server.accept()  # blocks
            print(f'Client {addr}:')
            t.submit(communication, conn)
        server.close()

    if __name__ == '__main__':
        t = ThreadPoolExecutor(2)
        customer_service(t)

    What is the difference between a thread pool and a semaphore?

    A thread pool controls the number of threads that actually do the work, reusing threads to reduce memory overhead. The number of pool threads that can work at the same time is fixed; tasks beyond that number wait in the pool's queue until a worker thread becomes available.

    With a Semaphore, every thread you create really exists, but the number allowed to execute the guarded section at the same time is limited. With a thread pool, what you create are merely tasks submitted to the pool; the threads that actually do the work are created and managed by the pool itself.
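    The Semaphore side of that contrast can be sketched as follows (the worker function and counters are our own illustration): all six threads really exist, but at most two are inside the guarded section at once.

    ```python
    import threading
    import time

    sem = threading.Semaphore(2)   # at most 2 threads inside at once
    active = 0                     # threads currently in the guarded section
    peak = 0                       # highest value 'active' ever reached
    guard = threading.Lock()       # protects the two counters

    def worker():
        global active, peak
        with sem:                  # acquire one of the 2 permits
            with guard:
                active += 1
                peak = max(peak, active)
            time.sleep(0.05)       # pretend to do some work
            with guard:
                active -= 1

    threads = [threading.Thread(target=worker) for _ in range(6)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(peak)  # never exceeds 2, even though 6 threads were created
    ```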

    https://blog.csdn.net/mryang125/article/details/81490783


Origin www.cnblogs.com/yaoqi17/p/11260062.html