GIL Global Interpreter Lock

I. Introduction

  • The GIL is essentially a mutex, and all mutexes work the same way: they turn concurrent execution into serial execution so that shared data can only be modified by one task at a time, thereby ensuring data safety

  • To protect different pieces of shared data, you should use different locks

  • Pros and cons of the GIL (Global Interpreter Lock)

    • Advantage
      • Ensures data safety
    • Disadvantage
      • With multiple threads in a single process, parallelism is impossible; only concurrency can be achieved, which sacrifices efficiency
    # The GIL does not protect your own data: all 10 threads read n == 100
    # before any of them writes back, so every thread writes 99.
    import time
    from threading import Thread

    n = 100

    def task():
        global n
        m = n            # read the shared value
        time.sleep(3)    # every thread parks here holding the stale copy
        n = m - 1        # so every thread writes 100 - 1

    if __name__ == '__main__':
        list1 = []
        for _ in range(10):
            t = Thread(target=task)
            t.start()
            list1.append(t)
        for t in list1:
            t.join()
        print(n)  # 99, not 90: a race condition despite the GIL
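
The example above shows that the GIL does not protect application data, which is exactly why you add your own locks for your own shared data. A minimal sketch of the fix, putting an explicit threading.Lock around the read-modify-write (the sleep is shortened because the locked sections now run one after another):

import time
from threading import Thread, Lock

n = 100
lock = Lock()

def task():
    global n
    with lock:           # only one thread may read-modify-write n at a time
        m = n
        time.sleep(0.1)  # shortened from 3 s: the ten critical sections run serially
        n = m - 1

if __name__ == '__main__':
    threads = [Thread(target=task) for _ in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(n)  # 90: each decrement now sees the previous one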

II. Using multithreading to improve efficiency

"""
分析:
我们有四个任务需要处理,处理方式肯定是要玩出并发的效果,解决方案可以是:
方案一:开启四个进程
方案二:一个进程下,开启四个线程

单核情况下,分析结果: 
  如果四个任务是计算密集型,没有多核来并行计算,方案一徒增了创建进程的开销,方案二胜
  如果四个任务是I/O密集型,方案一创建进程的开销大,且进程的切换速度远不如线程,方案二胜

多核情况下,分析结果:
  如果四个任务是计算密集型,多核意味着并行计算,在python中一个进程中同一时刻只有一个线程执行用不上多核,方案一胜
  如果四个任务是I/O密集型,再多的核也解决不了I/O问题,方案二胜

 
结论:现在的计算机基本上都是多核,python对于计算密集型的任务开多线程的效率并不能带来多大性能上的提升,甚至不如串行(没有大量切换),但是,对于IO密集型的任务效率还是有显著提升的。
"""

Compute-intensive tasks: multiprocessing is more efficient

# Compute-intensive: multiprocessing is more efficient
from multiprocessing import Process
from threading import Thread
import os, time

def work():
    # pure CPU work, no I/O
    res = 0
    for i in range(10000000):
        res *= i

if __name__ == '__main__':
    l = []
    print(os.cpu_count())
    start_time = time.time()
    for i in range(4):
        p = Process(target=work)    # elapsed time: 1.021416187286377
        # p = Thread(target=work)   # elapsed time: 2.3506696224212646
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop_time = time.time()
    print(f'elapsed time: {stop_time - start_time}')

I/O-intensive tasks: multithreading is more efficient

# I/O-intensive: multithreading is more efficient
from multiprocessing import Process
from threading import Thread
import os, time

def work():
    # simulate an I/O wait
    time.sleep(2)
    print('--->')

if __name__ == '__main__':
    l = []
    print(os.cpu_count())
    start_time = time.time()
    for i in range(400):
        # p = Process(target=work)  # elapsed time: 15.45025897026062
        p = Thread(target=work)     # elapsed time: 2.0674052238464355
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop_time = time.time()
    print(f'elapsed time: {stop_time - start_time}')

  • Scenarios
    • Multithreading suits I/O-intensive work, such as sockets, web crawlers, and web services
    • Multiprocessing suits compute-intensive work, such as financial analysis

III. Coroutines

  • Process: the unit of resource allocation
  • Thread: the unit of execution
  • Coroutine: concurrency within a single thread

For I/O-intensive workloads, coroutines give the greatest efficiency improvement.

  • Advantage
    • Switching at the application level is far faster than switching by the operating system
  • Disadvantage
    • Once a task blocks without switching away, the whole thread blocks, and no other task in that thread can execute

Once coroutines are introduced, all I/O within the single thread must be detected so that a switch happens on every I/O operation; missing even one breaks the scheme, because the moment one task blocks, the whole thread blocks, and the other tasks cannot run even when they have pure computation left to do.

  • Purpose of coroutines
    • Achieve concurrency within a single thread
    • Concurrency means multiple tasks appear to run at the same time
    • Concurrency = switching + saving state
    • PS: manually implementing "switch on I/O + save state" deceives the operating system into thinking the thread has no I/O, so it keeps granting the CPU to your program
  • Summary of coroutine features
    • Concurrency is implemented within one single thread
    • Modifying shared data does not require locks
    • The user program keeps its own control-flow stack, saving the context of multiple tasks
    • A coroutine automatically switches to another coroutine when it encounters an I/O operation (detecting I/O is something yield and greenlet cannot do; the gevent module is used for that, as the sketches below illustrate)
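
As a first illustration, here is a minimal sketch of "concurrency = switching + saving state" using plain generators (with hypothetical task functions). Each yield saves the task's local state and returns control to the hand-written scheduler loop; note that yield only switches where we say so and cannot detect real I/O:

def task1():
    for i in range(3):
        print(f'task1 step {i}')
        yield              # switch point: local state is saved here

def task2():
    for i in range(3):
        print(f'task2 step {i}')
        yield

g1, g2 = task1(), task2()
for _ in range(3):
    next(g1)   # resume task1 until its next yield
    next(g2)   # then task2; the output interleaves the two tasks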

Greenlet module introduction

If we have 20 tasks within a single thread, switching between them with yield generators is too cumbersome (each generator has to be initialized first and then driven with send, which is a lot of trouble), whereas the greenlet module can switch among these 20 tasks directly and very simply.
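
For comparison, a minimal sketch with greenlet (a third-party package, installed with pip install greenlet); switch() jumps directly between tasks with no generator ceremony, though the switching is still manual and blind to I/O:

from greenlet import greenlet

def eat():
    print('eat 1')
    g2.switch()      # jump to play(), saving eat()'s position
    print('eat 2')
    g2.switch()

def play():
    print('play 1')
    g1.switch()      # jump back into eat() where it left off
    print('play 2')

g1 = greenlet(eat)
g2 = greenlet(play)
g1.switch()          # prints: eat 1, play 1, eat 2, play 2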

Gevent module introduction

Gevent is a third-party library that makes it easy to write synchronous or asynchronous concurrent programs. The main pattern used in gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. Greenlets all run inside the main operating system process, but they are scheduled cooperatively.

from gevent import monkey   # monkey patch
monkey.patch_all()  # patch blocking calls (e.g. time.sleep) so gevent can detect I/O in every task
from gevent import spawn    # spawn schedules a task as a greenlet
from gevent import joinall
import time


def task1():
    print('start from task1...')
    time.sleep(1)
    print('end from task1...')


def task2():
    print('start from task2...')
    time.sleep(3)
    print('end from task2...')


def task3():
    print('start from task3...')
    time.sleep(5)
    print('end from task3...')

if __name__ == '__main__':
    start_time = time.time()
    sp1 = spawn(task1)
    sp2 = spawn(task2)
    sp3 = spawn(task3)
    joinall([sp1, sp2, sp3])

    end_time = time.time()

    print(f'elapsed time: {end_time - start_time}')
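
Because monkey.patch_all() replaces blocking calls such as time.sleep with cooperative versions, the three sleeps overlap: the elapsed time printed is roughly 5 seconds (the longest task) rather than the serial 1 + 3 + 5 = 9 seconds.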

Origin: www.cnblogs.com/YGZICO/p/12025086.html