Concurrent Programming: Lock Problems

Python GIL (Global Interpreter Lock)

How the official documentation describes the GIL

'''
Definition:
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple 
native threads from executing Python bytecodes at once. This lock is necessary mainly 
because CPython’s memory management is not thread-safe. (However, since the GIL 
exists, other features have grown to depend on the guarantees that it enforces.)
'''

Conclusion: under the CPython interpreter, when multiple threads are started within one process, only one thread executes at any given moment, so threads cannot take advantage of multiple cores.

First, be clear that the GIL is not a feature of the Python language. It is a concept introduced by one implementation of the Python interpreter (CPython). Because CPython is the default Python runtime in most environments, many people equate CPython with Python and take it for granted that the GIL is a deficiency of the language itself. So let us be clear up front: the GIL is not a feature of Python, and Python does not have to depend on the GIL.

Introduction to the GIL

The GIL is essentially a mutex, and all mutexes work the same way: they turn concurrent execution into serial execution, so that shared data can be modified by only one task at a time, which keeps the data safe.

One thing is certain: protecting different data requires different locks.

To understand the GIL, first establish one thing: every time you run a Python program, a separate process is created. For example, python test.py, python aaa.py, and python bbb.py produce three different Python processes.

'''
# Verify that python test.py creates only one process
# Contents of test.py
import os,time
print(os.getpid())  # the process ID of the Python interpreter
time.sleep(1000)
'''
python3 test.py
# On Windows
tasklist | findstr python
# On Linux
ps aux | grep python
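
The same point can be checked from inside Python. The following is a minimal sketch, not part of the original post: the helper name child_pid is hypothetical, and it assumes sys.executable points at a working interpreter. It launches a second interpreter the way python test.py would and shows that the child reports a different process ID from the parent.

```python
import os
import subprocess
import sys

def child_pid():
    # Launch a separate interpreter, as `python test.py` would,
    # and read back the process ID that the child prints.
    out = subprocess.run(
        [sys.executable, '-c', 'import os; print(os.getpid())'],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())

if __name__ == '__main__':
    print('parent:', os.getpid())
    print('child: ', child_pid())
```

Each call to child_pid starts a fresh process, so the value it returns never equals the parent's os.getpid().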

Within one Python process there is not only the main thread of test.py and any threads started by that main thread, but also interpreter-level work such as the garbage collection that the interpreter starts. In short, all of these threads run inside this one process.

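1. All data is shared, and code, being a kind of data, is also shared by all threads (all the code of test.py and all the code of the CPython interpreter).
For example, test.py defines a function work (the original post shows its code in a figure). Every thread in the process can access the code of work, so we can start three threads whose target all point to that code. Being able to access it means being able to execute it.

2. Every thread's task must pass its code as an argument to the interpreter's code for execution. That is, for any thread to run its task, it must first gain access to the interpreter's code.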
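
The two points above can be illustrated with a small sketch. The body of work here is an assumption, since the original figure is not available: three threads share the same target, and each records the same process ID because they all run inside one process.

```python
import os
from threading import Thread

def work(results):
    # Every thread executes this same function object
    # and sees the same process ID.
    results.append(os.getpid())

def run_three_threads():
    results = []
    threads = [Thread(target=work, args=(results,)) for _ in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == '__main__':
    print(run_three_threads(), os.getpid())
```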

In summary:

If multiple threads have target=work, execution proceeds as follows:

The threads first access the interpreter's code, that is, they obtain permission to execute, and then hand the target code to the interpreter's code for execution.
The interpreter's code is shared by all threads, so garbage collection may also get access to the interpreter's code and run. This leads to a problem: for the same value 100, thread 1 may be executing x = 100 at the very moment garbage collection is reclaiming 100. There is no clever way around this problem other than locking, and that lock is the GIL: it ensures that the Python interpreter executes only one task's code at any given moment.

The GIL and Lock

The GIL protects interpreter-level data. To protect your own data, you need to add your own locks (the original post illustrates this with a figure).
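
A minimal sketch of user-level locking, with hypothetical names (counter, counter_lock, add_many, run): the GIL does not make counter += 1 atomic, so the read-modify-write is wrapped in a Lock.

```python
from threading import Thread, Lock

counter = 0
counter_lock = Lock()  # user-level lock protecting `counter`

def add_many(n):
    global counter
    for _ in range(n):
        with counter_lock:  # acquire and release around the read-modify-write
            counter += 1

def run(num_threads=4, n=10000):
    global counter
    counter = 0
    threads = [Thread(target=add_many, args=(n,)) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == '__main__':
    print(run())
```

With the lock in place the result is always num_threads * n. Without it, updates can be lost, depending on the interpreter version and how threads are scheduled.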

The GIL and multithreading

Because of the GIL, only one thread in a process executes at any given moment.

Hearing this, some students immediately ask: processes can use multiple cores but have high overhead, while Python threads have low overhead but cannot use multiple cores. Doesn't that make Python useless? Not at all!

To answer this question, we need to agree on a few points:

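#1. Is the CPU used for computation or for I/O?

#2. Multiple CPUs mean that multiple cores can compute in parallel, so multiple cores improve computing performance.

#3. Once a CPU hits blocking I/O it still has to wait, so multiple cores do little for I/O operations.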

My personal understanding: on a multi-core machine running an I/O-intensive program with multiple processes, the processes run in parallel (depending on the number of CPU cores, one process per core), and each process has its own copy of the Python interpreter code. The threads within one process then compete for that interpreter's lock (the GIL), so threads within the same process run concurrently. When a thread hits blocking I/O, it gives up the CPU (the state is saved and a switch occurs). The I/O blocking time itself is fixed. When many processes are running, a process that hits I/O or uses up its time slice must release the CPU and enter the next-level queue (of a multilevel feedback queue) in the ready state, waiting to be scheduled again. Which core it runs on when it is rescheduled is not fixed.

A CPU corresponds to a worker. Computation corresponds to the worker working, and blocking I/O corresponds to the process of supplying the raw materials the worker needs. If the raw materials run out while the worker is working, the worker must stop and wait until they arrive.

If most of your factory's work is preparing raw materials (I/O-intensive), then having more workers does not help much. You are better off with one worker who does other work while waiting for materials.

Conversely, if your factory has all its raw materials on hand, then the more workers you have, the higher the efficiency.

In conclusion:

  For computation, more CPUs are better; for I/O, more CPUs are of little use.

  Of course, a running program will get faster as CPUs are added (however small the gain, there is always some), because no program is purely computation or purely I/O. So we can only judge whether a program is relatively compute-intensive or I/O-intensive, and from there analyze whether Python multithreading is really useless.

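# Analysis:
We have four tasks to process, and we want them handled concurrently. The options are:
Option 1: start four processes
Option 2: start four threads in one process

# Single-core case:
  If the four tasks are compute-intensive, there are no extra cores for parallel computation, so Option 1 only adds the cost of creating processes. Option 2 wins.
  If the four tasks are I/O-intensive, Option 1 pays a high cost for creating processes, and switching between processes is far slower than switching between threads. Option 2 wins.

# Multi-core case:
  If the four tasks are compute-intensive, multiple cores mean parallel computation, but in Python only one thread per process executes at a time, so threads cannot use multiple cores. Option 1 wins.
  If the four tasks are I/O-intensive, no number of cores solves the I/O problem. Option 2 wins.

# Conclusion: nearly all computers today are multi-core. For compute-intensive tasks, Python multithreading brings little performance gain and can even be slower than serial execution (which avoids heavy switching). For I/O-intensive tasks, however, it improves efficiency significantly.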

Multithreading vs. multiprocessing performance (I/O-intensive and compute-intensive)

The following examples were run on a multi-core CPU:

# For compute-intensive work, multiple processes are more efficient

from multiprocessing import Process
from threading import Thread
import os,time
def work():
    res=0
    for i in range(100000000):
        res*=i
    print(res)

if __name__ == '__main__':
    l=[]
    print(os.cpu_count())  # this machine has 4 cores
    start=time.time()
    for i in range(4):
        # p=Process(target=work)  # took 15.843906164169312 s
        p=Thread(target=work)  # took 26.057490348815918 s
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop=time.time()
    print('run time is %s' %(stop-start))

# For I/O-intensive work, multiple threads are more efficient

from multiprocessing import Process
from threading import Thread
import os,time
def work():
    time.sleep(2)
    print('===>')

if __name__ == '__main__':
    l=[]
    print(os.cpu_count())  # this machine has 4 cores
    start=time.time()
    for i in range(50):
        # p=Process(target=work)  # took 12.597720623016357 s, most of it spent creating processes
        p=Thread(target=work)  # took 2.013115167617798 s
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop=time.time()
    print('run time is %s' %(stop-start))

Applications:

Multithreading suits I/O-intensive work, such as sockets, web crawlers, and web applications.
Multiprocessing suits compute-intensive work, such as financial analysis.
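
As a hedged sketch of the first guideline, the standard library's ThreadPoolExecutor can run I/O-bound tasks in threads. Here time.sleep stands in for a blocking socket or HTTP call, and fake_io and run_io_tasks are hypothetical names introduced only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(i):
    time.sleep(0.2)  # stands in for a blocking I/O call
    return i * 2

def run_io_tasks(n=8):
    start = time.perf_counter()
    # One worker per task, so all the waits overlap.
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(fake_io, range(n)))
    return results, time.perf_counter() - start

if __name__ == '__main__':
    results, elapsed = run_io_tasks()
    print(results, 'run time is %.2f' % elapsed)
```

Because the threads spend their time blocked rather than executing bytecode, the total time stays close to that of a single task instead of growing with the number of tasks.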

Deadlock and recursive locks (resolving deadlock)

A deadlock is a situation in which two or more processes or threads, while executing, end up waiting on each other because they compete for resources, and none of them can make progress without outside intervention. The system is then said to be in a deadlocked state, and the processes that wait on each other forever are called deadlocked processes. An example of deadlock follows:

# Start 100 threads as an example
from threading import Thread,Lock,RLock
import time

mutexA=Lock()
mutexB=Lock()
# mutexB=mutexA=RLock()  # recursive lock: use this line instead to avoid the deadlock


class Mythead(Thread):
    def run(self):
        self.f1()
        self.f2()

    def f1(self):
        mutexA.acquire()
        print('%s got lock A' %self.name)
        mutexB.acquire()
        print('%s got lock B' %self.name)
        mutexB.release()
        mutexA.release()

    def f2(self):
        mutexB.acquire()
        print('%s got lock B' %self.name)
        time.sleep(2)
        mutexA.acquire()
        print('%s got lock A' %self.name)
        mutexA.release()
        mutexB.release()

if __name__ == '__main__':
    for i in range(100):
        t=Mythead()
        t.start()

'''
Thread-1 got lock A
Thread-1 got lock B
Thread-1 got lock B
Thread-2 got lock A
Then the program hangs: deadlock
'''

The solution is a recursive lock. To let one thread request the same resource multiple times, Python provides the reentrant lock RLock.

Internally, an RLock maintains a Lock and a counter variable. The counter records the number of acquire calls, so the resource can be acquired multiple times. Only after the owning thread has released every acquire can other threads grab the lock. If the example above uses RLock instead of Lock, the deadlock does not occur:

mutexA=mutexB=RLock()  # when a thread acquires the lock, the counter becomes 1; each further acquire in the same thread increments it. All other threads must wait until that thread releases every acquire, i.e. until the counter drops back to 0
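
The reentrancy described above can be seen in isolation with a minimal sketch. The names outer, inner, log, and run are hypothetical: the same thread acquires one RLock twice, which would block forever with a plain Lock.

```python
from threading import RLock, Thread

lock = RLock()
log = []

def inner():
    with lock:  # second acquire by the same thread: counter goes 1 -> 2
        log.append('inner')

def outer():
    with lock:  # first acquire: counter goes 0 -> 1
        log.append('outer')
        inner()  # re-enters the lock; a plain Lock would deadlock here

def run():
    log.clear()
    t = Thread(target=outer)
    t.start()
    t.join(timeout=5)
    # is_alive() is False once the thread has finished without hanging
    return list(log), t.is_alive()

if __name__ == '__main__':
    print(run())
```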

Semaphore

A mutex is like several people (processes or threads) competing for a single toilet (one lock): only one person can use it (execute the code) at a time. A semaphore is like a public restroom with several stalls: it lets several people in at once, so several threads or processes can hold the lock and execute the code at the same time.

from threading import Thread,Semaphore
import time,random
sm=Semaphore(5)  # 5 stalls: at most 5 threads run at once; as threads finish, waiting threads take the freed slots

def task(name):
    sm.acquire()
    print('%s is using the restroom' %name)
    time.sleep(random.randint(1,3))
    sm.release()

if __name__ == '__main__':
    for i in range(20):
        t=Thread(target=task,args=('passerby %s' %i,))
        t.start()
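
To check that a semaphore really caps concurrency, the following sketch records how many threads are inside the guarded section at once. The function measure_peak and its bookkeeping dictionary are hypothetical additions for illustration.

```python
import time
from threading import Thread, Semaphore, Lock

def measure_peak(limit=3, workers=10):
    sm = Semaphore(limit)
    state = {'active': 0, 'peak': 0}
    state_lock = Lock()  # protects the bookkeeping dictionary

    def task():
        with sm:  # at most `limit` threads pass this point at a time
            with state_lock:
                state['active'] += 1
                state['peak'] = max(state['peak'], state['active'])
            time.sleep(0.05)
            with state_lock:
                state['active'] -= 1

    threads = [Thread(target=task) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state['peak']

if __name__ == '__main__':
    print('peak concurrency:', measure_peak())
```

Using the semaphore as a context manager (with sm:) releases it even if the task raises, which the explicit acquire/release pair does not guarantee.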

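Event

event.is_set(): returns the state of the event (isSet() is the older, deprecated spelling);

event.wait(): blocks the calling thread if event.is_set() is False;

event.set(): sets the event's state to True; all threads blocked on the event are woken and enter the ready state, waiting for the operating system to schedule them;

event.clear(): resets the event's state to False.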

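from threading import Thread,Event
import time

event=Event()

def light():
    print('The red light is on')
    time.sleep(3)
    event.set()  # the light turns green

def car(name):
    print('Car %s is waiting for the green light' %name)
    event.wait()  # wait for the light to turn green
    print('Car %s passes' %name)

if __name__ == '__main__':
    # traffic light
    t1=Thread(target=light)
    t1.start()
    # cars
    for i in range(10):
        t=Thread(target=car,args=(i,))
        t.start()

'''
The red light is on
Car 0 is waiting for the green light
Car 1 is waiting for the green light
Car 2 is waiting for the green light
Car 3 is waiting for the green light
Car 4 is waiting for the green light
Car 5 is waiting for the green light
Car 6 is waiting for the green light
Car 7 is waiting for the green light
Car 8 is waiting for the green light
Car 9 is waiting for the green light
Car 1 passes
Car 2 passes
Car 4 passes
Car 5 passes
Car 7 passes
Car 8 passes
Car 0 passes
Car 3 passes
Car 9 passes
Car 6 passes
'''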
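
One detail the traffic-light example does not show: wait() accepts an optional timeout and returns a boolean. A small sketch with hypothetical names (wait_for_green, demo), using threading.Timer to set the event after a delay:

```python
from threading import Event, Timer

def wait_for_green(event, timeout):
    # wait() returns True if the event was set,
    # False if the timeout expired first.
    return event.wait(timeout)

def demo():
    event = Event()
    first = wait_for_green(event, 0.1)   # nobody sets the event yet
    Timer(0.1, event.set).start()        # turn the light green after 0.1 s
    second = wait_for_green(event, 5)    # returns as soon as set() is called
    event.clear()                        # back to red
    return first, second, event.is_set()

if __name__ == '__main__':
    print(demo())
```

A timeout keeps a waiting thread from blocking forever if the thread that was supposed to call set() fails.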

Three kinds of thread queues

Distributed systems exchange information through a shared queue in the middle. Whether the communication is between processes or between threads, any communication over a network is built on network-based queues.

queue.Queue()          # first in, first out (FIFO)
queue.LifoQueue()      # last in, first out (LIFO), i.e. a stack
queue.PriorityQueue()  # ordered by priority

import queue

# queue.Queue(): first in, first out
q=queue.Queue(3)
q.put(1)
q.put(2)
q.put(3)
print(q.get())
print(q.get())
print(q.get())
'''
1
2
3
'''

import queue

# queue.LifoQueue(): last in, first out (a stack)
q=queue.LifoQueue(3)
q.put(1)
q.put(2)
q.put(3)
print(q.get())
print(q.get())
print(q.get())
'''
3
2
1
'''

import queue

# Priority queue: priorities are numbers, and a smaller number means a higher priority
q=queue.PriorityQueue(3)
q.put((10,'a'))
q.put((-1,'b'))
q.put((100,'c'))
print(q.get())
print(q.get())
print(q.get())

'''
(-1, 'b')
(10, 'a')
(100, 'c')
'''
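
The examples above use the queues from a single thread. A minimal producer/consumer sketch shows how a queue.Queue passes data between threads. The names run_pipeline and consumer, and the use of None as a stop marker, are assumptions made for illustration.

```python
import queue
from threading import Thread

def run_pipeline(items):
    q = queue.Queue(maxsize=3)  # put() blocks while 3 items are pending
    results = []

    def consumer():
        while True:
            item = q.get()      # blocks until an item is available
            if item is None:    # sentinel: the producer is done
                break
            results.append(item * item)

    t = Thread(target=consumer)
    t.start()
    for item in items:          # the calling thread acts as the producer
        q.put(item)
    q.put(None)
    t.join()
    return results

if __name__ == '__main__':
    print(run_pipeline([1, 2, 3, 4, 5]))
```

The queue does its own locking, so neither side needs an explicit Lock around put() or get().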

Origin www.cnblogs.com/zhangchaocoming/p/11735912.html