Supplementary notes on threads

GIL (Global Interpreter Lock)

What is the GIL?

The GIL is essentially a mutex: it turns concurrent execution into serial execution, trading efficiency for data safety.

Every process has at least one thread that executes code, and the interpreter must also perform garbage collection. To prevent the threads that execute code and the garbage collector from operating on the same data at the same time and corrupting it, a lock is needed that guarantees only one thread executes at any moment. This lock is called the Global Interpreter Lock (GIL).

Once a thread has acquired the GIL, it can use the Python interpreter to run its own code.

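The GIL belongs to the CPython interpreter, not to the Python language itself (Jython and IronPython, for example, have no GIL). P.S. There are many Python interpreters: CPython, Jython, IronPython (implemented in C#), PyPy (implemented in Python) ... The most common is CPython.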

Why is the GIL needed?

Because the garbage collection mechanism of the CPython interpreter is not thread-safe.

The GIL therefore prevents multiple threads in the same process from executing at the same time (threads within one process cannot run in parallel, but they can run concurrently).

If Python threads cannot take advantage of multiple cores, what is multithreading good for?

Whether multithreading is useful in Python depends on the situation:

Compute-intensive case: a lot of computation, consuming CPU resources

Single-core: use threads, which cost fewer resources

Multi-core: use processes, which can take advantage of multiple cores

I/O-intensive case: little CPU consumption; a task spends most of its time waiting for I/O operations to complete

Single-core: use threads, which cost fewer resources

Multi-core: use threads, which cost fewer resources

Depending on the case, multiprocessing and multithreading should be used in combination.

# Compute-intensive
from multiprocessing import Process
from threading import Thread
import os,time
def work():
    res=0
    for i in range(100000000):
        res*=i


if __name__ == '__main__':
    l=[]
    print(os.cpu_count())  # this machine has 6 cores
    start=time.time()
    for i in range(6):
        # p=Process(target=work) # elapsed: 4.732933044433594
        p=Thread(target=work) # elapsed: 22.83087730407715
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop=time.time()
    print('run time is %s' %(stop-start))

# I/O-intensive
from multiprocessing import Process
from threading import Thread
import threading
import os,time
def work():
    time.sleep(2)


if __name__ == '__main__':
    l=[]
    print(os.cpu_count()) # this machine has 6 cores
    start=time.time()
    for i in range(4000):
        p=Process(target=work) # elapsed: a little over 9.001083612442017s, most of it spent creating processes
        # p=Thread(target=work) # elapsed: a little over 2.051966667175293s
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop=time.time()
    print('run time is %s' %(stop-start))
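
The I/O-intensive case can also be sketched with a thread pool from the standard library. This is only an illustration, not part of the original benchmark: the names io_task and run_in_threads are made up here, and time.sleep stands in for a real I/O wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def io_task(delay):
    """Simulate an I/O wait (hypothetical stand-in for a real network or disk call)."""
    time.sleep(delay)  # a sleeping thread does not hold the GIL
    return delay


def run_in_threads(n_tasks=6, delay=0.2):
    """Run n_tasks simulated waits in a thread pool and return (results, elapsed seconds)."""
    start = time.time()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(io_task, [delay] * n_tasks))
    return results, time.time() - start


results, elapsed = run_in_threads()
print('finished %d tasks in %.2fs' % (len(results), elapsed))
```

Because the waits overlap, the total time should stay close to a single delay rather than the sum of all delays, which is why threads are the cheaper choice for this kind of work.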

Synchronization locks and mutexes

Synchronization: visitors must access a resource in a prescribed order.

Mutual exclusion: only one visitor may access a given resource at any one time.

Mutual exclusion is a special form of synchronization, while synchronization is a more complex form of mutual exclusion.

Mutex characteristic: only one thread can hold the lock at a time; all other threads can only wait.
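
A minimal sketch of that characteristic, assuming a shared counter that several threads increment. The names counter, counter_lock, add_many and run are hypothetical. It uses the with statement, which acquires the lock on entry and releases it on exit, even if the body raises.

```python
from threading import Thread, Lock

counter = 0
counter_lock = Lock()


def add_many(times):
    """Increment the shared counter; only the update itself is locked."""
    global counter
    for _ in range(times):
        with counter_lock:  # other threads wait here until the holder releases
            counter += 1


def run(n_threads=8, times=10000):
    """Start n_threads workers, wait for all of them, return the final counter."""
    global counter
    counter = 0
    threads = [Thread(target=add_many, args=(times,)) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run())  # 80000
```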

join vs. mutex

join waits for the whole task, so everything runs serially; this is equivalent to locking all of the code.

A lock is applied only to the part that modifies shared data, so only that part of the code runs serially.

The fundamental way to achieve data safety is to turn concurrent execution into serial execution. Both join and a lock can do that, but the lock is clearly more efficient.

# Without a lock: runs concurrently, fast, but the data is unsafe
from threading import current_thread,Thread,Lock
import os,time
def task():
    global n
    print('%s is running' %current_thread().name)
    temp=n
    time.sleep(0.5)
    n=temp-1


if __name__ == '__main__':
    n=100
    lock=Lock()
    threads=[]
    start_time=time.time()
    for i in range(100):
        t=Thread(target=task)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

    stop_time=time.time()
    print('main:%s n:%s' %(stop_time-start_time,n))

'''
Thread-1 is running
Thread-2 is running
......
Thread-100 is running
main:0.5216062068939209 n:99
'''


# With a lock: the unlocked part runs concurrently, the locked part runs serially; slower, but the data is safe
from threading import current_thread,Thread,Lock
import os,time
def task():
    # Code outside the lock runs concurrently
    time.sleep(3)
    print('%s start to run' %current_thread().name)
    global n
    # Code inside the lock runs serially
    lock.acquire()
    temp=n
    time.sleep(0.5)
    n=temp-1
    lock.release()

if __name__ == '__main__':
    n=100
    lock=Lock()
    threads=[]
    start_time=time.time()
    for i in range(100):
        t=Thread(target=task)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    stop_time=time.time()
    print('main:%s n:%s' %(stop_time-start_time,n))

'''
Thread-1 start to run
Thread-2 start to run
......
Thread-100 start to run
main:53.294203758239746 n:0
'''

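# Some readers may wonder: since locking makes execution serial, couldn't I just call join right after start and skip the lock? That is serial too.
# True: calling join right after start does make the 100 tasks run serially, and the final n is certainly 0, which is safe. The problem is that
# with join right after start, all of the code in each task runs serially, whereas with a lock only the locked part, i.e. the part that modifies shared data, runs serially.
# In terms of data safety alone both approaches work, but the lock is clearly more efficient.
from threading import current_thread,Thread,Lock
import os,time
def task():
    time.sleep(3)
    print('%s start to run' %current_thread().name)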
    global n
    temp=n
    time.sleep(0.5)
    n=temp-1


if __name__ == '__main__':
    n=100
    lock=Lock()
    start_time=time.time()
    for i in range(100):
        t=Thread(target=task)
        t.start()
        t.join()
    stop_time=time.time()
    print('main:%s n:%s' %(stop_time-start_time,n))

'''
Thread-1 start to run
Thread-2 start to run
......
Thread-100 start to run
main:350.6937336921692 n:0 # a dreadfully long run time
'''

GIL vs. mutex

A question: since Python's GIL already guarantees that only one thread runs at a time, why do we still need a mutex?

First, the purpose of a lock is to protect shared data, so that only one thread can modify that data at any one time.

Therefore, different shared data needs different locks to protect it.

Conclusion: the GIL protects interpreter-level data (garbage collection), while a mutex protects the user's own shared data.

from threading import Thread,Lock
import os,time
def work():
    global n
    lock.acquire()
    temp=n
    time.sleep(0.1)
    n=temp-1
    lock.release()
if __name__ == '__main__':
    lock=Lock()
    n=100
    l=[]
    for i in range(100):
        p=Thread(target=work)
        l.append(p)
        p.start()
    for p in l:
        p.join()

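    print(n) # the result is certainly 0: concurrent execution became serial, trading efficiency for data safety

'''
Analysis:
1. 100 threads compete for the GIL, i.e. for permission to execute
2. One thread is bound to get the GIL first (call it thread 1); it starts executing and acquires the mutex with lock.acquire()
3. Very likely, before thread 1 has finished, another thread 2 gets the GIL and starts running, but thread 2 finds that the mutex lock has not yet been released by thread 1, so it blocks and is forced to give up permission to execute, i.e. release the GIL
4. Eventually thread 1 gets the GIL again, resumes from where it was paused, and runs until it releases the mutex lock normally; the other threads then repeat steps 2, 3 and 4
'''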

Deadlocks and recursive locks

What is a deadlock?

A deadlock is a situation in which two or more processes or threads wait for each other indefinitely because they are competing for resources.

from threading import Thread, Lock
import time

mutexA = Lock()
mutexB = Lock()

class MyThread(Thread):
    def run(self):
        self.fun1()
        self.fun2()

    def fun1(self):
        mutexA.acquire()
        print('{} acquired lock A'.format(self.name))

        mutexB.acquire()
        print('{} acquired lock B'.format(self.name))

        mutexB.release()
        mutexA.release()

    def fun2(self):
        mutexB.acquire()
        print('{} acquired lock B'.format(self.name))

        time.sleep(2)
        mutexA.acquire()
        print('{} acquired lock A'.format(self.name))
        mutexA.release()
        mutexB.release()


if __name__ == '__main__':
    for i in range(10):
        t = MyThread()
        t.start()

To resolve this kind of deadlock, Python provides the recursive lock (RLock).

An RLock internally maintains a Lock and a counter variable. The counter records the number of acquire calls, so the same thread can acquire the resource multiple times. Other threads can obtain the resource only after that thread has released every one of its acquires. If the example above uses RLock instead of Lock, the deadlock does not occur:

import threading

# When a thread acquires the lock, the counter goes up by 1; each further acquire inside that thread adds 1 more.
# All other threads must wait until that thread has released every acquire, i.e. until the counter is back to 0.
mutexA=mutexB=threading.RLock()
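
The difference between the two lock types can be checked directly. The sketch below is illustrative only and the helper name can_reacquire is hypothetical. It tries a second acquire from the thread that already holds the lock, using blocking=False so that a plain Lock reports failure instead of hanging.

```python
from threading import Lock, RLock


def can_reacquire(lock):
    """Return True if the thread already holding `lock` can acquire it again."""
    lock.acquire()
    try:
        # Non-blocking attempt: returns False immediately if the lock is unavailable.
        got_it = lock.acquire(blocking=False)
        if got_it:
            lock.release()  # undo the second acquire
        return got_it
    finally:
        lock.release()  # undo the first acquire


print(can_reacquire(Lock()))   # False
print(can_reacquire(RLock()))  # True
```

With a blocking acquire, the Lock case is exactly the self-deadlock that RLock avoids.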

Semaphore

A semaphore also enforces mutual exclusion, but lets a fixed number of threads in at once; a mutex is the special case that lets in only one.

from threading import Thread,Semaphore
import threading
import time


# def func():
#     if sm.acquire():
#         print (threading.currentThread().getName() + ' get semaphore')
#         time.sleep(2)
#         sm.release()
def func():
    sm.acquire()
    print('%s get sm' %threading.current_thread().name)
    time.sleep(3)
    sm.release()


if __name__ == '__main__':
    sm = Semaphore(5)
    for i in range(23):
        t = Thread(target=func)
        t.start()

The difference between a mutex and a semaphore can be explained with a toilet metaphor.

A mutex is a toilet with only one stall: anyone else who wants to go in has to wait outside.

A semaphore is a toilet with several stalls: once every stall is occupied, anyone else has to queue outside.
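
To make the metaphor concrete, the following sketch records how many threads are "inside" at once. It is an illustration under assumptions: measure_peak, use_stall and the state dictionary are hypothetical names, and a separate Lock protects the bookkeeping.

```python
import time
from threading import Thread, Semaphore, Lock


def measure_peak(n_threads=10, limit=3):
    """Let n_threads share a Semaphore(limit); return the highest number seen inside at once."""
    sm = Semaphore(limit)
    state_lock = Lock()  # protects the bookkeeping below
    state = {'inside': 0, 'peak': 0}

    def use_stall():
        with sm:  # blocks while `limit` threads are already inside
            with state_lock:
                state['inside'] += 1
                state['peak'] = max(state['peak'], state['inside'])
            time.sleep(0.05)  # occupy the stall for a moment
            with state_lock:
                state['inside'] -= 1

    threads = [Thread(target=use_stall) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state['peak']


print(measure_peak())  # never more than 3
```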

Event

A key feature of threads is that each thread runs independently and its state is unpredictable.

What problem does Event solve? The case where a thread needs to decide its next step based on the state of another thread.

from threading import Thread, Event
import time

e = Event()

def light():
    print('red light on')
    time.sleep(3)
    e.set() # send the signal
    print('green light on')


def car(name):
    print('{} is waiting at the red light'.format(name))
    e.wait() # wait for the signal
    print('go go go')

t = Thread(target=light)
t.start()

for i in range(10):
    t = Thread(target=car, args=('car{}'.format(i),))
    t.start()
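
Event.wait also accepts a timeout and returns whether the signal arrived, which lets a waiting thread give up instead of blocking forever. The sketch below is a hedged illustration; wait_for_green and demo are hypothetical names, not part of the example above.

```python
from threading import Thread, Event


def wait_for_green(green, timeout, log):
    """Wait up to `timeout` seconds for the event, then record what happened."""
    if green.wait(timeout):  # True if the event was set, False on timeout
        log.append('go')
    else:
        log.append('gave up')


def demo():
    log = []
    green = Event()

    # Nobody sets the event, so this waiter times out.
    impatient = Thread(target=wait_for_green, args=(green, 0.05, log))
    impatient.start()
    impatient.join()

    # The event is set while this waiter is waiting (or just before), so it proceeds.
    patient = Thread(target=wait_for_green, args=(green, 5, log))
    patient.start()
    green.set()
    patient.join()

    green.clear()  # reset the event so it can be reused
    return log, green.is_set()


print(demo())  # (['gave up', 'go'], False)
```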

Timer

Runs a specified operation after n seconds.

from threading import Timer


def hello():
    print("hello, world")


t = Timer(1, hello)
t.start()  # after 1 second, "hello, world" will be printed

# Example: a verification code that is regenerated every 5 seconds
from threading import Timer
import random,time

class Code:
    def __init__(self):
        self.make_cache()

    def make_cache(self,interval=5):
        self.cache=self.make_code()
        print(self.cache)
        self.t=Timer(interval,self.make_cache)
        self.t.start()

    def make_code(self,n=4):
        res=''
        for i in range(n):
            s1=str(random.randint(0,9))
            s2=chr(random.randint(65,90))
            res+=random.choice([s1,s2])
        return res

    def check(self):
        while True:
            inp=input('>>: ').strip()
            if inp.upper() ==  self.cache:
                print('verification succeeded',end='\n')
                self.t.cancel()
                break


if __name__ == '__main__':
    obj=Code()
    obj.check()
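
The cancel call used in check only has an effect while the timer is still waiting. A small sketch of that behavior follows; the function name demo and the list fired are hypothetical.

```python
from threading import Timer


def demo():
    """Start two timers, cancel one before it fires, and report which callbacks ran."""
    fired = []
    kept = Timer(0.05, fired.append, args=('kept',))
    cancelled = Timer(0.2, fired.append, args=('cancelled',))
    kept.start()
    cancelled.start()
    cancelled.cancel()  # stops the timer because it has not fired yet
    kept.join()         # Timer is a Thread subclass, so join waits for it to finish
    cancelled.join()
    return fired


print(demo())  # ['kept']
```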

Thread queues

class queue.Queue(maxsize=0) # first in, first out

import queue

q=queue.Queue()
q.put('first')
q.put('second')
q.put('third')

print(q.get())
print(q.get())
print(q.get())
'''
Result (first in, first out):
first
second
third
'''

class queue.LifoQueue(maxsize=0) # last in, first out

import queue

q=queue.LifoQueue()
q.put('first')
q.put('second')
q.put('third')

print(q.get())
print(q.get())
print(q.get())
'''
Result (last in, first out):
third
second
first
'''

class queue.PriorityQueue(maxsize=0) # a queue whose stored items can be given a priority


import queue

q=queue.PriorityQueue()
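# put takes a tuple whose first element is the priority (usually a number, though other values that can be compared with each other also work); the smaller the number, the higher the priority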
q.put((20,'a'))
q.put((10,'b'))
q.put((30,'c'))

print(q.get())
print(q.get())
print(q.get())
'''
Result (the smaller the number, the higher the priority; higher-priority items leave the queue first):
(10, 'b')
(20, 'a')
(30, 'c')
'''
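
The examples above use the queues from a single thread. As a hedged sketch of how they are typically shared between threads, the following passes work to several worker threads through a queue.Queue. The names STOP, worker and square_all are hypothetical, and using None as a stop marker is just one common convention.

```python
import queue
from threading import Thread

STOP = None  # hypothetical sentinel telling a worker to exit


def worker(tasks, results):
    """Take numbers from `tasks` until STOP arrives; put their squares on `results`."""
    while True:
        item = tasks.get()  # blocks until an item is available
        if item is STOP:
            tasks.task_done()
            break
        results.put(item * item)
        tasks.task_done()


def square_all(numbers, n_workers=3):
    """Square the given numbers using n_workers threads; return the squares sorted."""
    tasks = queue.Queue()
    results = queue.Queue()
    workers = [Thread(target=worker, args=(tasks, results)) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(STOP)  # one sentinel per worker
    tasks.join()  # returns once every item has been marked task_done
    for w in workers:
        w.join()
    return sorted(results.get() for _ in range(len(numbers)))


print(square_all([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

The queue does its own locking, so the workers need no explicit mutex around get and put.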

Origin www.cnblogs.com/KbMan/p/11354538.html