Thread queues, the Event object, and coroutines
Thread queue
FIFO Example:
import queue  # no need to go through the threading module; queue is part of the standard library
# usage is basically the same as the Queue in multiprocessing

q = queue.Queue()
q.put('first')
q.put('second')
q.put('third')
# q.put_nowait()  # raises queue.Full if the queue is full; handle it with try/except
print(q.get())
print(q.get())
print(q.get())
# q.get_nowait()  # raises queue.Empty if there is no data; handle it with try/except
'''
Result (first in, first out):
first
second
third
'''
LIFO (stack) example:
import queue

q = queue.LifoQueue()  # a queue that behaves like a stack: last in, first out
q.put('first')
q.put('second')
q.put('third')
# q.put_nowait()
print(q.get())
print(q.get())
print(q.get())
# q.get_nowait()
'''
Result (last in, first out):
third
second
first
'''
Priority queue example:
import queue

q = queue.PriorityQueue()
# put takes a tuple; the first element is the priority (usually a number, though
# any mutually comparable values work). The smaller the number, the higher the priority.
q.put((-10, 'a'))
q.put((-5, 'a'))   # negative numbers are fine
# q.put((20, 'ws'))  # if two items have the same priority, they are ordered by the second
#                    # element; strings compare character by character by ASCII value
# q.put((20, 'wd'))
# q.put((20, {'a': 11}))  # TypeError: unorderable types: dict() < dict() -- dicts cannot be compared
# q.put((20, ('w', 1)))   # items with equal priority must have second elements of the same
#                         # comparable type; tuples work, compared element by element
q.put((20, 'b'))
q.put((20, 'a'))
q.put((0, 'b'))
q.put((30, 'c'))
print(q.get())
print(q.get())
print(q.get())
print(q.get())
print(q.get())
print(q.get())
'''
Result (smaller number = higher priority; higher priority dequeues first):
(-10, 'a')
(-5, 'a')
(0, 'b')
(20, 'a')
(20, 'b')
(30, 'c')
'''
Summary: all three of these queue classes are thread-safe; multiple threads will not contend for the same resource or receive the same item.
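To illustrate the thread-safety claim, here is a minimal sketch of my own (the `worker` function and the item count are illustrative, not from the original text) in which four threads drain one shared Queue; every item is handed to exactly one thread:

```python
import queue
import threading

q = queue.Queue()
for i in range(100):
    q.put(i)

taken = []                    # collected results
lock = threading.Lock()       # protects the plain list; the queue needs no lock

def worker():
    while True:
        try:
            item = q.get_nowait()   # thread-safe: no item is handed out twice
        except queue.Empty:
            return
        with lock:
            taken.append(item)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(taken))                            # 100: every item was consumed
print(sorted(taken) == list(range(100)))     # True: each exactly once
```

If the queue were not thread-safe, two workers could receive the same item and the final count would not come out exact.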
The Event object
- Description: start two threads; when one thread reaches a certain stage, it triggers the other thread to execute. This couples the two threads together.
Event methods:
event.is_set(): returns the current state of the event (isSet() is the legacy spelling)
event.wait(): blocks the calling thread while event.is_set() == False
event.set(): sets the event state to True; all threads in the blocked pool are woken into the ready state and wait for the OS to schedule them
event.clear(): resets the event state to False
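A minimal single-threaded sketch of my own showing the state transitions of the four methods (note that `event.wait(timeout)` returns the event's state, so it can double as a check):

```python
from threading import Event

event = Event()
print(event.is_set())    # False: a fresh Event starts unset
print(event.wait(0.1))   # wait() returns the state after blocking; False on timeout
event.set()
print(event.is_set())    # True: set() would wake all waiting threads
print(event.wait())      # returns True immediately once the event is set
event.clear()
print(event.is_set())    # False again after clear()
```

Output: `False False True True False`, one per line.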
Without Event: one thread checks whether the server is up while another thread tries to connect to it
from threading import Thread
from threading import current_thread
import time

flag = False  # a global variable representing the current state

def check():
    print(f'{current_thread().name} checking whether the server is up')
    time.sleep(3)  # sleep three seconds
    global flag    # use global to modify the module-level variable
    flag = True    # flip flag to True
    print('server is up')

def connect():
    while 1:
        print(f'{current_thread().name} waiting to connect')
        time.sleep(0.5)
        if flag:
            print(f'{current_thread().name} connected')
            break

t1 = Thread(target=check)
t2 = Thread(target=connect)
t1.start()
t2.start()
# Result:
# Thread-1 checking whether the server is up
# Thread-2 waiting to connect
# (repeated six times)
# server is up
# Thread-2 connected
Using an Event:
from threading import Thread
from threading import current_thread
from threading import Event
import time

event = Event()  # instantiate an Event; it starts as False

def check():
    print(f'{current_thread().name} checking the server connection')
    time.sleep(3)  # rest three seconds
    event.set()    # set the event to True
    print(f'{current_thread().name} server is up')

def connect():
    print(f'{current_thread().name} waiting to connect')
    event.wait()   # block until the event becomes True
    print(f'{current_thread().name} connected')

t1 = Thread(target=check)
t2 = Thread(target=connect)
t1.start()
t2.start()
One thread checks whether the server has started; another thread waits for that signal and reports a successful connection. It tries at most three times, once per second; if it still has not connected, it reports failure:
from threading import Thread
from threading import current_thread
from threading import Event
import time

event = Event()

def check():
    print(f'{current_thread().name} checking the server connection')
    time.sleep(4)
    event.set()
    print(f'{current_thread().name} server is up')

def connect():
    count = 1
    print(f'{current_thread().name} waiting for the server')
    while count <= 3:
        event.wait(1)  # wait at most one second per attempt
        if not event.is_set():
            print(f'{current_thread().name} connection attempt {count}')
            count += 1
        else:
            print(f'{current_thread().name} connected')
            break
    else:
        print(f'{current_thread().name} failed to connect')

t1 = Thread(target=check)
t2 = Thread(target=connect)
t1.start()
t2.start()
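As a side note, `Event.wait(timeout)` itself returns True if the event was set and False on timeout, so the retry loop can be written without a separate `is_set()` check. A variation of mine (the 1.5-second delay and the `results` list are illustrative):

```python
from threading import Thread, Event
import time

event = Event()
results = []  # records what connect() observed

def check():
    time.sleep(1.5)  # pretend the server takes 1.5 seconds to come up
    event.set()

def connect():
    for attempt in range(1, 4):
        if event.wait(1):  # True as soon as set(); False after a 1-second timeout
            results.append(f'connected on attempt {attempt}')
            return
        results.append(f'attempt {attempt} timed out')
    results.append('failed to connect')

t1 = Thread(target=check)
t2 = Thread(target=connect)
t1.start(); t2.start()
t1.join(); t2.join()
print(results)
```

With a 1.5-second startup, the first one-second wait times out and the second succeeds mid-wait, so the loop exits on attempt 2.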
Coroutine
What is a coroutine?
Coroutine: processing tasks concurrently within a single thread.
Serial: one thread executes one task; only when it finishes does the next task start.
Parallel: multiple CPUs execute multiple tasks at the same time, e.g. 4 tasks on 4 CPUs.
Concurrency: one CPU executes multiple tasks in a way that looks simultaneous.
The real core of concurrency: switching plus saving state.
Multithreading: 3 threads handle 10 tasks; if thread 1 blocks while handling a task, the operating system switches the CPU to another thread.
Concurrency within a single thread, e.g. one thread performing three tasks:
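As an illustration (my own sketch, using plain generators to stand in for coroutines), one thread can interleave three tasks by saving each task's state at a `yield` and switching to the next:

```python
log = []  # records the interleaving

def task(name, steps):
    # a "task" that pauses after each step, remembering where it left off
    for i in range(steps):
        log.append(f'{name} step {i}')
        print(log[-1])
        yield

tasks = [task('task1', 3), task('task2', 3), task('task3', 3)]
while tasks:
    for t in tasks[:]:       # round-robin: give each task one step
        try:
            next(t)          # resume the task exactly where it paused
        except StopIteration:
            tasks.remove(t)  # the task finished; drop it
```

The output interleaves the three tasks (`task1 step 0`, `task2 step 0`, `task3 step 0`, `task1 step 1`, ...): switching plus saved state, all in one thread.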
Coroutine definition: a coroutine is a lightweight user-space thread, i.e. a coroutine is scheduled by the user program itself.
Three ways for a single CPU to execute 10 tasks concurrently:
1. Open multiple processes: the operating system performs the switching and state saving.
2. Open multiple threads: the operating system performs the switching and state saving.
3. Open coroutines: the program itself controls the switching between tasks and the state saving.
Of these three approaches, coroutines are preferable, because:
1. Coroutine switching has the smallest overhead: the switch happens entirely at the program level and the operating system is completely unaware of it, so it is the most lightweight.
2. Coroutines therefore run faster.
3. Because all the tasks live inside one program, the coroutines can occupy the CPU for long stretches while working through them.
Coroutine characteristics:
- Concurrency is implemented entirely within a single thread
- Shared data can be modified without locking
- The user program keeps its own stack for each control flow's context (state saving)
- Bonus: when one coroutine hits an IO operation, control automatically switches to another coroutine
Greenlet
greenlet is a third-party Python module; real coroutine modules rely on greenlet to perform the switching.
The two cores of concurrency are switching and saving state. Let's work our way up to the module through the following usage examples:
# Version 1: plain switching
def func1():
    print('in func1')

def func2():
    print('in func2')
    func1()
    print('end')

func2()

# Version 2: switching + saved state (a generator remembers where it left off)
import time

def gen():
    while 1:
        yield 1
        time.sleep(0.5)  # manually inserted IO; a generator cannot switch automatically on IO

def func():
    obj = gen()
    for i in range(10):
        next(obj)

func()

# Version 3: switching + saved state with greenlet (switching is still manual;
# greenlet does not switch automatically on IO -- that is what gevent adds)
from greenlet import greenlet
import time

def eat(name):
    print('%s eat 1' % name)   # 2
    g2.switch('taibai')        # 3
    time.sleep(3)
    print('%s eat 2' % name)   # 6
    g2.switch()                # 7

def play(name):
    print('%s play 1' % name)  # 4
    g1.switch()                # 5
    print('%s play 2' % name)  # 8

g1 = greenlet(eat)
g2 = greenlet(play)
g1.switch('taibai')            # 1: switch to the eat task
The gevent coroutine module
gevent is a third-party library with which synchronous or asynchronous concurrent programming is easy to achieve. The main pattern used in gevent is the greenlet, a form of lightweight coroutine provided to Python as a C extension module. Greenlets all run inside the main OS process, but they are scheduled cooperatively.
Basic usage of the gevent module:
# Usage:
g1 = gevent.spawn(func, 1, 2, 3, x=4, y=5)
# creates a coroutine object g1. The first argument to spawn is a function name,
# e.g. eat; it can be followed by any number of positional or keyword arguments,
# all of which are passed on to that function. spawn submits the task asynchronously.
g2 = gevent.spawn(func2)
g1.join()  # wait for g1 to finish
g2.join()  # wait for g2 to finish
# When testing, some people find that g2 runs even without the second join.
# True -- the coroutine switched to it for you. But if g2's task runs for a long
# time and you do not join it, the program will not wait for the rest of g2's work.
# The two steps can also be combined into one: gevent.joinall([g1, g2])
Using time.sleep to simulate blocking that the program runs into:
import gevent
import time
from threading import current_thread

def eat(name):
    print('%s eat 1' % name)
    print(current_thread().name)
    # gevent.sleep(2)
    time.sleep(2)
    print('%s eat 2' % name)

def play(name):
    print('%s play 1' % name)
    print(current_thread().name)
    # gevent.sleep(1)  # gevent.sleep(1) simulates IO blocking that gevent can recognize
    time.sleep(1)      # time.sleep(1) and other kinds of blocking are not recognized by
                       # gevent directly; the monkey patch shown next makes them recognizable
    print('%s play 2' % name)

g1 = gevent.spawn(eat, 'egon')
g2 = gevent.spawn(play, name='egon')
print(f'main {current_thread().name}')
g1.join()
g2.join()
# Result:
# main MainThread
# egon eat 1
# MainThread
# egon eat 2
# egon play 1
# MainThread
# egon play 2
The final version:
import time
import gevent
from gevent import monkey
monkey.patch_all()  # monkey patch: mark every blocking call below so gevent can switch on it

def eat(name):
    print('%s eat 1' % name)
    time.sleep(2)
    print('%s eat 2' % name)

def play(name):
    print('%s play 1' % name)
    time.sleep(1)
    print('%s play 2' % name)

g1 = gevent.spawn(eat, 'egon')
g2 = gevent.spawn(play, name='egon')
# g1.join()
# g2.join()  # when several coroutines need to be waited on, use the line below instead
gevent.joinall([g1, g2])
# Result:
# egon eat 1
# egon play 1
# egon play 2
# egon eat 2
Load balancing: distributing the load (tasks) evenly across multiple operating units.
Nginx: nginx is a lightweight web server / reverse proxy server and email (IMAP/POP3) proxy server, characterized by low memory usage and strong concurrency.
Summary: in practice we achieve concurrency with processes + threads + coroutines combined, which gives the best result. On a 4-core CPU, a common rule of thumb is 5 processes, each running 20 threads (5x the number of CPUs), with each thread running 500 coroutines. When crawling pages at scale, the time spent waiting on network latency is exactly where coroutines buy us concurrency. Concurrency = 5 * 20 * 500 = 50,000, which is roughly the maximum concurrency for one 4-CPU machine. 50k is also about the maximum load nginx handles when doing load balancing.
In single-threaded code these 20 tasks usually involve both computation and blocking operations. While task 1 is blocked, we can switch over and execute task 2, putting the blocking time to use. That, in short, is why we use the gevent module to improve efficiency.
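The same overlap can be shown with the standard library's asyncio (a rough analog of the gevent final version above, not the original author's code; `asyncio.gather` plays the role of `gevent.joinall`):

```python
import asyncio
import time

log = []  # records the order of events

async def eat(name):
    log.append(f'{name} eat 1'); print(log[-1])
    await asyncio.sleep(2)   # cooperative "blocking": the event loop switches away
    log.append(f'{name} eat 2'); print(log[-1])

async def play(name):
    log.append(f'{name} play 1'); print(log[-1])
    await asyncio.sleep(1)
    log.append(f'{name} play 2'); print(log[-1])

async def main():
    await asyncio.gather(eat('egon'), play('egon'))  # roughly gevent.joinall

start = time.time()
asyncio.run(main())
print(f'elapsed about {time.time() - start:.1f}s')  # ~2s, not 3: the waits overlap
```

The event order matches the patched gevent example (`eat 1`, `play 1`, `play 2`, `eat 2`), and the total run time is about 2 seconds rather than 3, because the blocking time of one task is spent running the other.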