The Million-a-Year Python Road - Concurrent Programming: Coroutines

Coroutine

I. Introduction

The topic of this section is achieving concurrency within a single thread, i.e. concurrency with only one main thread (and therefore only one CPU available). To understand this, we first review the essence of concurrency: switching + saving state.

  While the CPU is running a task, it switches away to run other tasks (the switch is forced by the operating system) in two situations: either the task blocks, or the task has been computing for too long or a higher-priority program preempts it.

  A coroutine is essentially a single thread. Previously, task switching within the thread was controlled by the operating system, which switches automatically whenever I/O is encountered. With coroutines, our goal is to avoid the overhead of OS-level switching (saving and restoring registers, stacks, and so on when switching threads) by having our own program control the task switching.

[Figure: the three states of a task within a thread]

    ps: When introducing process theory we mentioned the three execution states of a process; since the thread is the unit of execution, the three states in the figure can also be understood as the three states of a thread.

Point one: the second kind of switch in the figure does not improve efficiency; it only lets the CPU make all tasks appear to run "simultaneously". If the tasks are pure computation, this switching actually reduces efficiency. We can verify this with yield: yield itself can save a task's running state within a single thread. Let's quickly review:
#1 yield can save state; yield's state saving is much like how the operating system saves thread state, but yield is controlled at the code level and is more lightweight
#2 send can pass one function's result to another function, which lets us switch between routines within a single thread

Task switching + saving state, implemented with yield:

import time

def func1():
    for i in range(11):
        yield
        print('This is my print number %s' % i)
        time.sleep(1)


def func2():
    g = func1()
    next(g)
    for k in range(10):
        print(f'Haha, I have already printed {k} times')
        time.sleep(1)
        next(g)


# Without yield, the two tasks below would run one after the other: everything in func1 finishes before anything in func2 runs
# With yield, we achieve switching between the two tasks + state preservation
func1()
func2()
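
A minimal sketch of point #2 above (send passing one function's result to another); the producer/consumer names are made up for illustration:

def consumer():
    while True:
        item = yield              # receives whatever producer sends
        print('consumed', item)

def producer():
    c = consumer()
    next(c)                       # prime the generator up to its first yield
    for i in range(3):
        print('produced', i)
        c.send(i)                 # switch to consumer and hand it the value i

producer()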

Comparing serial execution with the yield-based "coroutine" for compute-intensive tasks:

# Serial execution
import time
def task1():
    res = 1
    for i in range(1,100000):
        res += i


def task2():
    res = 1
    for i in range(1,100000):
        res -= i

start_time = time.time()
task1()
task2()
print(time.time()-start_time)   # 0.011995553970336914
# yield-based coroutine version
import time

def task():

    res = 1
    for i in range(1,100000):
        res += i
        yield res

def task2():
    g = task()
    res = 1
    for i in range(1,100000):
        res -= i
        next(g)

start = time.time()
task2()
print(time.time() - start)  # 0.0239870548248291

For purely compute-intensive tasks, serial execution is faster than the yield-based switching: the timings above show the serial version finishing in about half the time, because the extra switching only adds overhead.

Point two: the first kind of switch. When one task encounters I/O, switching to a second task lets the second task's computation run during the time the first task spends blocked; that is where the efficiency gain comes from.
import time
def func1():
    while True:
        print('func1')
        yield

def func2():
    g=func1()
    for i in range(10000000):
        next(g)
        time.sleep(3)
        print('func2')
start=time.time()
func2()
stop=time.time()
print(stop-start)


# yield cannot detect I/O blocking; it cannot switch automatically when I/O is encountered.

A coroutine essentially tells the CPython interpreter: you are not that great, no need to put on the GIL show for me; I will schedule the multiple tasks of one thread myself, save the state when switching, and switch far faster than you can, avoiding a lot of overhead. Within a single thread, I/O operations inevitably occur in our programs. If we can, at the level of our own program (the user-program level, not the operating-system level), switch to another task's computation whenever one of the thread's tasks hits blocking I/O, we keep the thread in the ready state as much as possible, i.e. in a state where the CPU can always run it. In effect we hide our own I/O as much as possible at the user-program level, which fools the operating system into thinking: this thread seems to be computing all the time with very little I/O, so it allocates more CPU execution time to our thread.

In essence, a coroutine is a single thread in which the user's own program switches to another task whenever one task blocks on I/O, in order to improve efficiency. To implement this, we need a solution that satisfies both of the following at the same time:

#1. It can control switching among multiple tasks, saving each task's state before the switch so that, when the task runs again, it continues from where it paused.
#2. As a supplement to #1: it can detect I/O operations and switches only when an I/O operation is encountered.

II. Introducing Coroutines

Coroutine: concurrency within a single thread; coroutines are also called micro-threads or fibers (English: Coroutine). In one sentence: a coroutine is a lightweight user-level thread, i.e. the scheduling of coroutines is controlled by the user program itself.
Definition of a coroutine: within a single thread, the user's own program switches to another task whenever one task blocks on I/O, in order to improve efficiency.

The difference between coroutines and threads

When multitasking, a system-level thread switch involves far more than just saving and restoring the CPU context. To run efficiently, the operating system keeps per-thread data of its own, such as caches, and it saves and restores that data for you, so thread switching carries a large performance cost. A coroutine switch, by contrast, only switches the CPU context of the running code, so the system can handle a million such switches per second.
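
To get a rough feel for how cheap a code-level switch is, here is a small micro-benchmark sketch (the task name is made up; actual numbers depend on the machine), counting how fast yield-based switches run in a single thread:

import time

def task():
    while True:
        yield            # each next() call is one code-level "switch"

g = task()
n = 1000000
start = time.time()
for _ in range(n):
    next(g)
print('%s switches in %.3f seconds' % (n, time.time() - start))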

  Points to emphasize:

#1. Python threads are kernel-level, i.e. scheduled by the operating system (for example, if a single thread hits I/O or runs too long, it is forced to give up the CPU and another thread is switched in)
#2. When coroutines are started inside a single thread, as soon as I/O is encountered the switch is controlled at the application level (not by the operating system), which improves efficiency (!!! switches unrelated to I/O have nothing to do with efficiency)

  Compared with threads, whose switching is controlled by the operating system, the switching of coroutines within a single thread is controlled by the user.

  Advantages:

#1. A coroutine switch has less overhead; it is a program-level switch that the operating system is completely unaware of, and is therefore more lightweight
#2. Concurrency can be achieved within a single thread, making the most of the CPU

  Disadvantages:

#1. A coroutine is essentially a single thread and cannot use multiple cores; but one program can start multiple processes, multiple threads within each process, and coroutines within each thread

  Summary of coroutine characteristics:

  1. Concurrency must be implemented within a single thread
  2. Shared data can be modified without locking (see the sketch after this list)
  3. The user program keeps its own control-flow stacks for multiple contexts
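
Regarding point 2, here is a minimal sketch (with hypothetical worker/run names) showing that two yield-based tasks can bump a shared counter without a lock: within one thread only one task runs at a time, and switches happen only at the yield points we choose, never in the middle of an increment.

counter = 0

def worker():
    global counter
    for _ in range(100000):
        counter += 1   # no lock needed: switches only happen at yield, never mid-increment
        yield          # hand control back to the scheduling loop

def run():
    g1, g2 = worker(), worker()
    for _ in zip(g1, g2):   # alternate between the two tasks
        pass
    print(counter)          # always exactly 200000

run()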

III. Greenlet

If we have 20 tasks within a single thread, switching among them with yield generators is too cumbersome (you first have to initialize each generator and then keep calling send... very troublesome), whereas the greenlet module lets you switch among these 20 tasks directly and very simply.

# Installation
Option 1:
    pip3 install greenlet
Option 2:
    In PyCharm: Settings -----> Project Interpreter -----> the + button in the upper right ------> search for greenlet in the search box
# Real coroutine modules do their switching with greenlet
from greenlet import greenlet

def eat(name):
    print(f"{name} eat 1")  # 2
    g2.switch('zdr')    # 3
    print(f'{name} eat 2')  # 6
    g2.switch() # 7


def play(name):
    print(f'{name} play 1') # 4
    g1.switch() # 5
    print(f'{name} play 2') # 8



g1 = greenlet(eat)
g2 = greenlet(play)

g1.switch('zcy')    # arguments can be passed on the first switch; later switches don't need them  (step 1)

Pure switching (with no I/O and no operations that repeatedly allocate memory) actually slows the program down rather than speeding it up.

Efficiency Comparison:

# Sequential execution
import time
def f1():
    res=1
    for i in range(100000000):
        res+=i

def f2():
    res=1
    for i in range(100000000):
        res*=i

start=time.time()
f1()
f2()
stop=time.time()
print('run time is %s' %(stop-start)) #10.985628366470337

# Switching with greenlet
from greenlet import greenlet
import time
def f1():
    res=1
    for i in range(100000000):
        res+=i
        g2.switch()

def f2():
    res=1
    for i in range(100000000):
        res*=i
        g1.switch()

start=time.time()
g1=greenlet(f1)
g2=greenlet(f2)
g1.switch()
stop=time.time()
print('run time is %s' %(stop-start)) # 52.763017892837524


greenlet only provides a more convenient way to switch than generators do. If the task you switch to encounters I/O, it still blocks in place; the problem of automatically switching on I/O to improve efficiency remains unsolved.
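
A minimal sketch of that limitation (hypothetical download/compute names, with time.sleep standing in for I/O): greenlet does not switch away during the blocking call, so the two-second wait overlaps with nothing.

from greenlet import greenlet
import time

def download():
    print('download start')
    time.sleep(2)        # simulated I/O: greenlet does NOT switch away here
    print('download end')
    g2.switch()

def compute():
    print('compute start')
    print('compute end')

g1 = greenlet(download)
g2 = greenlet(compute)

start = time.time()
g1.switch()
print('total: %.1f s' % (time.time() - start))  # ~2.0 s: the sleep was not overlapped with compute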

  [Figure: using one task's I/O wait time to run other tasks]

  The figure above captures the real point of coroutines: the I/O time itself cannot be avoided, but we use that time to do something else. In practice we usually combine processes + threads + coroutines to achieve concurrency and get the best result. With a 4-core CPU, a common rule of thumb is to start 5 processes, 20 threads per process (5 times the number of CPU cores), and up to 500 coroutines per thread. When crawling pages at scale, the time spent waiting on network latency can be filled by using coroutines for concurrency. The concurrency is then 5 * 20 * 500 = 50000, which is roughly the maximum concurrency of a 4-CPU machine; nginx under load balancing also maxes out at around 50,000.

  For the 20 tasks in that single thread, there are usually both compute operations and blocking operations. We can switch to executing task 2 during the time task 1 spends blocked, and so on, putting the blocked time to use. To improve efficiency this way, we use the Gevent module.

IV. Introduction to Gevent

# Installation
Option 1:
    pip3 install gevent
Option 2:
    In PyCharm: Settings -----> Project Interpreter -----> the + button in the upper right ------> search for gevent in the search box

Gevent is a third-party library that makes it easy to write synchronous or asynchronous concurrent programs. The main pattern used in gevent is the Greenlet, a lightweight coroutine connected to Python in the form of a C extension module. Greenlets all run inside the operating-system process of the main program, but they are scheduled cooperatively.

# Usage
g1=gevent.spawn(func,1,2,3,x=4,y=5)  creates a coroutine object g1. The first argument inside spawn's parentheses is the function name, e.g. eat; it can be followed by multiple arguments, positional or keyword, all of which are passed to the function eat. spawn submits the task asynchronously.

g2=gevent.spawn(func2)

g1.join() # wait for g1 to finish

g2.join() # wait for g2 to finish. Some people find while testing that g2 runs even without this second join. True, the coroutine switched to it for you; but if g2's task takes longer and you leave out the join, the program will not wait for g2's remaining work to finish.


# Or combine the two steps into one: gevent.joinall([g1,g2])

g1.value # get func1's return value
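
A minimal runnable sketch of that usage (the add function and its arguments are made up for illustration); gevent.sleep stands in for I/O, and .value reads each task's return value after joinall:

import gevent

def add(x, y):
    gevent.sleep(1)          # an I/O-like block that gevent recognizes
    return x + y

g1 = gevent.spawn(add, 1, 2)          # positional arguments
g2 = gevent.spawn(add, x=10, y=20)    # keyword arguments
gevent.joinall([g1, g2])              # wait for both; finishes in ~1 s, not 2
print(g1.value, g2.value)             # 3 30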

Gevent automatically switches tasks when it encounters blocking I/O:

import gevent
def eat(name):
    print('%s eat 1' %name)
    gevent.sleep(2)
    print('%s eat 2' %name)

def play(name):
    print('%s play 1' %name)
    gevent.sleep(1)
    print('%s play 2' %name)


g1=gevent.spawn(eat,'egon')
g2=gevent.spawn(play,name='egon')
g1.join()
g2.join()
# or gevent.joinall([g1,g2])
print('main')


In the example above, gevent.sleep(2) simulates an I/O block that gevent can recognize.

  time.sleep(2) and other blocking calls, however, are not recognized by gevent directly; the patch shown below is needed so that they can be recognized.

  from gevent import monkey; monkey.patch_all() must be placed before the modules it patches, such as the time and socket modules.

  Or simply remember: to use gevent, put from gevent import monkey; monkey.patch_all() at the very top of the file.

from gevent import monkey;monkey.patch_all() # must be the very first line; every blocking call after it can then be recognized

import gevent  # just import it directly
import time
def eat():
    print('eat food 1')
    time.sleep(2)  # with the monkey patch applied, gevent can recognize the time module's sleep
    print('eat food 2')

def play():
    print('play 1')
    time.sleep(1)  # gevent switches back and forth until one task's I/O finishes; this is handled by gevent, no longer by the operating system we cannot control
    print('play 2')

g1=gevent.spawn(eat)
g2=gevent.spawn(play)
gevent.joinall([g1,g2])
print('main')

We can use threading.current_thread().getName() to inspect g1 and g2; the result is a DummyThread-style name, i.e. fake (virtual) threads. In reality they all live inside one thread.
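
A minimal sketch of that check (assuming monkey.patch_all() has been applied, as in the examples above; the exact dummy-thread name varies by Python/gevent version):

from gevent import monkey;monkey.patch_all()
import gevent
from threading import current_thread

def eat():
    print(current_thread().getName())   # a dummy (fake) thread name, not a real OS thread
    gevent.sleep(1)

def play():
    print(current_thread().getName())   # another dummy thread name
    gevent.sleep(1)

g1 = gevent.spawn(eat)
g2 = gevent.spawn(play)
gevent.joinall([g1, g2])
print(current_thread().getName())       # MainThread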

  Switching among processes and threads is done by the operating system itself; you cannot control it yourself.

  Coroutine switching is done by your own program (code), so it is under your control. Task switching happens only at I/O operations that the coroutine module can recognize, and only then does the program achieve a concurrent effect; if none of the tasks perform I/O, execution is basically serial.
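
A minimal sketch of that last point (hypothetical count function): two gevent tasks that only compute, with no recognizable I/O, run one after the other rather than interleaved.

import gevent

def count(name):
    # pure computation: nothing here that gevent can recognize as I/O
    for i in range(3):
        print(name, i)

g1 = gevent.spawn(count, 'task1')
g2 = gevent.spawn(count, 'task2')
gevent.joinall([g1, g2])
# prints all of task1's lines first, then all of task2's: effectively serial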

V. Gevent: Synchronous vs. Asynchronous

from gevent import spawn,joinall,monkey;monkey.patch_all()

import time
def task(pid):
    """
    Some non-deterministic task
    """
    time.sleep(0.5)
    print('Task %s done' % pid)


def synchronous():
    for i in range(10):
        task(i)

def asynchronous():
    g_l=[spawn(task,i) for i in range(10)]
    joinall(g_l)

if __name__ == '__main__':
    print('Synchronous:')
    synchronous()

    print('Asynchronous:')
    asynchronous()
# The important part of the program above is gevent.spawn, which wraps the task function inside a Greenlet. The list of initialized greenlets is stored in g_l, which is passed to the gevent.joinall function; joinall blocks the current flow and runs all of the given greenlets. Execution continues past it only once all of the greenlets have finished.


VI. A Gevent Application Example

from gevent import monkey;monkey.patch_all()
import gevent
import requests
import time

def get_page(url):
    print('GET: %s' %url)
    response=requests.get(url)
    if response.status_code == 200:
        print('%d bytes received from %s' %(len(response.text),url))


start_time=time.time()
gevent.joinall([
    gevent.spawn(get_page,'https://www.python.org/'),
    gevent.spawn(get_page,'https://www.yahoo.com/'),
    gevent.spawn(get_page,'https://github.com/'),
])
stop_time=time.time()
print('run time is %s' %(stop_time-start_time))


Append the serial version below to the end of the program above and compare the running times. Of course, if your program did not need the efficiency, there would be no point in all this concurrency and coroutine business.

print('--------------------------------')
s = time.time()
requests.get('https://www.python.org/')
requests.get('https://www.yahoo.com/')
requests.get('https://github.com/')
t = time.time()
print('serial time >>', t-s)
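
When crawling many URLs you may also want to cap how many requests run at once. A minimal sketch using gevent.pool.Pool (the pool size of 2 is arbitrary and the URL list simply reuses the example above):

from gevent import monkey;monkey.patch_all()
from gevent.pool import Pool
import requests

def get_page(url):
    response = requests.get(url)
    print('%d bytes received from %s' % (len(response.text), url))

urls = ['https://www.python.org/', 'https://www.yahoo.com/', 'https://github.com/']

pool = Pool(2)                 # at most 2 greenlets run at the same time
for url in urls:
    pool.spawn(get_page, url)
pool.join()                    # wait until every spawned task has finished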

VII. Gevent Application Example (II)

  A single-threaded concurrent socket server implemented with gevent (from gevent import monkey; monkey.patch_all() must come before importing the socket module, otherwise gevent cannot recognize the socket's blocking calls).

  [Figure: the delay time spent by multiple network requests, overlapping in time]

Server:

from gevent import monkey;monkey.patch_all()
from socket import *
import gevent

# If you don't want to patch with monkey.patch_all(), you can use gevent's own socket instead
# from gevent import socket
# s=socket.socket()

def server(server_ip,port):
    s=socket(AF_INET,SOCK_STREAM)
    s.setsockopt(SOL_SOCKET,SO_REUSEADDR,1)
    s.bind((server_ip,port))
    s.listen(5)
    while True:
        conn,addr=s.accept()
        gevent.spawn(talk,conn,addr)

def talk(conn,addr):
    try:
        while True:
            res=conn.recv(1024)
            print('client %s:%s msg: %s' %(addr[0],addr[1],res))
            conn.send(res.upper())
    except Exception as e:
        print(e)
    finally:
        conn.close()

if __name__ == '__main__':
    server('127.0.0.1',8080)


Client:

from socket import *

client=socket(AF_INET,SOCK_STREAM)
client.connect(('127.0.0.1',8080))


while True:
    msg=input('>>: ').strip()
    if not msg:continue

    client.send(msg.encode('utf-8'))
    msg=client.recv(1024)
    print(msg.decode('utf-8'))

Multiple clients running concurrently in multiple threads can request the server above without any problem:

from threading import Thread
from socket import *
import threading

def client(server_ip,port):
    c=socket(AF_INET,SOCK_STREAM) # the socket object must be created inside the function, i.e. in the local namespace; if placed outside the function it is shared by all threads, everyone uses the same socket object, and the client port would then always be the same
    c.connect((server_ip,port))

    count=0
    while True:
        c.send(('%s say hello %s' %(threading.current_thread().getName(),count)).encode('utf-8'))
        msg=c.recv(1024)
        print(msg.decode('utf-8'))
        count+=1
if __name__ == '__main__':
    for i in range(500):
        t=Thread(target=client,args=('127.0.0.1',8080))
        t.start()



Origin www.cnblogs.com/zhangchaoyin/p/11420972.html