Python coroutines

1. What is a coroutine

  A coroutine, also known as a micro-thread, is a way to achieve concurrency within a single thread. In one sentence: a coroutine is a lightweight user-mode thread whose scheduling is controlled by the user program itself.

  There are two things to note here:

    1. Python threads are kernel-level threads, meaning their scheduling is controlled by the operating system (if a thread blocks on IO or runs for too long, it is forced to give up the CPU so that another thread can run).

    2. Once coroutines are started inside a single thread, switching on IO is controlled at the application level rather than by the operating system, which improves efficiency (switching that is not triggered by IO does nothing for efficiency).

  Coroutines therefore have the following advantages:

    1. Switching between coroutines happens at the program level, so the overhead is smaller. The operating system is completely unaware of it, which makes coroutines more lightweight.

    2. Concurrency can be achieved within a single thread, making the most of the CPU time available to that thread.

  Their disadvantages are:

    1. A coroutine is essentially single-threaded and cannot use multiple cores by itself. To use multiple cores, a program can start multiple processes, each process can start multiple threads, and each thread can run coroutines.

    2. Coroutines run inside a single thread, so once one coroutine blocks, the entire thread is blocked.
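The second disadvantage can be illustrated with a minimal sketch that uses plain generators as stand-in coroutines. This is an assumption made for illustration only: `worker` and `run_round_robin` are hypothetical names, not part of any library. Each worker calls the blocking `time.sleep`, so nothing else can run during the wait and the total time is roughly the sum of all sleeps.

```python
import time

def worker(name, log, pause=0.05):
    """A generator used as a stand-in coroutine (hypothetical helper)."""
    for step in (1, 2):
        time.sleep(pause)      # blocking call: the whole thread waits here
        log.append((name, step))
        yield                  # hand control back to the scheduler

def run_round_robin(tasks):
    """Resume each task in turn until all of them are finished."""
    tasks = list(tasks)
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)

log = []
start = time.perf_counter()
run_round_robin([worker("a", log), worker("b", log)])
elapsed = time.perf_counter() - start

# The four sleeps ran one after another, so elapsed is about 4 * 0.05 s.
print(log)
print("elapsed: %.2f s" % elapsed)
```

The tasks do take turns, but because each wait blocks the thread, taking turns saves no time.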

  To summarize, the characteristics of coroutines are:

    1. Concurrency is achieved within a single thread only

    2. Shared data can be modified without locking

    3. The user program itself saves the context stacks of multiple control flows

    Additional: a coroutine automatically switches to another coroutine when it encounters an IO operation. Neither yield nor greenlet can detect IO by itself, which is why the gevent module (built on a select-style mechanism) is used.
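The three characteristics above can be sketched with plain generators. This is a minimal illustration, not how greenlet or gevent is implemented, and the names `incrementer`, `run`, and `state` are hypothetical. Control can only move at a `yield`, so the read and the write between two yields can never be interleaved with another coroutine, which is why no lock is needed.

```python
from collections import deque

state = {"counter": 0}   # data shared by all coroutines, no lock involved

def incrementer(times):
    """Generator-based coroutine: its local variables are its saved context."""
    for _ in range(times):
        current = state["counter"]       # read
        state["counter"] = current + 1   # write; no switch can happen in between
        yield                            # the only point where control may move

def run(tasks):
    """Round-robin scheduler that runs entirely in the calling thread."""
    queue = deque(tasks)
    switches = 0
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue                     # task finished, drop it
        queue.append(task)
        switches += 1
    return switches

switches = run([incrementer(1000) for _ in range(3)])
print(state["counter"], switches)
```

With three coroutines incrementing 1000 times each, the counter always ends at 3000. The same read-then-write pattern across three OS threads would need a lock to guarantee that.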

2. How to implement coroutines

  1. Using greenlets

      It needs to be installed before use: pip3 install greenlet

from greenlet import greenlet

def eat(name):
    print('%s eat 1' %name)
    g2.switch('egon')
    print('%s eat 2' %name)
    g2.switch()
def play(name):
    print('%s play 1' %name)
    g1.switch()
    print('%s play 2' %name)

g1=greenlet(eat)
g2=greenlet(play)

g1.switch('egon')  # Arguments can be passed on the first switch; later switches do not need them

  Switching alone (when there is no IO and no repeated memory allocation) actually slows the program down:

# Sequential execution
import time
def f1():
    res=1
    for i in range(100000000):
        res+=i

def f2():
    res=1
    for i in range(100000000):
        res*=i

start=time.time()
f1()
f2()
stop=time.time()
print('run time is %s' %(stop-start)) #10.985628366470337

# Switching
from greenlet import greenlet
import time
def f1():
    res=1
    for i in range(100000000):
        res+=i
        g2.switch()

def f2():
    res=1
    for i in range(100000000):
        res*=i
        g1.switch()

start=time.time()
g1=greenlet(f1)
g2=greenlet(f2)
g1.switch()
stop=time.time()
print('run time is %s' %(stop-start)) # 52.763017892837524

  greenlet only provides a convenient way to switch. If a task hits IO after being switched to, it simply blocks in place. greenlet cannot switch automatically on IO to improve efficiency.

  The code of the tasks running in a single thread usually mixes computation with blocking operations. When task 1 blocks, that waiting time can be used to run task 2, and so on. This is what the gevent module provides.
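The idea of using one task's waiting time to run another can be sketched with the standard library alone. The sketch below is only an illustration under stated assumptions and not how gevent works internally: a task yields the number of seconds it wants to wait, and a hypothetical scheduler `run_all` resumes whichever task is due next.

```python
import heapq
import time

def task(name, delay, log):
    """Yield the number of seconds to wait instead of blocking the thread."""
    log.append("%s start" % name)
    yield delay                      # "I am waiting, run something else"
    log.append("%s end" % name)

def run_all(tasks):
    """Resume each task once its requested wake-up time has passed."""
    ready = []                       # heap of (wake_time, order, task)
    for order, t in enumerate(tasks):
        heapq.heappush(ready, (time.monotonic(), order, t))
    while ready:
        wake_time, order, t = heapq.heappop(ready)
        pause = wake_time - time.monotonic()
        if pause > 0:
            time.sleep(pause)        # sleep only when no task is runnable yet
        try:
            delay = next(t)
        except StopIteration:
            continue                 # task finished
        heapq.heappush(ready, (time.monotonic() + delay, order, t))

log = []
start = time.monotonic()
run_all([task("eat", 0.2, log), task("play", 0.1, log)])
elapsed = time.monotonic() - start

print(log)
print("elapsed: %.2f s" % elapsed)
```

Both tasks start before either one ends, and the total time is close to the longest wait (0.2 s) rather than the sum of the waits (0.3 s).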

  2. Gevent

    First install Gevent: pip3 install gevent

    - Gevent is a third-party library that makes it easy to write concurrent code in a synchronous or asynchronous style. Its main building block is the greenlet, a lightweight coroutine provided to Python as a C extension module. All greenlets run inside the operating system process of the main program, but they are scheduled cooperatively.

# Usage
g1 = gevent.spawn(func, 1, 2, 3, x=4, y=5)  # Creates a greenlet object g1. The first argument to spawn is the function (for example eat); the remaining positional and keyword arguments are passed on to that function

g2 = gevent.spawn(func2)

g1.join()  # Wait for g1 to finish

g2.join()  # Wait for g2 to finish

# Or combine the two steps above into one: gevent.joinall([g1, g2])

g1.value  # Get the return value of func

    - Automatically switch tasks when encountering IO blocking

import gevent
def eat(name):
    print('%s eat 1' %name)
    gevent.sleep(2)
    print('%s eat 2' %name)

def play(name):
    print('%s play 1' %name)
    gevent.sleep(1)
    print('%s play 2' %name)


g1=gevent.spawn(eat,'egon')
g2=gevent.spawn(play,name='egon')
g1.join()
g2.join()
#Or gevent.joinall([g1,g2]) 
print('main')

    In the example above, gevent.sleep(2) simulates IO blocking that gevent can recognize.

    time.sleep(2) and other blocking calls are not recognized by gevent directly. To make them cooperative, apply the patch with the following line:

    from gevent import monkey;monkey.patch_all() must be placed before the modules being patched are imported, for example before importing time or socket.

    Or simply remember: to use gevent, put from gevent import monkey;monkey.patch_all() at the very beginning of the file.
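Why the patch has to come first can be shown with a small standard-library sketch. It assumes a monkey patch works by replacing attributes on an already loaded module. The module `fake_time`, both sleep functions, and this `patch_all` are hypothetical stand-ins for illustration, not gevent's actual code.

```python
import types

# Stand-in for a standard-library module with a blocking function.
# "fake_time" and both sleep functions are hypothetical, for illustration only.
fake_time = types.ModuleType("fake_time")

def blocking_sleep(seconds):
    return "blocking"

def cooperative_sleep(seconds):
    return "cooperative"

fake_time.sleep = blocking_sleep

# Equivalent of `from fake_time import sleep` executed BEFORE patching:
# the name is bound to the original function object.
early_sleep = fake_time.sleep

def patch_all():
    """Swap the module attribute, roughly what a monkey patch does."""
    fake_time.sleep = cooperative_sleep

patch_all()

# Equivalent of the same import executed AFTER patching.
late_sleep = fake_time.sleep

results = (early_sleep(1), late_sleep(1), fake_time.sleep(1))
print(results)  # ('blocking', 'cooperative', 'cooperative')
```

A name bound before the patch keeps pointing at the original blocking function, which is why the patch line belongs at the very top of the file.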

from gevent import monkey;monkey.patch_all()

import gevent
import time
def eat():
    print('eat food 1')
    time.sleep(2)
    print('eat food 2')

def play():
    print('play 1')
    time.sleep(1)
    print('play 2')

g1=gevent.spawn(eat)
g2=gevent.spawn(play)
gevent.joinall([g1,g2])
print('main')

  3. Synchronous and asynchronous execution with Gevent

from gevent import spawn,joinall,monkey;monkey.patch_all()

import time
def task(pid):
    """
    Some non-deterministic task
    """
    time.sleep(0.5)
    print('Task %s done' % pid)


def synchronous():
    for i in range(10):
        task(i)

def asynchronous():
    g_l=[spawn(task,i) for i in range(10)]
    joinall(g_l)

if __name__ == '__main__':
    print('Synchronous:')
    synchronous()

    print('Asynchronous:')
    asynchronous()
# The key part of the program above is wrapping the task function in greenlets with spawn. The resulting greenlets are collected in the list g_l and passed to joinall, which blocks the current flow and runs all the given greenlets. Execution continues only after every greenlet has finished.

   4. Coroutine application: crawler

from gevent import monkey;monkey.patch_all()
import gevent
import requests
import time

def get_page(url):
    print('GET: %s' %url)
    response=requests.get(url)
    if response.status_code == 200:
        print('%d bytes received from %s' %(len(response.text),url))


start_time=time.time()
gevent.joinall([
    gevent.spawn(get_page,'https://www.python.org/'),
    gevent.spawn(get_page,'https://www.yahoo.com/'),
    gevent.spawn(get_page,'https://github.com/'),
])
stop_time=time.time()
print('run time is %s' %(stop_time-start_time))

 
