Python coroutines: their differences from and connections with Python multithreading

A coroutine, also known as a micro-thread or a fiber, is a lightweight user-level thread.

    A coroutine has its own register context and stack. When a coroutine is switched out, its register context and stack are saved elsewhere; when it is switched back in, they are restored and execution continues from where it left off.

    In concurrent programming, coroutines are similar to threads. Each coroutine represents an execution unit, has its own local data, and shares global data and resource pools with other coroutines.

    Coroutines require the programmer to write the scheduling logic explicitly. From the CPU's point of view, all the coroutines run in a single thread, so the operating system does not need to schedule them or switch their contexts, which avoids that overhead. This is why coroutines can outperform multithreading for I/O-bound workloads.


Implementing coroutines in Python:

    Python uses yield to provide basic support for coroutines, but the third-party gevent library provides this service better; gevent's coroutine support is relatively complete.
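As a minimal illustration of the yield-based support mentioned above, the sketch below (the consumer/produce names are made up for this example, and this is a plain generator, not gevent) shows two routines handing control back and forth in a single thread:

```python
# A generator-based coroutine: consumer() suspends at "yield" and
# resumes each time produce() sends it a value. Both run cooperatively
# in one thread; no OS scheduling is involved.

def consumer():
    results = []
    while True:
        item = yield            # suspend here until send() delivers a value
        if item is None:        # sentinel: stop consuming
            return results
        results.append(item * 2)

def produce(c):
    next(c)                     # prime the coroutine up to its first yield
    out = None
    try:
        for i in range(3):
            c.send(i)           # switch into the consumer with a value
        c.send(None)
    except StopIteration as e:
        out = e.value           # the consumer's return value
    return out

print(produce(consumer()))      # [0, 2, 4]
```

Note that the switching points (each yield / send pair) are written explicitly by the programmer, which is exactly the "write the scheduling logic yourself" property described above.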

    gevent is a coroutine-based Python networking library that uses greenlets to provide a high-level concurrency API on top of the libev event loop.

    Features:

    (1) A fast event loop based on libev (using the epoll mechanism on Linux).

    (2) Lightweight execution units based on greenlet.

    (3) An API that reuses concepts from the Python standard library.

    (4) Cooperative sockets with SSL support.

    (5) DNS queries performed through a thread pool or c-ares.

    (6) Third-party modules can be made cooperative through monkey patching.


    gevent supports coroutines; in fact, it can be said that its work switching is implemented with greenlet.

    The greenlet workflow is as follows: when an I/O operation such as a network access blocks, the greenlet explicitly switches to another code segment that is not blocked; once the original blocking condition clears, execution switches back to the original code segment and continues. In other words, greenlet is a more rational arrangement of serial work.
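The explicit switching described above can be sketched with the third-party greenlet package directly, without gevent (task_a and task_b are made-up names for illustration):

```python
from greenlet import greenlet

order = []

def task_a():
    order.append("a1")
    gr_b.switch()          # explicitly hand control to task_b
    order.append("a2")     # resumes here when something switches back

def task_b():
    order.append("b1")
    gr_a.switch()          # switch back; task_a resumes after its switch()

gr_a = greenlet(task_a)
gr_b = greenlet(task_b)
gr_a.switch()              # start task_a; control returns to the main
                           # greenlet when task_a finishes
print(order)               # ['a1', 'b1', 'a2']
```

Every switch point here is written by hand; gevent's contribution is making these switches happen automatically at blocking I/O calls.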

    At the same time, because I/O operations are time-consuming, a program often sits in a waiting state. By automatically switching coroutines, gevent ensures that some greenlet is always running instead of waiting for I/O to complete. This is why coroutines are more efficient than ordinary multithreading.

    For this I/O switching to happen automatically, gevent needs to modify some of Python's own standard libraries so that common blocking calls, such as socket and select, trigger coroutine jumps. This is done through monkey patching.
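The effect of monkey patching can be observed directly: after patch_all(), the standard-library names point at gevent's cooperative replacements, so ordinary blocking-style code becomes coroutine-friendly without changes. A small check:

```python
from gevent import monkey
monkey.patch_all()   # must run before the blocking modules are imported

import socket
import gevent.socket

# After patching, the stdlib socket class is gevent's cooperative class.
print(socket.socket is gevent.socket.socket)   # True
```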

The following code shows how gevent is used (Python version: 3.6, operating system: Windows 10):

from gevent import monkey
monkey.patch_all()
import gevent
import urllib.request

def run_task(url):
    print("Visiting %s " % url)
    try:
        response = urllib.request.urlopen(url)
        url_data = response.read()
        print("%d bytes received from %s " % (len(url_data), url))
    except Exception as e:
        print(e)

if __name__ == "__main__":
    urls = ["https://stackoverflow.com/", "http://www.cnblogs.com/", "http://github.com/"]
    greenlets = [gevent.spawn(run_task, url) for url in urls]
    gevent.joinall(greenlets)
Output:

Visiting https://stackoverflow.com/
Visiting http://www.cnblogs.com/
Visiting http://github.com/
46412 bytes received from http://www.cnblogs.com/
54540 bytes received from http://github.com/
251799 bytes received from https://stackoverflow.com/

    gevent's spawn method can be regarded as creating a coroutine, and the joinall method is equivalent to adding the coroutine tasks and starting them. The results show that the three network requests run concurrently and finish in a different order than they started, yet there is only one thread.
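The same spawn/joinall pattern can be seen without touching the network by standing in for the request with gevent.sleep (the fetch function and its delays below are made up for illustration):

```python
import gevent

def fetch(name, delay):
    # gevent.sleep stands in for a blocking network call: it yields
    # control to the hub so the other greenlets can run meanwhile.
    gevent.sleep(delay)
    return name

# Three tasks share one OS thread; the shortest "I/O" finishes first,
# but each greenlet's result is kept on the greenlet that produced it.
jobs = [gevent.spawn(fetch, n, d)
        for n, d in [("slow", 0.03), ("mid", 0.02), ("fast", 0.01)]]
gevent.joinall(jobs)
print([job.value for job in jobs])   # values keep spawn order: ['slow', 'mid', 'fast']
```

Accessing `job.value` after joinall is how a spawned greenlet's return value is retrieved, which the urllib example above does not need because it only prints.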

    gevent also provides pooling. If you have a dynamic number of greenlets that need concurrency management, you can use a pool to handle large numbers of network requests and I/O operations.

    The following uses gevent's Pool object to modify the multi-request example above:

from gevent import monkey
monkey.patch_all()
from gevent.pool import Pool
import urllib.request

def run_task(url):
    print("Visiting %s " % url)
    try:
        response = urllib.request.urlopen(url)
        url_data = response.read()
        print("%d bytes received from %s " % (len(url_data), url))
    except Exception as e:
        print(e)

    return ("%s read finished.." % url)

if __name__ == "__main__":
    pool = Pool(2)
    urls = ["https://stackoverflow.com/",
            "http://www.cnblogs.com/",
            "http://github.com/"]
    results = pool.map(run_task, urls)
    print(results)

Output:

Visiting https://stackoverflow.com/
Visiting http://www.cnblogs.com/
46416 bytes received from http://www.cnblogs.com/
Visiting http://github.com/
253375 bytes received from https://stackoverflow.com/
54540 bytes received from http://github.com/
['https://stackoverflow.com/ read finished..', 'http://www.cnblogs.com/ read finished..', 'http://github.com/ read finished..']
    The results show that the Pool object manages the number of concurrent coroutines: the first two URLs are visited first, and the third request starts only after one of those tasks completes.
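The concurrency limit can be verified without the network by again substituting gevent.sleep for the request (the task function and the delay are made up for illustration):

```python
import gevent
from gevent.pool import Pool

started = []

def task(i):
    started.append(i)        # record the order in which tasks actually begin
    gevent.sleep(0.01)       # stand-in for a blocking network call
    return i

pool = Pool(2)               # at most two greenlets run at once
results = pool.map(task, range(4))

# Only the first two tasks can start immediately; tasks 2 and 3 must
# wait for a slot. map() still returns results in input order.
print(started[:2], results)  # [0, 1] [0, 1, 2, 3]
```

This mirrors the behavior seen in the output above: with Pool(2), the third URL is visited only after one of the first two requests finishes.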


