[Python] Processes, threads, coroutines, and when to use each

 

Explanation

Process: the smallest unit of resource allocation by the operating system; resources include CPU time, memory, disk, and other I/O devices. A process is an instance of a running program.

Thread: the basic unit of CPU scheduling and the smallest unit of program execution. A thread is an entity within a process, and one process may contain multiple threads.

Why do some people say that Python multithreading is useless?

By common sense, multi-process and multi-thread both make full use of hardware resources through concurrency and improve a program's efficiency, so how did multithreading become "useless" in Python?

Because of Python's infamous GIL.

So what is the GIL? Why does it exist? Is multithreading really that hopeless? Can the GIL be removed?

Is multithreading really useless? An experiment:

Take the number 100,000,000 and decrement it; when it reaches 0 the program terminates. If we run this task on a single thread, how long does it take?

Single-threaded, on a 4-core CPU machine, it took 6.5 seconds.

Multithreaded, creating two sub-threads t1 and t2 that each perform 50,000,000 decrements, the two threads took 6.8 seconds working cooperatively: slower, not faster.
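The experiment can be reproduced with a short script. This is a minimal sketch: the count is scaled down from the article's 100 million so it finishes quickly, and the exact timings will depend on your machine and interpreter version.

```python
import threading
import time

# scaled down from the article's 100,000,000 so the demo finishes quickly
COUNT = 10_000_000

def countdown(n):
    while n > 0:
        n -= 1

# single-threaded baseline
start = time.time()
countdown(COUNT)
single = time.time() - start

# two threads, each doing half of the work
start = time.time()
t1 = threading.Thread(target=countdown, args=(COUNT // 2,))
t2 = threading.Thread(target=countdown, args=(COUNT // 2,))
t1.start(); t2.start()
t1.join(); t2.join()
multi = time.time() - start

print(f"single thread: {single:.2f}s, two threads: {multi:.2f}s")
```

On most CPython builds the two-thread version is no faster, and often slower, than the single-threaded one, for the reasons discussed next.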

 

Logically, two threads running in parallel should reduce the time, not increase it. The reason is the GIL.

In Python multithreading, each thread executes like this:

1. Acquire the GIL.

2. Run code until the thread sleeps or the Python virtual machine suspends it.

3. Release the GIL.

As you can see, before a thread can execute it must first acquire the GIL. Think of the GIL as a "pass": within one Python process there is only one GIL, and a thread that cannot get the pass is not allowed onto the CPU.

 

 

In CPython (the mainstream Python interpreter) there is a Global Interpreter Lock. When the interpreter executes Python code it must first acquire this lock, which means that at any moment only one thread can be executing code. Before any other thread can get the CPU to execute code instructions, it must first obtain the lock; if the lock is held by another thread, it can only wait until the holder releases it.

At any given time only one thread runs while the others wait. Even on a multi-core CPU, there is no way to make multiple threads execute code truly "in parallel"; they can only take turns. And because multithreading adds thread context switches and lock handling (acquiring and releasing the GIL, etc.), multithreaded execution can end up slower rather than faster.

When is the GIL released?

A thread releases the GIL when it encounters an I/O task. For compute-intensive (CPU-bound) code, the interpreter releases the GIL after a thread has executed 100 ticks (a tick can be roughly regarded as a Python virtual-machine instruction). In CPython 2 this threshold could be viewed and set with sys.getcheckinterval() and sys.setcheckinterval(). Compared to a single thread, this constant switching is where most of the multithreading overhead comes from.

       Why did the CPython interpreter choose this design? Multithreading has a classic problem to solve: synchronization and the consistency of shared data. When multiple threads access shared data, two threads may modify the same value at the same time; without a proper mechanism to guarantee consistency, the program's behavior becomes incorrect. So Python's creator put a global lock on the interpreter: whether or not your code actually has a data-synchronization problem, one across-the-board global lock guarantees data safety. This is the reason multithreading is considered "useless": it does not protect data with fine-grained locks, but solves the problem in a simple, blunt way.

      In the 1990s this solution was actually fine. Hardware was simple, single-core CPUs were still the mainstream, and multithreaded use cases were few; most programs ran single-threaded, where there is no thread context switching and efficiency is actually higher than with multiple threads (in a multi-core environment this rule no longer applies).

      So while using the GIL to guarantee data consistency and safety is not necessarily ideal, at the time it was a very low-cost implementation. Is removing the GIL feasible, then? People have really tried. In 1999, Greg Stein and Mark Hammond created a Python branch that removed the GIL, replacing it with fine-grained locks on all mutable data structures. But after benchmarking, single-threaded Python without the GIL was nearly twice as slow.

Python's creator concluded that, given the above, removing the GIL was not worth the effort. To summarize: the CPython interpreter uses the GIL to serialize thread execution. With the GIL, do we still need thread synchronization? And how does multithreading perform on I/O-bound tasks? Feel free to leave a comment.
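On the question of whether we still need thread synchronization even with the GIL: yes, for anything that is not a single atomic operation. The GIL can switch threads between the read and the write of a read-modify-write sequence such as counter += 1, so an explicit lock is still needed. A minimal sketch:

```python
import threading

counter = 0
lock = threading.Lock()

def add_unlocked(n):
    global counter
    for _ in range(n):
        counter += 1  # read-modify-write: the GIL may switch threads between the read and the write

def add_locked(n):
    global counter
    for _ in range(n):
        with lock:  # the explicit lock makes the whole read-modify-write atomic
            counter += 1

def run(worker, n=100_000, threads=4):
    global counter
    counter = 0
    ts = [threading.Thread(target=worker, args=(n,)) for _ in range(threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return counter

print(run(add_unlocked))  # may be less than 400000 (lost updates), depending on interpreter version
print(run(add_locked))    # always 400000
```

So the GIL protects the interpreter's internal state, not your program's shared data.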

 

Is Python multithreading useful at all, in the end?

 

 

1. CPU-bound code (loops, counting, and the like): the tick count quickly reaches the threshold, triggering a GIL release and a new round of competition for the lock (and switching between threads of course consumes resources). So Python multithreading is unfriendly to CPU-bound code.
2. I/O-bound code (file processing, web crawlers, and the like): multithreading can effectively improve efficiency. A single thread sits idle while waiting on I/O, wasting time; with multiple threads, when thread A blocks the interpreter automatically switches to thread B, so CPU time is not wasted and program throughput improves. So Python multithreading is friendlier to I/O-bound code.
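The I/O-bound case from point 2 can be demonstrated without a network at all, using time.sleep as a stand-in for a blocking I/O call (sleeping, like real I/O, releases the GIL):

```python
import threading
import time

def io_task():
    time.sleep(0.2)  # stand-in for a blocking I/O call; sleeping releases the GIL

# sequential: 5 tasks, one after another
start = time.time()
for _ in range(5):
    io_task()
sequential = time.time() - start

# threaded: the 5 waits overlap
start = time.time()
threads = [threading.Thread(target=io_task) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.time() - start

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The sequential version takes roughly five times as long, because only one wait can be in flight at a time.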

In Python 3.x, the GIL no longer uses tick counting; it uses a timer instead (after the executing thread reaches a time threshold, it releases the GIL). This is friendlier to CPU-bound programs, but it still does not solve the fundamental problem that the GIL allows only one thread to execute at a time, so efficiency remains unsatisfactory.
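For reference, Python 3 exposes this time slice through sys.getswitchinterval and sys.setswitchinterval:

```python
import sys

# the GIL time slice in Python 3; the default is 0.005 seconds
interval = sys.getswitchinterval()
print(interval)

# let each thread hold the GIL a bit longer before it is asked to release it
sys.setswitchinterval(0.01)
```

A longer interval means fewer switches (less overhead for CPU-bound threads) but worse responsiveness for I/O-bound ones.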

Multi-core multithreading can even be worse than single-core multithreading. The reason: on a single core, each time the GIL is released, the thread that is woken up can acquire it and continue seamlessly. On multiple cores, after the thread on CPU0 releases the GIL, threads on the other CPUs compete for it, but the GIL may be immediately reacquired by CPU0's thread. The threads on the other CPUs are then left awake for nothing, waiting until the next switch point before they can be scheduled again. This causes thread thrashing and lowers efficiency.

Back to the original question. We often hear veterans say: "If you want to take full advantage of a multi-core CPU in Python, use multiple processes." Why is that?
The reason: each process has its own independent GIL, so processes do not interfere with each other and can execute in parallel in the true sense. Therefore, in Python, multi-process is more efficient than multi-thread (on multi-core CPUs).

So the conclusion here: on multiple cores, if you want parallelism to improve efficiency, the common approach is to use multiple processes, which does improve efficiency effectively.
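A minimal sketch of this conclusion: splitting the CPU-bound countdown across worker processes with multiprocessing.Pool, where each worker process runs its own interpreter with its own GIL.

```python
import os
from multiprocessing import Pool

def countdown(n):
    while n > 0:
        n -= 1
    return n

if __name__ == "__main__":
    # two workers (capped by the core count); each process has its own GIL,
    # so the two halves of the work run truly in parallel on a multi-core CPU
    with Pool(min(2, os.cpu_count())) as pool:
        results = pool.map(countdown, [5_000_000, 5_000_000])
    print(results)
```

The __main__ guard is required so that worker processes that re-import the module do not recursively spawn more workers.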

 

In any case, doing CPU-bound work in Python is inherently a poor fit. Compared with C, Go, or Java, its performance honestly cannot be called anything but weak. Of course, you can write a C extension to get true multithreading and call it from Python for speed. The reason we use Python anyway is simply its very high development efficiency: things can be implemented quickly.

Finally, a few additional points:

  1. To make good use of the CPU in Python, use multiple processes. Alternatively, use coroutines. multiprocessing and gevent are waiting for you.
  2. The GIL is not a bug, and Guido is not so limited as to have left such a thing behind by accident. Guido once said that attempts to drop the GIL and achieve thread safety by other means roughly halved Python's overall efficiency; weighing the trade-offs, the GIL was the best choice: not something that could not be removed, but something deliberately kept.
  3. Want faster computation in Python without writing C? Use PyPy; that is the real heavy weapon.

 

When multithreading isn't suitable, should you use multiple processes or coroutines to improve concurrency?

First, multiple processes can take better advantage of multi-core CPUs.

    However, multi-process has its own limitations: processes are heavier than threads and switching between them takes longer, and in Python it is recommended that the number of processes not exceed the number of CPU cores (a process holds only one GIL, so one process can fully occupy at most one CPU). That way the machine's performance can be fully used; once every CPU already has a busy process, adding more processes only causes frequent process switching, which does more harm than good.

So on a multi-core machine, running a number of processes equal to the number of CPU cores can take full advantage of the multi-core CPU.

 

Second, when do you need coroutines?

In special circumstances (especially I/O-bound tasks), multithreading is handier than multi-process.

For example: you are given 2,000,000 article URLs, and you need to crawl and save the page behind each one. If you used only multiple processes, the result would certainly be very bad. Why?

Suppose each request spends 2 seconds waiting, as follows (ignoring CPU computation time):

1. Single process, single thread: it takes 2 seconds × 2,000,000 = 4,000,000 seconds ≈ 1111.11 hours ≈ 46.3 days. This rate is obviously unacceptable.

2. Single process, multithreaded: say we open 10 threads in the process; that is roughly 10 times faster than 1 thread, so crawling the 2,000,000 pages takes about 4.63 days. Note what actually happens: thread 1 hits a blocking wait, the CPU switches to thread 2, which then blocks, switch to thread 3, and so on; once all 10 threads are blocked, the process as a whole blocks until one of the threads finishes its wait, and only then can the process continue. So the speedup is about 10× (ignoring the overhead of thread switching; in practice the gain is somewhat less than 10×). But thread-switching overhead also means you cannot open unlimited threads: opening 2,000,000 threads is certainly not going to fly.

3. Multi-process, multithreaded: now it gets powerful, and in practice many people use this approach. Each process can occupy one CPU, and multithreading side-steps blocking waits to some extent, so this does better than a single multithreaded process. For example, open 10 processes with 200,000 threads each: in theory, execution is more than 10 times faster than a single process running 2,000,000 threads. (Why more than 10 times rather than exactly 10? Mainly because the cost of the CPU switching among 2,000,000 threads in one process is much greater than switching among 200,000 threads per process; accounting for that overhead, the gain is more than 10×.)
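The "don't open unlimited threads" advice from scenario 2 is usually implemented with a bounded thread pool. A minimal sketch using the standard library, with a hypothetical fetch stand-in (a short sleep) in place of real network calls:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    time.sleep(0.01)  # stand-in for the ~2-second network wait of a real request
    return f"done: {url}"

# hypothetical url list for illustration
urls = [f"http://example.com/{i}" for i in range(100)]

# 10 worker threads are reused across all 100 urls instead of one thread per url
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fetch, urls))

print(len(results))
```

pool.map preserves input order, and the pool caps both memory use and thread-switching overhead no matter how many URLs are queued.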

Is there a better way? Yes, there is:

4. Coroutines. Before using them, let's first go through what / why / how (what they are / why to use them / how to use them).
 

what:

A coroutine is a lightweight, user-level thread. A coroutine has its own register context and stack. When the coroutine scheduler switches away, it saves the register context and stack elsewhere; when it switches back, it restores the previously saved register context and stack. Therefore:

A coroutine preserves the state of its last invocation (that is, a particular combination of all its local state). Each re-entry is equivalent to resuming that state; put another way, it resumes at the exact position in the logical control flow where it last exited.

In concurrent programming, coroutines are similar to threads: each coroutine represents one execution unit, has its own local data, and shares global data and other resources with the other coroutines.
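The "resume where it left off" behavior described above can be illustrated with a plain Python generator, the simplest form of a coroutine:

```python
# a minimal sketch of "resuming where it left off", using a plain generator
def counter():
    n = 0
    while True:
        n += 1     # local state (n) survives between re-entries
        yield n    # exit here; the next call to next() resumes at this exact point

c = counter()
print(next(c))  # 1
print(next(c))  # 2: local state was preserved across the switch
```

gevent's greenlets generalize this idea: the switch points are not explicit yields in your code, but the moments a greenlet blocks on I/O.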

why:

Nearly all mainstream languages currently choose multithreading as their concurrency facility. The concept associated with threads is preemptive multitasking, while the one associated with coroutines is cooperative multitasking.

Whether for a process or a thread, every block and every switch requires trapping into a system call, handing the CPU to the operating system's scheduler, which then decides which process (thread) to run next. Moreover, because preemptive scheduling makes the order of execution nondeterministic, synchronization issues must be handled very carefully when using threads; coroutines have no such problem at all (they share this advantage with asynchronous, event-driven programs).

Because with coroutines the user writes the scheduling logic, and to the CPU a set of coroutines is in fact a single thread, there is no CPU scheduling or context switching to worry about. That switching overhead is eliminated, which is why coroutines beat multithreading to a certain degree.

how:

How do you use coroutines in Python? The answer is gevent; for usage, look here.

With coroutines, thread-count limits no longer constrain you. I once ran 200,000 URLs as coroutines in a single process; no problem at all.

So the most recommended approach is multi-process + coroutines (note that each process is single-threaded, and that single thread is where the coroutines run).

Multi-process + coroutines avoids the overhead of CPU context switching while still taking full advantage of multiple CPUs. For crawler-like workloads with huge data volumes and heavy file reading and writing, this approach gives an enormous efficiency boost.


Small example:


# -*- coding: utf-8 -*-
from gevent import monkey; monkey.patch_all()  # patch the standard library before importing requests

import gevent
import requests
from multiprocessing import Process


def fetch(url):
    try:
        s = requests.Session()
        r = s.get(url, timeout=1)  # fetch the page here
    except Exception as e:
        print(e)
    return ''


def process_start(url_list):
    tasks = []
    for url in url_list:
        tasks.append(gevent.spawn(fetch, url))
    gevent.joinall(tasks)  # run this batch with coroutines


def task_start(filepath, flag=100000):  # start one process per 100,000 urls
    with open(filepath, 'r') as reader:  # read urls from the given file
        url = reader.readline().strip()
        url_list = []  # this list holds the coroutine tasks
        i = 0  # counter: how many urls have been queued so far
        while url != '':
            i += 1
            url_list.append(url)  # append each url we read to the queue
            if i == flag:  # once we have enough urls, start a process to crawl them
                p = Process(target=process_start, args=(url_list,))
                p.start()
                url_list = []  # reset the url queue
                i = 0  # reset the counter
            url = reader.readline().strip()
        if url_list:  # urls left over after the loop exits
            # run all remaining urls in one last process
            p = Process(target=process_start, args=(url_list,))
            p.start()


if __name__ == '__main__':
    task_start('./testData.txt')  # read the given file

 

Observant readers will notice that the example above hides a problem: the number of processes grows with the number of URLs, and we did not use multiprocessing.Pool to cap the process count. The reason is that multiprocessing.Pool conflicts with gevent and cannot be used here; interested readers can look into gevent.pool, the coroutine pool, instead.
 

References:

https://cloud.tencent.com/developer/news/218164 "Why do some people say Python multithreading is useless?"

https://www.cnblogs.com/anpengapple/p/6014480.html "Is Python multithreading useful at all?"

https://blog.csdn.net/lambert310/article/details/50605748 "On Python's GIL, multithreading, and multiprocessing"

 

 

Origin blog.csdn.net/bandaoyu/article/details/90583629