Python multithreading, multiprocessing, thread pools, and process pools

https://blog.csdn.net/somezz/article/details/80963760

Python multithreading

A thread, sometimes called a lightweight process, is the smallest unit of execution that the operating system can schedule. It is contained in a process and is the unit that actually runs within that process. A thread owns almost no system resources of its own, only the few that are essential for running, but it shares all the resources of its process with the other threads of that process. One thread can create and terminate another, and multiple threads in the same process can execute concurrently.
Each thread has its own set of CPU registers, called the thread's context, which reflects the state of the CPU registers when the thread last ran. A thread has three basic states: ready, running, and blocked.

  1. Ready: the thread has everything it needs to run and is logically runnable, waiting for a processor;
  2. Running: the thread holds a processor and is executing;
  3. Blocked: the thread is waiting for an event (such as a semaphore) and is logically unable to run.

When talking about multithreading in Python, one topic that cannot be avoided is the global interpreter lock (GIL). The GIL is essentially a mutex, and like every mutex it turns concurrent execution into serial execution, so that shared data can be modified by only one task at a time, which keeps the data safe.

The first thing to establish is whether the task a thread executes is mostly computation (CPU-intensive) or mostly input and output (I/O-intensive), because different scenarios call for different approaches. Python multithreading is suitable for I/O-intensive tasks. Such tasks spend little time on CPU computation and most of their time waiting on I/O, such as reading and writing files, web requests, and database requests. For CPU-intensive tasks, multiple processes should be used instead.
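The effect on I/O-intensive work can be sketched as follows. This is only an illustration, not a benchmark: fake_io, run_sequential, and run_threaded are hypothetical names, and time.sleep stands in for a real blocking I/O call on the assumption that, like real I/O, it lets other threads run while it waits.

```python
import threading
import time

def fake_io(delay):
    # Hypothetical stand-in for a blocking I/O call (file read, web request, ...).
    time.sleep(delay)

def run_sequential(n, delay):
    # Run n waits one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start

def run_threaded(n, delay):
    # Run n waits in n threads at once and return the elapsed time.
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

sequential = run_sequential(4, 0.2)
threaded = run_threaded(4, 0.2)
print("sequential: {:.2f}s, threaded: {:.2f}s".format(sequential, threaded))
```

The threaded run should take roughly as long as a single wait, because the waits overlap. A loop of pure computation in place of the sleep would not be expected to show the same gain.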

Using multiple threads with the threading module

The Python standard library comes with two threading modules, threading and thread (renamed _thread in Python 3). thread is the lower-level module and threading is a wrapper around it; in general, we can use threading directly.

Functions provided by the threading module:

  • threading.currentThread(): Returns the current Thread object. Newer Python versions prefer the spelling threading.current_thread().
  • threading.enumerate(): Returns a list of running threads. "Running" means started and not yet finished; threads that have not been started or have already terminated are not included.
  • threading.activeCount(): Returns the number of running threads, the same value as len(threading.enumerate()). Newer Python versions prefer the spelling threading.active_count().
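A minimal sketch of these three functions, using the snake_case spellings. The worker function, the thread names, and the use of threading.Event to keep the workers alive are assumptions made for the illustration, not part of the original article.

```python
import threading

def worker(stop_event):
    # Stay alive until the main thread signals that the demo is over.
    stop_event.wait()

stop_event = threading.Event()
before = threading.active_count()

workers = [
    threading.Thread(target=worker, args=(stop_event,), name="demo-{}".format(i))
    for i in range(3)
]
for t in workers:
    t.start()

# Take the measurements while the three workers are still running.
running = threading.active_count()
listed = threading.enumerate()
names = [t.name for t in listed]

print("current thread:", threading.current_thread().name)
print("threads started by this demo:", running - before)
print("names:", names)

# Let the workers finish and wait for them.
stop_event.set()
for t in workers:
    t.join()
```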

Besides these functions, the threading module provides the Thread class for working with threads. The Thread class provides the following methods:

  • run(): The method that represents the thread's activity.
  • start(): Starts the thread's activity.
  • join([timeout]): Waits until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates, either normally or through an unhandled exception, or until the optional timeout occurs.
  • isAlive(): Returns whether the thread is alive. In newer Python versions this is spelled is_alive(); the isAlive() spelling was removed in Python 3.9.
  • getName(): Returns the thread name.
  • setName(): Sets the thread name.
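The methods above can be sketched with a small Thread subclass. Greeter and its delay and message attributes are hypothetical names chosen for the illustration; the sketch uses is_alive() and the name attribute, which newer Python versions offer in place of isAlive(), getName(), and setName().

```python
import threading
import time

class Greeter(threading.Thread):
    # Hypothetical Thread subclass: run() holds the thread's activity.
    def __init__(self, delay):
        super().__init__(name="greeter")
        self.delay = delay
        self.message = None

    def run(self):
        time.sleep(self.delay)
        self.message = "Hello from {}".format(self.name)

t = Greeter(0.3)
alive_before_start = t.is_alive()    # not started yet

t.start()                            # start() arranges for run() to be called in a new thread
t.join(0.05)                         # give up waiting after 0.05 seconds
alive_after_timeout = t.is_alive()   # the thread is still sleeping

t.join()                             # wait without a timeout until the thread terminates
alive_after_join = t.is_alive()

print(alive_before_start, alive_after_timeout, alive_after_join)
print(t.message)
```

Note that join() with a timeout returns None whether or not the thread finished, so is_alive() is the way to tell whether the timeout expired.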

The following code creates five threads, each of which calls the say_hello function:

import threading
import time

def say_hello():
    time.sleep(1)
    print("Hello world!")

def main():
    threads = []
    for i in range(5):
        thread = threading.Thread(target=say_hello)
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    print('hello')

main()

Output

Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
hello

The join() function blocks the main thread until the joined thread finishes. Without join(), the main thread does not wait for the child threads, so it may finish its own work before all the child threads have executed, as in the following code:

import threading
import time

def say_hello():
    time.sleep(1)
    print("Hello world!")

def main():
    for i in range(5):
        thread = threading.Thread(target=say_hello)
        thread.start()     
    print('hello')

main()

Output

hello
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!

Comparison between Python multithreading and multiprocessing

Python multithreading is suitable for I/O-intensive tasks, which spend little time on CPU computation and most of their time on I/O, such as reading and writing files, web requests, and database requests. For CPU-intensive tasks, multiple processes should be used.

https://blog.csdn.net/somezz/article/details/80963760

Thread synchronization with Lock (mutex):

If multiple threads modify shared data at the same time, the results can be unpredictable, so the threads need to be synchronized with a mutex. In the code below, three threads each add to and then subtract from the shared variable num 1,000,000 times, and the final value of num may not be 0:

import time, threading

num = 0
lock = threading.Lock()
def task_thread(n):
    global num
    for i in range(1000000):
        num = num + n
        num = num - n

t1 = threading.Thread(target=task_thread, args=(6,))
t2 = threading.Thread(target=task_thread, args=(17,))
t3 = threading.Thread(target=task_thread, args=(11,))
t1.start(); t2.start(); t3.start()
t1.join(); t2.join(); t3.join()
print("expected value is 0, real value is {}".format(num))

Output

expected value is 0, real value is 23

The result is not zero because modifying num takes several statements. While one thread is executing num + n, another thread may be executing num - n, so by the time the first thread executes num - n, the value of num is no longer the one it previously wrote, and the final result is not zero.
To keep the data correct, a mutex is needed to synchronize the threads: while one thread is accessing the data, the others must wait until that thread releases the lock. The Lock and RLock objects of the threading module provide simple thread synchronization. Both have an acquire method and a release method; for data that only one thread at a time may operate on, put the operation between acquire and release, as follows:

import time, threading

num = 0
lock = threading.Lock()
def task_thread(n):
    global num
    # Acquire the lock to synchronize the threads
    lock.acquire()
    for i in range(1000000):
        num = num + n
        num = num - n
    # Release the lock so that the next thread can proceed
    lock.release()

t1 = threading.Thread(target=task_thread, args=(6,))
t2 = threading.Thread(target=task_thread, args=(17,))
t3 = threading.Thread(target=task_thread, args=(11,))
t1.start(); t2.start(); t3.start()
t1.join(); t2.join(); t3.join()
print("expected value is 0, real value is {}".format(num))

Output

expected value is 0, real value is 0

You can also acquire and release the lock with a with statement (with lock:):

import time, threading

num = 0
lock = threading.Lock()
def task_thread(n):
    global num
    with lock:
        for i in range(1000000):
            num = num + n
            num = num - n

t1 = threading.Thread(target=task_thread, args=(6,))
t2 = threading.Thread(target=task_thread, args=(17,))
t3 = threading.Thread(target=task_thread, args=(11,))
t1.start(); t2.start(); t3.start()
t1.join(); t2.join(); t3.join()
print("expected value is 0, real value is {}".format(num))
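The text above mentions RLock alongside Lock without showing it. As a rough sketch of the difference: an RLock may be acquired again by the thread that already holds it, which helps when one locked function calls another. The names balance, deposit, and deposit_twice are hypothetical and only serve the illustration.

```python
import threading

lock = threading.RLock()
balance = 0

def deposit(amount):
    global balance
    # Acquires the lock, possibly for the second time in the same thread.
    with lock:
        balance += amount

def deposit_twice(amount):
    # The outer acquire keeps both deposits together as one step.
    with lock:
        deposit(amount)   # re-entrant acquire: allowed with RLock
        deposit(amount)

threads = [threading.Thread(target=deposit_twice, args=(5,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("balance is {}".format(balance))
```

With a plain threading.Lock in place of the RLock, the inner with lock: would block forever, because a Lock cannot be acquired a second time by the thread that holds it.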

Thread Pool

Overview

https://www.jianshu.com/p/afd9b3deb027

Traditional multithreaded programs use a "create on demand, destroy immediately" strategy. Creating a thread is much faster than creating a process, but if the tasks submitted to threads are short and are submitted very frequently, the server ends up constantly creating and destroying threads.

The lifetime of a thread can be divided into three parts: the time to start the thread, the time the thread body runs, and the time to destroy the thread. In a multithreaded scenario, if threads cannot be reused, every task has to go through all three phases: start, run, and destroy. This inevitably increases the system's response time and reduces efficiency.

Using a thread pool:
The threads are created in advance and kept in the thread pool. After a thread finishes its current task it is not destroyed but is assigned the next task, so threads do not have to be created over and over. This saves the overhead of creating and destroying threads and can lead to better performance and system stability.
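The reuse described above can be observed with a small sketch. It uses ThreadPoolExecutor from concurrent.futures, which this article introduces in the next section; the task function and the choice of six tasks on two workers are assumptions for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

def task(i):
    # Report which thread ran this task.
    time.sleep(0.05)
    return threading.get_ident()

# Six tasks are handed to a pool that holds at most two threads.
with ThreadPoolExecutor(max_workers=2) as pool:
    idents = list(pool.map(task, range(6)))

distinct = set(idents)
print("tasks: {}, distinct worker threads: {}".format(len(idents), len(distinct)))
```

However many tasks are submitted, they are run by at most max_workers threads, so the creation and destruction cost is paid per worker rather than per task.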

Using the thread pool

http://c.biancheng.net/view/2627.html

The base class of the thread pool is Executor in the concurrent.futures module. Executor has two subclasses, ThreadPoolExecutor and ProcessPoolExecutor: ThreadPoolExecutor is used to create a thread pool, and ProcessPoolExecutor is used to create a process pool.

If you manage concurrency with a thread pool or process pool, you only need to submit the task function to the pool; the pool takes care of the rest.

Executor provides the following commonly used methods:

  • submit(fn, *args, **kwargs): Submits the function fn to the thread pool. *args are the positional arguments passed to fn, and **kwargs are the keyword arguments passed to fn.
  • map(func, *iterables, timeout=None, chunksize=1): Similar to the built-in function map(func, *iterables), but the calls to func are executed asynchronously by the pool's threads and may run concurrently.
  • shutdown(wait=True): Shuts down the thread pool.


When the program submits a task function to the thread pool with submit, the submit method returns a Future object. The Future class is mainly used to obtain the return value of the task function. Because the task runs asynchronously in another thread, the function amounts to a job that will be completed "in the future", which is why Python represents it with a Future.

Future provides the following methods:

  • cancel(): Attempts to cancel the task represented by the Future. If the task is currently running or has already finished, it cannot be cancelled and the method returns False; otherwise the task is cancelled and the method returns True.
  • cancelled(): Returns True if the task represented by the Future was successfully cancelled.
  • running(): Returns True if the task represented by the Future is currently executing and cannot be cancelled.
  • done(): Returns True if the task represented by the Future was cancelled or has finished executing.
  • result(timeout=None): Returns the value returned by the task represented by the Future. If the task has not completed yet, this blocks the current thread; the timeout parameter specifies the maximum number of seconds to block.
  • exception(timeout=None): Returns the exception raised by the task represented by the Future. If the task completed without raising, the method returns None.
  • add_done_callback(fn): Registers a callback function for the task represented by the Future. When the task is cancelled or finishes running, fn is called automatically with the Future as its only argument.
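The Future methods listed above can be tried out with a sketch like the following. The functions slow_square and fail and the list finished are hypothetical; a pool with a single worker is assumed so that the third task is still waiting in the queue when cancel() is called.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_square(x):
    time.sleep(0.2)
    return x * x

def fail():
    raise ValueError("boom")

finished = []   # filled in by the callback

with ThreadPoolExecutor(max_workers=1) as pool:
    f1 = pool.submit(slow_square, 3)   # picked up by the only worker
    f2 = pool.submit(fail)             # queued behind f1
    f3 = pool.submit(slow_square, 4)   # queued behind f2

    # The callback receives the Future itself once it is done.
    f1.add_done_callback(lambda fut: finished.append(fut.result()))

    cancelled = f3.cancel()            # still pending, so it can be cancelled
    value = f1.result(timeout=5)       # blocks until f1 has a result
    error = f2.exception(timeout=5)    # the exception object, not raised here

print(value, repr(error), cancelled, f3.cancelled(), f1.done())
print(finished)
```

Calling f2.result() instead of f2.exception() would re-raise the ValueError in the calling thread.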


When you are done with a thread pool, call its shutdown() method, which starts the pool's shutdown sequence. After shutdown() is called, the thread pool no longer accepts new tasks, but it finishes executing all the tasks that were submitted before. Once all of those tasks have completed, all the threads in the pool terminate.
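A small sketch of this shutdown behavior, assuming a hypothetical task function action and three short tasks: the tasks submitted before shutdown() still complete, while a submit() afterwards is rejected with a RuntimeError.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def action(value):
    time.sleep(0.1)
    return value

pool = ThreadPoolExecutor(max_workers=2)
futures = [pool.submit(action, v) for v in (1, 2, 3)]

# Blocks until the three tasks submitted above have finished.
pool.shutdown(wait=True)
results = [f.result() for f in futures]

# The pool no longer accepts new tasks after shutdown().
try:
    pool.submit(action, 4)
    rejected = False
except RuntimeError:
    rejected = True

print(results, rejected)
```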

To run tasks with a thread pool, follow these steps:

  1. Call the ThreadPoolExecutor constructor to create a thread pool.
  2. Define an ordinary function to serve as the task.
  3. Call the submit() method of the ThreadPoolExecutor object to submit the task.
  4. When there are no more tasks to submit, call the shutdown() method of the ThreadPoolExecutor object to close the thread pool.

from concurrent.futures import ThreadPoolExecutor
import time

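# Define a function to be used as the thread task
def action(max):
    time.sleep(2)
    return max
# Create a thread pool with 2 threads
pool = ThreadPoolExecutor(max_workers=2)
# Submit a task to the thread pool; 50 is passed as the argument to action()
future1 = pool.submit(action, 50)
# Submit another task to the thread pool; 100 is passed as the argument to action()
future2 = pool.submit(action, 100)
# Shut down the thread pool
pool.shutdown()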

In addition, because the thread pool implements the context management protocol, it can be managed with a with statement, which avoids having to close the thread pool manually, as shown in the following program.

from concurrent.futures import ThreadPoolExecutor
import time

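# Define a function to be used as the thread task
def action(max):
    time.sleep(2)
    return max
# Create a thread pool with 2 threads
with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit a task to the thread pool; 50 is passed as the argument to action()
    future1 = pool.submit(action, 50)
    # Submit another task to the thread pool; 100 is passed as the argument to action()
    future2 = pool.submit(action, 100)
    # Print the result returned by the task that future1 represents
    print(future1.result())
    # Print the result returned by the task that future2 represents
    print(future2.result())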

In addition, Executor provides a map(func, *iterables, timeout=None, chunksize=1) method, which is similar to the built-in function map(), except that the thread pool's map() method submits one task for each element of iterables and runs func on them concurrently. The tasks are executed by the pool's worker threads (at most max_workers at a time, not one new thread per element), and the results are collected in the order of the inputs.

For example, the following program uses the Executor's map() method to run the tasks and collect their return values:

from concurrent.futures import ThreadPoolExecutor
import time

# Define a function to be used as the thread task
def action(max):
    time.sleep(2)
    return max
# Create a thread pool with 2 threads
with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit 4 tasks to the thread pool; 50, 100, 150 and 200 are passed in turn as the argument to action()
    results = pool.map(action, (50,100,150,200))
    for result in results:
        # Print the result returned by each task
        print(result)
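Two details of map() that the program above does not exercise can be sketched as follows: several iterables may be passed, one per parameter of the function, and the results come back in input order even when a later task finishes first. The function delayed_sum and its delays are assumptions for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def delayed_sum(a, b):
    # A larger first argument means a longer wait, so the first task finishes last.
    time.sleep(0.05 * a)
    return a + b

with ThreadPoolExecutor(max_workers=3) as pool:
    # Calls delayed_sum(3, 10), delayed_sum(2, 20) and delayed_sum(1, 30) concurrently.
    results = list(pool.map(delayed_sum, (3, 2, 1), (10, 20, 30)))

print(results)   # [13, 22, 31]
```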

Process pool

https://www.jianshu.com/p/53c2e732d974

The concurrent.futures library provides a ProcessPoolExecutor class, which can be used to run computationally intensive functions in separate Python interpreters.

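from concurrent.futures import ProcessPoolExecutor
import time

def work(x):
    # Stand-in for a computationally intensive function
    time.sleep(2)
    return x

# Parallel execution; the __main__ guard is needed on platforms that start worker processes by spawning
if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        results = pool.map(work, (50,100,150,200))
        for result in results:
            # Print the result returned by each task
            print(result)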

As with the thread pool, you can also use pool.submit() to submit individual tasks manually.
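A minimal sketch of submit() with a process pool. The function count_primes is a hypothetical CPU-bound task chosen for the illustration, and max_workers=2 is an arbitrary choice. The worker function is defined at module level so that the worker processes can import it, and the pool is created under the __main__ guard.

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    # Deliberately simple CPU-bound work: count the primes below limit.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as pool:
        # Each submit() returns a Future, just as with the thread pool.
        futures = [pool.submit(count_primes, limit) for limit in (10, 100, 1000)]
        for future in futures:
            print(future.result())
```

The arguments and return values travel between processes, so they need to be picklable; plain integers like these are.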

Python multithreading and multiprocessing applications

Downloading pictures with Python multithreading

https://www.52pojie.cn/thread-912305-1-1.html?tdsourcetag=s_pctim_aiomsg

Pictures with Python multiprocessing

https://blog.csdn.net/Alvin_FZW/article/details/82886004

Origin blog.csdn.net/pursuit_zhangyu/article/details/97925622