python multithreading, detailed tutorial, thread synchronization, thread locking, ThreadPoolExecutor

When to use Python multithreading

Multithreading is mainly used to run a large number of I/O operations concurrently: while one thread waits on I/O, the others keep working, which improves throughput. Multiprocessing is mainly used for heavy computation, where separate processes can make full use of the CPU.
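To make the I/O-bound case concrete, here is a minimal sketch written for Python 3. It assumes time.sleep is a fair stand-in for a blocking network call; the names fake_io, run_sequential and run_threaded are made up for illustration.

```python
import threading
import time

def fake_io(seconds):
    # time.sleep stands in for a blocking I/O call such as a network request
    time.sleep(seconds)

def run_sequential(count, seconds):
    # Run the waits one after another and return the elapsed time
    start = time.perf_counter()
    for _ in range(count):
        fake_io(seconds)
    return time.perf_counter() - start

def run_threaded(count, seconds):
    # Run the same waits in parallel threads and return the elapsed time
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(seconds,)) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print("sequential: %.2fs" % run_sequential(5, 0.2))
    print("threaded:   %.2fs" % run_threaded(5, 0.2))
```

The threaded run should take roughly as long as a single wait, because the threads spend their time blocked rather than computing.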

This tutorial covers two ways to use threads:

  1. threading.Thread
  2. concurrent.futures.ThreadPoolExecutor

On Python 2.x, the second method requires installing the futures backport first (on Python 3, concurrent.futures is part of the standard library):

pip install futures

Basic use

A simple example of the first method (note: blank lines have been removed):

# coding=utf8

import requests
import threading
import concurrent
from concurrent.futures import ThreadPoolExecutor

def task():
    url = "https://www.baidu.com"
    res = requests.get(url)
    if res.status_code == 200:
        print("yes!")
    else:
        print("no!")

def main():
    t1 = threading.Thread(target=task) # The API closely mirrors multiprocessing.Process
    t2 = threading.Thread(target=task)

    t1.start()
    t2.start()
    t1.join()
    t2.join()

if __name__ == '__main__':
    main()
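threading.Thread discards the return value of its target, so a task that needs to hand data back has to put it somewhere shared. One common option, sketched here for Python 3, is a queue.Queue (the module is named Queue on Python 2). The worker and run_workers functions are hypothetical, and the squaring task is only a placeholder for real work.

```python
import queue
import threading

def worker(n, results):
    # Placeholder task: square the input and hand the result back through the queue
    results.put((n, n * n))

def run_workers(inputs):
    inputs = list(inputs)
    results = queue.Queue()  # thread-safe, so no explicit lock is needed
    threads = [threading.Thread(target=worker, args=(n, results)) for n in inputs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Every thread has finished, so the queue holds exactly one item per input
    return dict(results.get() for _ in inputs)

if __name__ == "__main__":
    print(run_workers([1, 2, 3]))
```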

Synchronizing multiple threads

Threads share the data of their process, including the main thread's global variables, so concurrent updates can leave that data in an inconsistent state. The solution is to lock the critical code so that only one thread at a time can execute it; the other threads wait until the lock is released.

Example: without a lock, the data can get out of sync

balance = 0

def data_operator(n):
    global balance # Use the module-level balance instead of creating a local variable
    balance += n
    balance -= n

def change_it(n):
    for item in range(0, 10 ** 6): # range works on Python 2 and 3 (xrange is Python 2 only)
        data_operator(n)

def thread_synchronization():
    t1 = threading.Thread(target=change_it, args=(5,))
    t2 = threading.Thread(target=change_it, args=(8,))

    t1.start()
    t2.start()
    t1.join()
    t2.join()
    global balance
    print(balance)
    # balance += n is not atomic: the interpreter loads balance into a temporary value, adds n, then stores the result back
    # A thread switch between those steps can lose an update, so the value printed here is not necessarily 0
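The "temporary variable" mentioned in the comments can be seen in the bytecode. This Python 3 sketch uses the standard dis module to list the instructions behind a single balance += n; add_to_balance and opnames are illustrative names. The exact instruction names differ between CPython versions, but the load of the global and the store back to it are separate steps, and a thread switch can fall between them.

```python
import dis

balance = 0

def add_to_balance(n):
    global balance
    balance += n

def opnames(func):
    # Names of the bytecode instructions the function compiles to
    return [ins.opname for ins in dis.get_instructions(func)]

if __name__ == "__main__":
    # Prints the full disassembly; look for a LOAD_GLOBAL followed later by a STORE_GLOBAL
    dis.dis(add_to_balance)
```

How often the race actually shows up depends on the interpreter version and its thread-switching rules, so a run that prints 0 does not prove the code is safe.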

Locking keeps the data in sync by ensuring that only one thread at a time operates on the critical data

balance = 0
main_thread_lock = threading.Lock() # A lock created in the main thread and shared by all threads
# Acquire the lock: main_thread_lock.acquire() blocks the calling thread until it gets the lock; other threads, including the main thread, keep running
# Release the lock: main_thread_lock.release() lets another waiting thread acquire it
# try/finally is usually used to make sure the lock is always released

def data_operator(n):
    global balance # Use the module-level balance instead of creating a local variable
    balance += n
    balance -= n

def change_it(n):
    for item in range(0, 10 ** 6):
        # Hold the lock while the shared data is modified and release it afterwards.
        # The with statement does both, and releases the lock even if an exception is raised.
        with main_thread_lock:
            data_operator(n)
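For comparison, here is roughly what the with statement does for you, written out with explicit acquire() and release() calls inside try/finally. It is a self-contained Python 3 sketch; balance_lock, the iterations parameter and the run helper are names chosen for this example.

```python
import threading

balance = 0
balance_lock = threading.Lock()

def change_it(n, iterations):
    global balance
    for _ in range(iterations):
        balance_lock.acquire()
        try:
            balance += n
            balance -= n
        finally:
            # Runs even if the code above raises, so the lock is never left held
            balance_lock.release()

def run(iterations=100000):
    # Start two threads that update balance under the lock, then report the final value
    threads = [threading.Thread(target=change_it, args=(n, iterations)) for n in (5, 8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balance

if __name__ == "__main__":
    print(run())
```

Because every update happens while the lock is held, each += and -= pair completes before another thread can touch balance. The with form is shorter and harder to get wrong, which is why it is usually preferred.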

Using ThreadPoolExecutor

On Python 2.x, ThreadPoolExecutor requires installing the futures backport first (pip install futures); on Python 3, concurrent.futures ships with the standard library.

Basic use:

import concurrent.futures
import requests
from concurrent.futures import ThreadPoolExecutor

def task2(n):
    url = "https://www.baidu.com"
    res = requests.get(url)
    print(n)
    if res.status_code == 200:
        return "yes!"
    else:
        return "no!"

def multi_thread_by_thread_pool_executor():
    # max_workers is the maximum number of worker threads
    with ThreadPoolExecutor(max_workers=5) as executor:
        task_list = [executor.submit(task2, x) for x in range(0, 10)]

        # as_completed takes a sequence of futures and returns an iterator that yields each Future as soon as its task finishes
        # item is a Future object
        # Call item.result() to get the function's return value
        for item in concurrent.futures.as_completed(task_list):
            print(item.result())
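as_completed yields futures in completion order, not submission order, so it is often useful to remember which input each future belongs to. The Python 3 sketch below keeps a dict from future to input and also shows that result() re-raises an exception thrown inside the task. The check function and its deliberate failure for n == 3 are invented for this example, so that it runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def check(n):
    # Invented task standing in for a network request; fails for one input on purpose
    if n == 3:
        raise ValueError("bad input: %d" % n)
    return n * 10

def run_checks(inputs):
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=5) as executor:
        # Map each Future back to the input it was submitted with
        future_to_input = {executor.submit(check, n): n for n in inputs}
        for future in as_completed(future_to_input):
            n = future_to_input[future]
            try:
                results[n] = future.result()  # re-raises any exception from check
            except ValueError as exc:
                errors[n] = str(exc)
    return results, errors

if __name__ == "__main__":
    print(run_checks(range(5)))
```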

Besides ThreadPoolExecutor for threads, concurrent.futures also provides ProcessPoolExecutor for processes. It is used in the same way; only the class name changes.
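A minimal Python 3 sketch of the process-based variant, assuming a small CPU-bound task. sum_of_squares and run_in_processes are illustrative names. The task function must be defined at module level so that worker processes can find it, and the pool should be started under an if __name__ == "__main__": guard.

```python
from concurrent.futures import ProcessPoolExecutor

def sum_of_squares(limit):
    # CPU-bound work: pure computation with no I/O
    return sum(i * i for i in range(limit))

def run_in_processes(limits):
    with ProcessPoolExecutor(max_workers=2) as executor:
        # map returns the results in the same order as the inputs
        return list(executor.map(sum_of_squares, limits))

if __name__ == "__main__":
    # The guard matters: on platforms that start workers by re-importing this
    # module, creating the pool at import time would fail
    print(run_in_processes([10, 100, 1000]))
```

Arguments and return values are pickled and sent between processes, so this pays off mainly when each task does enough computation to outweigh that overhead.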

Note also that because of the global interpreter lock (GIL), threads cannot fully use the CPU no matter how many are started, unless the interpreter is reworked to remove the GIL.
