27 Apr 18 — The GIL; multi-process vs. multi-thread usage scenarios; thread mutexes compared with the GIL; concurrent socket communication based on multi-threading; synchronous/asynchronous and blocking/non-blocking; process pools and thread pools

I. The Global Interpreter Lock (GIL)
What happens when you run test.py:
a. The code of the Python interpreter is read from disk into memory
b. The code of test.py is read from disk into memory (so two bodies of code live in the same process)
c. The code of test.py is fed, as a string, into the Python interpreter to be parsed and executed

 
1. GIL: Global Interpreter Lock (a feature of the CPython interpreter)
From the official definition: "In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)"
The memory management in question is the garbage collector, which the interpreter runs periodically. Without serialized access, a race is possible: during x = 10, the value 10 is first created in memory, and if the garbage collector ran before the name x was bound to that value, the value could be reclaimed as unreferenced.
In essence, the GIL is a mutex (an execution permit) attached to the interpreter: every thread in the same process must acquire the GIL before it may execute interpreter code.
 
2. Advantages and disadvantages of the GIL:
Advantage: guarantees thread safety for the CPython interpreter's memory management.
Disadvantage: within a single process, only one thread can execute interpreter code at a time, so multi-threading under CPython cannot achieve parallelism and cannot exploit multiple cores.
 
Notes:
a. The GIL rules out parallelism, but not concurrency; threads are not strictly serial. Serial execution would mean finishing one task completely before starting the next, whereas in CPython a thread that blocks on IO (or is preempted by the scheduler) has its hold on the GIL revoked, letting another thread run.
b. The advantage of multiple cores (multiple CPUs) is higher computational throughput
c. Computation-intensive workloads → use multiple processes to exploit multiple cores
d. IO-intensive workloads → use multi-threading
 
II. Verifying CPython's concurrency efficiency
1. Computation-intensive workloads should use multiple processes
from multiprocessing import Process
from threading import Thread
 
import time
# import os
# print(os.cpu_count())  # view the number of CPU cores
 
def task1():
    res=0
    for i in range(1,100000000):
        res+=i
 
def task2():
    res=0
    for i in range(1,100000000):
        res+=i
 
def task3():
    res=0
    for i in range(1,100000000):
        res+=i
 
def task4():
    res=0
    for i in range(1,100000000):
        res+=i
 
if __name__ == '__main__':
    # p1=Process(target=task1)
    # p2=Process(target=task2)
    # p3=Process(target=task3)
    # p4=Process(target=task4)
 
    p1=Thread(target=task1)
    p2=Thread(target=task2)
    p3=Thread(target=task3)
    p4=Thread(target=task4)
    start_time=time.time()
    p1.start()
    p2.start()
    p3.start()
    p4.start()
    p1.join()
    p2.join()
    p3.join()
    p4.join()
    stop_time=time.time()
    print(stop_time - start_time)
 
2. IO-intensive workloads should use multi-threading
from multiprocessing import Process
from threading import Thread
 
import time
 
def task1():
    time.sleep(3)
 
def task2():
    time.sleep(3)
 
def task3():
    time.sleep(3)
 
def task4():
    time.sleep(3)
 
if __name__ == '__main__':
    # p1=Process(target=task1)
    # p2=Process(target=task2)
    # p3=Process(target=task3)
    # p4=Process(target=task4)
 
    # p1=Thread(target=task1)
    # p2=Thread(target=task2)
    # p3=Thread(target=task3)
    # p4=Thread(target=task4)
    # start_time=time.time()
    # p1.start()
    # p2.start()
    # p3.start()
    # p4.start()
    # p1.join()
    # p2.join()
    # p3.join()
    # p4.join()
    # stop_time=time.time()
    # print(stop_time - start_time) #3.138049364089966
 
    p_l=[]
    start_time=time.time()
 
    for i in range(500):
        p=Thread(target=task1)
        p_l.append(p)
        p.start()
 
    for p in p_l:
        p.join()
 
    print(time.time() - start_time)
 
III. Thread mutexes compared with the GIL
The GIL protects interpreter-level data (the structures the garbage collector touches), but it does not protect the shared data of your own program. Data that your application shares between threads must therefore be protected with a lock of your own.
 
from threading import Thread,Lock
import time
 
mutex=Lock()
count=0
 
def task():
    global count
    mutex.acquire()
    temp=count
    time.sleep(0.1)
    count=temp+1
    mutex.release()
 
if __name__ == '__main__':
    t_l=[]
    for i in range(2):
        t=Thread(target=task)
        t_l.append(t)
        t.start()
    for t in t_l:
        t.join()
 
    print('main', count)  # with the lock, both increments land: main 2
 
IV. Concurrent socket communication based on multi-threading
Server:
from socket import *
from threading import Thread
from concurrent.futures import ProcessPoolExecutor,ThreadPoolExecutor
 
tpool=ThreadPoolExecutor(3)  # processes and threads cannot be unlimited; the pool caps how many run at once (the pool executors wrap the functionality of the Process and Thread modules)
 
def communicate(conn,client_addr):
    while True: # Communication loop
        try:
            data = conn.recv(1024)
            if not data: break
            conn.send(data.upper())
        except ConnectionResetError:
            break
    conn.close()
 
def server():
    server = socket(AF_INET, SOCK_STREAM)
    server.bind(('127.0.0.1',8080))
    server.listen(5)
 
    while True: # connection (accept) loop
        conn,client_addr=server.accept()
        print(client_addr)
        # t=Thread(target=communicate,args=(conn,client_addr))
        # t.start()
        tpool.submit(communicate,conn,client_addr)
 
    server.close()
 
if __name__ == '__main__':
    server()
 
Client:
from socket import *
 
client = socket(AF_INET, SOCK_STREAM)
client.connect(('127.0.0.1',8080))
 
while True:
    msg=input('>>>: ').strip()
    if not msg:continue
    client.send(msg.encode('utf-8'))
    data=client.recv(1024)
    print(data.decode('utf-8'))
 
client.close()
 
V. Process pools and thread pools
Why use a "pool": a pool limits the number of concurrent tasks, keeping the machine's concurrency within a range it can actually handle.
Put processes in the pool when the concurrent tasks are computation-intensive.
Put threads in the pool when the concurrent tasks are IO-intensive.
 
1. Process pool
from concurrent.futures import ProcessPoolExecutor,ThreadPoolExecutor
import time,os,random
 
def task(x):
    print('%s is handling a request' % os.getpid())
    time.sleep(random.randint(2, 5))
    return x ** 2
 
if __name__ == '__main__':
    p=ProcessPoolExecutor() # The number of processes opened by default is the number of cpu cores
 
    #alex, Wu Peiqi, Yang Li, Wu Chenyu, Zhang San
 
    for i in range(20):
        p.submit(task,i)
 
2. Thread pool
from concurrent.futures import ProcessPoolExecutor,ThreadPoolExecutor
import time,os,random
 
def task(x):
    print('%s is handling a request' % x)
    time.sleep(random.randint(2, 5))
    return x ** 2
 
if __name__ == '__main__':
    p=ThreadPoolExecutor(4) # The number of threads enabled by default is the number of cpu cores*5
 
    #alex, Wu Peiqi, Yang Li, Wu Chenyu, Zhang San
 
    for i in range(20):
        p.submit(task,i)
 
VI. Synchronous, asynchronous, blocking, non-blocking
1. Blocking and non-blocking refer to the two operating states of the program
Blocking: the program hits an IO operation, stops where it is, and immediately releases the CPU.
Non-blocking (the ready or running state): the program encounters no IO, or, by some means, does not stop in place even when it does; it carries on with other work and tries to hold the CPU as much as possible.
 
2. Synchronous and asynchronous refer to two ways of submitting tasks:
Synchronous call: after submitting a task, wait in place until the task completes and its return value is obtained; only then execute the next line of code.
Asynchronous call: after submitting a task, do not wait; execute the next line of code immediately and collect the results after all tasks have finished.
 
from concurrent.futures import ProcessPoolExecutor,ThreadPoolExecutor
import time,os,random
 
def task(x):
    print('%s is handling a request' % x)
    time.sleep(random.randint(1, 3))
    return x ** 2
 
if __name__ == '__main__':
    # async call
    p=ThreadPoolExecutor(4) # The number of threads enabled by default is the number of cpu cores*5
 
    #alex, Wu Peiqi, Yang Li, Wu Chenyu, Zhang San
 
    obj_l=[]
    for i in range(10):
        obj=p.submit(task,i)
        obj_l.append(obj)
 
    # p.close()
    # p.join()
    p.shutdown(wait=True)  # equivalent to p.close() (stop accepting new tasks into the pool) + p.join()
 
    print(obj_l[3].result())
    print('main')
 
    # Synchronous call
    p=ThreadPoolExecutor(4) # The number of threads enabled by default is the number of cpu cores*5
 
    #alex, Wu Peiqi, Yang Li, Wu Chenyu, Zhang San
 
    for i in range(10):
        res=p.submit(task,i).result()
 
    print('main')
