Python Advanced Tutorial Series: Python Processes and Threads

Learning objectives

1. Understand the concept of multitasking

2. Understand the concept of a process and the role of multiprocessing

3. Master how multiple processes carry out multiple tasks, with worked examples

4. Master how to obtain process IDs and the points to watch when using processes

5. Understand the concept of a thread and the role of multithreading

6. Master how multiple threads carry out multiple tasks, with worked examples

1. The concept of multitasking

1. An example

Thinking: When we download files from a network drive, why do we download several files at the same time? Answer: Running multiple tasks at the same time can greatly improve the efficiency of the program.

2. A question

Question: Using the techniques we have learned so far, can we multitask? Answer: No. The programs written so far are all single-task: one function or method must finish before the next one can start. To run several tasks at the same time, we need multitasking. Its biggest advantage is that it makes full use of CPU resources and improves the efficiency of the program.

3. What is multitasking

Multitasking means performing multiple tasks at the same time. For example, the operating systems installed on today's computers are all multitasking operating systems, which can run several programs at once.

4. Two forms of multitasking

① Concurrency ② Parallelism

5. Concurrency

Concurrency: multiple tasks are executed alternately over a period of time. For example, when a single-core CPU handles several tasks, the operating system lets each task run in turn: software 1 runs for 0.01 seconds, then software 2 runs for 0.01 seconds, then software 3 runs for 0.01 seconds, and so on, over and over. The programs are actually taking turns, but the CPU switches so quickly that they appear to run at the same time. Note: a single-core CPU executes multiple tasks concurrently.

6. Parallelism

Parallelism: multiple tasks are executed together, truly simultaneously. When a multi-core CPU handles several tasks, the operating system assigns a task to each core, and the cores execute their tasks at the same moment. Note: a multi-core CPU executes tasks in parallel, so there are always several tasks running together.
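Whether tasks can truly run in parallel depends on how many CPU cores are available. As a small illustration (not part of the original tutorial), the standard-library call os.cpu_count() reports the number of logical CPUs visible to Python; the variable name cores is only a placeholder.

```python
import os

# Number of logical CPUs reported by the operating system.
# os.cpu_count() may return None if the count cannot be determined.
cores = os.cpu_count()

if cores is not None and cores > 1:
    print(f'{cores} cores: tasks can run in parallel')
else:
    print('Single core (or unknown): tasks can only run concurrently')
```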

2. The concept of process

1. How to implement multitasking in a program

In Python, multitasking can be achieved with multiple processes.

2. The concept of process

A process is the smallest unit of resource allocation: it is the basic unit the operating system uses to allocate resources and schedule work. Informally, a running program is a process. For example, a running instance of QQ or WeChat is a process.

Note: A running program has at least one process.

3. The role of multi-process

☆ Without multiple processes

Thinking: The picture shows a very simple program, hello.py. When it runs, the code executes in order, so func_b can only run after func_a has finished. If func_a and func_b could run at the same time, hello.py would clearly finish much sooner.

☆ With multiple processes

3. Using multiple processes to complete multiple tasks

1. Steps for using multiple processes

 
 
 
 

① Import the process package
import multiprocessing

② Create a process object from the Process class
process_object = multiprocessing.Process(target=task_name)

③ Start the process to run the task
process_object.start()

2. Creating a process object from the Process class

 
 
 
 

process_object = multiprocessing.Process([group [, target=task_name [, name]]])

Parameter description:

  • target: The task to run, given as a function (or method) name

  • name: The process name; usually does not need to be set

  • group: The process group; currently it can only be None
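To make the parameters concrete, here is a minimal sketch (the name 'music-worker' is an arbitrary example, not something the original uses). It creates a process object without starting it and inspects a few attributes; nothing runs until start() is called.

```python
import multiprocessing


def music():
    print('Listening to music...')


# Creating the object does not start a process yet.
p = multiprocessing.Process(target=music, name='music-worker')

print(p.name)        # the name passed in above
print(p.pid)         # None until start() is called
print(p.is_alive())  # False until start() is called
```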

3. Code for creating and starting processes

Writing code while listening to music:

import multiprocessing
import time


def music():
    for i in range(3):
        print('Listening to music...')
        time.sleep(0.2)


def coding():
    for i in range(3):
        print('Writing code...')
        time.sleep(0.2)


if __name__ == '__main__':
    music_process = multiprocessing.Process(target=music)
    coding_process = multiprocessing.Process(target=coding)

    music_process.start()
    coding_process.start()

Output (the exact interleaving may vary between runs):

Listening to music...
Writing code...
Listening to music...
Writing code...
Listening to music...
Writing code...

The code above uses Python's multiprocessing module to run two tasks at the same time. It defines two functions, music() and coding(), which represent listening to music and writing code. Each task prints its message 3 times, pausing for 0.2 seconds after each print.

The code creates two process objects, music_process and coding_process, with multiprocessing.Process(), and passes each one a target argument: the task function music or coding.

The two processes are then started with start() and run independently of each other and of the main process. The line if __name__ == '__main__': marks the entry point of the program; it ensures that the process-creation code runs only when the file is executed directly, not when a child process imports it. When you run the code, the two processes print "Listening to music..." and "Writing code..." side by side.

4. Running a task with arguments in a process

 
 
 
 

Process([group [, target [, name [, args [, kwargs]]]]])

Parameter description: Process is the class used to create a process object; the new process itself starts only when start() is called. Its parameters are:

  • group: Specifies the process group. Usually not needed; the default is None.

  • target: Specifies the function to run when the process starts. It must be a callable object.

  • name: Specifies the process name. Usually not needed; the default is None.

  • args: A tuple of positional arguments passed to the target function, for example args=(1, 2, 'anne')

  • kwargs: A dictionary of keyword arguments passed to the target function, for example kwargs={'name': 'anne', 'age': 18}

Case: using the args and kwargs parameters

import multiprocessing
import time


def music(num):
    for i in range(num):
        print('Listening to music...')
        time.sleep(0.2)


def coding(count):
    for i in range(count):
        print('Writing code...')
        time.sleep(0.2)


if __name__ == '__main__':
    # args passes positional arguments as a tuple
    music_process = multiprocessing.Process(target=music, args=(3,))
    # kwargs passes keyword arguments as a dictionary
    coding_process = multiprocessing.Process(target=coding, kwargs={'count': 3})

    music_process.start()
    coding_process.start()

    # Wait for both child processes to finish
    music_process.join()
    coding_process.join()

The join() calls make the main process wait at that point until both child processes have finished. Child processes run asynchronously: after start() returns, the main process carries on with its own code while the children run. If the main process has follow-up work that depends on the children (for example, reading results they produce), that work could run too early without join().

Therefore, call join() after start() whenever the main process must not continue until a child is done. join() blocks the calling process until that child process has finished. (Even without join(), Python waits for non-daemon child processes before the main process finally exits, as section 5 shows.)
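The effect of join() can be checked from the process object itself. The following is a minimal sketch, assuming a hypothetical helper named run_and_wait(): after join() returns, the child is no longer alive and its exit code is available.

```python
import multiprocessing
import time


def work():
    time.sleep(0.2)


def run_and_wait():
    """Start a child, wait for it, and report its final state."""
    p = multiprocessing.Process(target=work)
    p.start()
    p.join()  # block here until the child has finished
    return p.is_alive(), p.exitcode


if __name__ == '__main__':
    print(run_and_wait())  # (False, 0) when the child exits normally
```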

Case: Passing multiple arguments

import multiprocessing
import time


def music(num, name):
    for i in range(num):
        print(name)
        print('Listening to music...')
        time.sleep(0.2)


def coding(count):
    for i in range(count):
        print('Writing code...')
        time.sleep(0.2)


if __name__ == '__main__':
    music_process = multiprocessing.Process(target=music, args=(3, 'Multitasking started'))
    coding_process = multiprocessing.Process(target=coding, kwargs={'count': 3})

    music_process.start()
    coding_process.start()

Output (the exact interleaving may vary between runs):

Multitasking started
Listening to music...
Writing code...
Multitasking started
Listening to music...
Writing code...
Multitasking started
Listening to music...
Writing code...

This code is an example of multiple processes running different tasks concurrently. It works as follows:

The multiprocessing and time modules are imported.

Two functions are defined: music() and coding(). music() loops num times, printing the string name and then "Listening to music..." and sleeping for 0.2 seconds on each pass, which simulates listening to music. coding() loops count times, printing "Writing code..." and sleeping for 0.2 seconds on each pass, which simulates writing code.

In the main entry point, two child processes are created: music_process and coding_process. music_process calls music() with the arguments num=3 and name='Multitasking started', and coding_process calls coding() with the argument count=3.

The music_process and coding_process processes are started.

The processes run concurrently: music_process prints name and "Listening to music..." num times, and coding_process prints "Writing code..." count times, each sleeping for 0.2 seconds after every print.

The program ends when both processes have finished.

4. Getting the process ID

1. The role of the process ID

As the number of processes in a program grows, they cannot be managed effectively unless the main process can be told apart from the child processes, and the child processes from one another. For this reason every process has its own ID (PID).

2. Two kinds of process ID

① Get the ID of the current process

os.getpid()

② Get the ID of the current process's parent (ppid = parent pid)

os.getppid()

3. Getting the current process ID

import os


def work():
    # Get the ID of the current process
    print('work process ID', os.getpid())
    # Get the ID of the parent process
    print('work parent process ID', os.getppid())


work()
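As a complementary sketch (an addition, not from the original), the multiprocessing module can also describe the current process as an object. multiprocessing.current_process() returns that object; its pid attribute matches os.getpid().

```python
import multiprocessing
import os

# Object describing the process this code is running in.
current = multiprocessing.current_process()

print('name:', current.name)
print('pid:', current.pid)
print('same as os.getpid():', current.pid == os.getpid())
```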

Case: Getting the child process IDs

import multiprocessing
import time
import os


def music(num):
    print('music>> %d' % os.getpid())
    for i in range(num):
        print('Listening to music...')
        time.sleep(0.2)


def coding(count):
    print('coding>> %d' % os.getpid())
    for i in range(count):
        print('Writing code...')
        time.sleep(0.2)


if __name__ == '__main__':
    music_process = multiprocessing.Process(target=music, args=(3, ))
    coding_process = multiprocessing.Process(target=coding, kwargs={'count': 3})

    music_process.start()
    coding_process.start()

Output (process IDs will differ on your machine):

music>> 12232
Listening to music...
coding>> 1868
Writing code...
Listening to music...
Writing code...
Listening to music...
Writing code...

This is an example of creating multiple processes with Python's multiprocessing module.

First, two functions are defined, music and coding. Each prints its own process ID and then uses a for loop with time.sleep to simulate a time-consuming operation. Next, the main program uses multiprocessing.Process to create two processes, music_process and coding_process, using the target parameter to specify the function to run and the args or kwargs parameter to pass that function's arguments. Finally, the start method is called on both processes so that the two tasks run at the same time in separate processes.

Note that if __name__ == '__main__': prevents problems caused by child processes recursively creating more child processes. With the spawn start method (the default on Windows, and on macOS since Python 3.8), each child process starts a fresh interpreter and imports the main module again, so any unguarded process-creation code would run again in every child. With the fork start method, traditionally the default on Linux, the child is a copy of the parent and does not re-import the module, but keeping the guard makes the code portable.
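To see which behaviour applies on a given machine, the start method can be queried. This is a small sketch using multiprocessing.get_start_method(); the variable name method is arbitrary.

```python
import multiprocessing

# One of 'fork', 'spawn' or 'forkserver', depending on the platform
# and Python version.
method = multiprocessing.get_start_method()
print('start method:', method)
```

With 'spawn' or 'forkserver', child processes import the main module, which is when the if __name__ == '__main__': guard matters most.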

Case: Getting the parent and child process IDs

import multiprocessing
import time
import os


def music(num):
    print('music>> %d' % os.getpid())
    print('music parent process>> %d' % os.getppid())
    for i in range(num):
        print('Listening to music...')
        time.sleep(0.2)


def coding(count):
    print('coding>> %d' % os.getpid())
    print('coding parent process>> %d' % os.getppid())
    for i in range(count):
        print('Writing code...')
        time.sleep(0.2)


if __name__ == '__main__':
    print('Main process>> %d' % os.getpid())
    music_process = multiprocessing.Process(target=music, args=(3, ))
    coding_process = multiprocessing.Process(target=coding, kwargs={'count': 3})

    music_process.start()
    coding_process.start()

Output (process IDs will differ on your machine):

Main process>> 15080
music>> 10320
music parent process>> 15080
Listening to music...
coding>> 19220
coding parent process>> 15080
Writing code...
Listening to music...
Writing code...
Listening to music...
Writing code...

This code uses Python's multiprocessing module to create two processes that simulate listening to music and writing code at the same time.

First, two functions, music() and coding(), are defined to carry out the two tasks. Inside each function, os.getpid() and os.getppid() obtain the IDs of the current process and of its parent process.

The multiprocessing.Process class is used to create two processes: music_process runs the music() function, and coding_process runs the coding() function.

Finally, the main process calls .start() on both child processes. The output shows that both children report the main process's ID as their parent ID, and that music() and coding() run side by side in the two child processes.

5. Points to note when using processes

1. Global variables are not shared between processes

Creating a child process effectively gives it its own copy of the main process's resources. The main process and the child process are then independent of each other.

Case:

import multiprocessing
import time

my_list = []


def write_data():
    for i in range(3):
        my_list.append(i)
        print('add:', i)
    print(my_list)


def read_data():
    print('read_data', my_list)


if __name__ == '__main__':
    # Create a process for writing data
    write_process = multiprocessing.Process(target=write_data)
    # Create a process for reading data
    read_process = multiprocessing.Process(target=read_data)

    # Start the processes; the delay lets the write finish before the read starts
    write_process.start()
    time.sleep(1)
    read_process.start()

Principle analysis:

There are three processes here: the main process, write_process and read_process. Each one works on the global variable my_list inside its own process, which does not affect the my_list in the other processes. So read_data prints an empty list even though write_data has already appended three items. The global variables have the same name in every process, but they are not the same object.

Summary: Creating a child process copies the resources of the main process, so the child is a copy of the main process, like a twin. Global variables are not shared between processes because each process operates on its own copy; only the variable names are the same.
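The same point can be checked from the parent's side. In this minimal sketch (the helper name parent_view_after_child is hypothetical), the child appends to its own copy of my_list, and the parent's list is still empty after the child has finished.

```python
import multiprocessing

my_list = []


def write_data():
    # Runs in the child process and changes only the child's copy.
    my_list.append(1)
    print('child sees:', my_list)


def parent_view_after_child():
    p = multiprocessing.Process(target=write_data)
    p.start()
    p.join()
    # The parent's my_list was never touched by the child.
    return list(my_list)


if __name__ == '__main__':
    print('parent sees:', parent_view_after_child())  # parent sees: []
```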

2. The order in which the main process and child processes end

Code demo:

import multiprocessing
import time


# Work function
def work():
    for i in range(10):
        print('Working...')
        time.sleep(0.2)


if __name__ == '__main__':
    # Create the child process
    work_process = multiprocessing.Process(target=work)
    # Start the child process
    work_process.start()

    # Delay 1s
    time.sleep(1)
    print('The main process is finished')

Result analysis:

This code uses Python's multiprocessing module to create and start a child process, and prints a message in the main process after a 1-second delay. Step by step:

  1. Import the multiprocessing and time modules.

  2. Define a function called work, which uses a for loop and time.sleep() to simulate time-consuming work.

  3. Inside the if __name__ == '__main__': block, create a child process named work_process whose target function is work.

  4. Start the child process by calling work_process.start().

  5. Use time.sleep(1) to delay the main process for 1 second.

  6. Print a message indicating that the main process has finished its own code.

Because the child process and the main process run at the same time, the child is still working through its loop when the main process's 1-second delay ends. "The main process is finished" is therefore printed partway through the child's output, and the remaining "Working..." lines appear after it. In other words, the main process does not exit until all of its (non-daemon) child processes have finished.

☆ Solution 1: Make the child a daemon process

import multiprocessing
import time


# Work function
def work():
    for i in range(10):
        print('Working...')
        time.sleep(0.2)


if __name__ == '__main__':
    # Create the child process
    work_process = multiprocessing.Process(target=work)
    # Mark the child as a daemon: when the main process exits, the child
    # is destroyed immediately and its remaining code is not executed.
    work_process.daemon = True
    # Start the child process
    work_process.start()

    # Delay 1s
    time.sleep(1)
    print('The main process is finished')

Output (the number of "Working..." lines may vary slightly with timing):

Working...
Working...
Working...
Working...
Working...
The main process is finished

This code uses Python's multiprocessing module to create a child process that runs a work function.

First, the work function work() is defined. It loops 10 times, printing "Working..." and sleeping for 0.2 seconds on each pass.

In the main program, a child process work_process is created with the work function as its target, so the child process will run work().

The daemon attribute of work_process is then set to True. This makes the child a daemon process: when the main process exits, the child exits too, and the rest of its code is not executed. The attribute must be set before start() is called.

Next, work_process is started and begins running work(). After a 1-second delay, the main process prints "The main process is finished" and exits.

Because the child is a daemon process, it is destroyed when the main process exits, and the remaining iterations of work() never run.

☆ Solution 2: Terminate the child process

import multiprocessing
import time


# Work function
def work():
    for i in range(10):
        print('Working...')
        time.sleep(0.2)


if __name__ == '__main__':
    # Create the child process
    work_process = multiprocessing.Process(target=work)
    # Start the child process
    work_process.start()

    # Delay 1s
    time.sleep(1)

    # Terminate the child process directly. Before the main process exits,
    # terminate every child process in this way.
    work_process.terminate()

    print('The main process is finished')

Tip: Both of the methods above ensure that the child process is destroyed when the main process exits.

Output (the number of "Working..." lines may vary slightly with timing):

Working...
Working...
Working...
Working...
Working...
The main process is finished

This code uses Python's multiprocessing module to create a child process that runs the work function work(), and controls when the child runs and stops from the main process. Step by step:

  1. Import the multiprocessing and time modules.

  2. Define the work function work(), which loops 10 times, printing "Working..." and sleeping for 0.2 seconds on each pass.

  3. Inside the if __name__ == '__main__': block, create a child process work_process whose target function is work().

  4. Start work_process, which begins running the loop in work().

  5. The main process waits for 1 second, letting the child process run for a while.

  6. Call work_process.terminate() to forcibly stop the child process.

  7. Print the message "The main process is finished".

Overall, this code shows how to use the multiprocessing module to create a child process and to control when it runs and when it is terminated.
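After terminate(), it is common to call join() as well, so that the main process waits until the child has actually exited. The following is a sketch under that assumption; the helper name stop_child is hypothetical.

```python
import multiprocessing
import time


def work():
    for i in range(10):
        time.sleep(0.2)


def stop_child():
    p = multiprocessing.Process(target=work)
    p.start()
    time.sleep(0.3)  # let the child run for a moment
    p.terminate()    # ask the operating system to stop the child
    p.join()         # wait until it has actually exited
    return p.is_alive()


if __name__ == '__main__':
    print('still alive:', stop_child())  # still alive: False
```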

6. The concept of thread

1. The concept of thread

In Python, multitasking can also be achieved with multiple threads.

2. Why use multithreading?

A process is the smallest unit of resource allocation. Every process that is created is allocated its own resources, so using a new process for each task is like opening two copies of QQ to chat with two people: it works, but it wastes resources.

A thread is the smallest unit of program execution. A process is only responsible for allocating resources; it is the threads that use those resources to execute the program. In other words, a process is a container for threads, and every process has at least one thread that executes the program. A thread owns almost no system resources of its own, only the few that are essential for running, but it shares all the resources of its process with the other threads in the same process. This is like opening two chat windows (two threads) in one copy of QQ (one process) to chat with two people: multitasking is achieved and resources are saved.
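The statement that every process has at least one thread can be observed directly. In this minimal sketch, threading.main_thread() returns the thread that started the interpreter, and threading.active_count() reports how many threads are currently alive.

```python
import threading

# The thread in which the Python interpreter was started.
main = threading.main_thread()

print(main.name)
print(threading.active_count())  # at least 1: the main thread itself
```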

3. The role of multithreading

 
 
 
 

def func_a():
    print('Task A')


def func_b():
    print('Task B')


func_a()
func_b()

☆ Single-threaded execution

☆ Multi-thread execution

7. Using multiple threads to complete multiple tasks

1. Steps for using multiple threads

 
 
 
 

① Import the thread module
import threading

② Create a thread object from the Thread class
thread_object = threading.Thread(target=task_name)

③ Start the thread to run the task
thread_object.start()

Parameter description:

  • target: The task to run, given as a function (or method) name

  • name: The thread name; usually does not need to be set

  • group: The thread group; currently it can only be None
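As a small illustration of these parameters (the name 'music-thread' is an arbitrary example), the sketch below creates a thread object, inspects it before it starts, then starts it and waits for it to finish.

```python
import threading


def music():
    print('Listening to music...')


# Creating the object does not start the thread yet.
t = threading.Thread(target=music, name='music-thread')

print(t.name)        # music-thread
print(t.is_alive())  # False: not started yet

t.start()
t.join()             # wait for the thread to finish
print(t.is_alive())  # False: the task has completed
```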

2. Code for creating and starting threads

Single-threaded case:

import time


def music():
    for i in range(3):
        print('Listening to music...')
        time.sleep(0.2)


def coding():
    for i in range(3):
        print('Writing code...')
        time.sleep(0.2)


if __name__ == '__main__':
    # coding() starts only after music() has finished
    music()
    coding()

Multi-threaded case:

import time
import threading


def music():
    for i in range(3):
        print('Listening to music...')
        time.sleep(0.2)


def coding():
    for i in range(3):
        print('Writing code...')
        time.sleep(0.2)


if __name__ == '__main__':
    music_thread = threading.Thread(target=music)
    coding_thread = threading.Thread(target=coding)

    music_thread.start()
    coding_thread.start()

3. Running a task with arguments in a thread

Parameter description:

  • args: Passes arguments to the task as a tuple

  • kwargs: Passes arguments to the task as a dictionary

import time
import threading


def music(num):
    for i in range(num):
        print('Listening to music...')
        time.sleep(0.2)


def coding(count):
    for i in range(count):
        print('Writing code...')
        time.sleep(0.2)


if __name__ == '__main__':
    music_thread = threading.Thread(target=music, args=(3, ))
    coding_thread = threading.Thread(target=coding, kwargs={'count': 3})

    music_thread.start()
    coding_thread.start()

In the main thread, start two threads through music_thread.start() and coding_thread.start() so that they can run at the same time. Since the threads are executed concurrently, the music() and coding() functions will be executed alternately, and corresponding information will be output.

4. The order in which the main thread and child threads end

import time
import threading


def work():
    for i in range(10):
        print('work...')
        time.sleep(0.2)


if __name__ == '__main__':
    # Create the child thread
    work_thread = threading.Thread(target=work)
    # Start the thread
    work_thread.start()

    # Delay 1s
    time.sleep(1)
    print('The main thread has finished')

Output (the position of the main thread's message may vary slightly with timing):

work...
work...
work...
work...
work...
The main thread has finished
work...
work...
work...
work...
work...

This code uses Python's threading module to create a thread, work_thread, that runs the work() function.

work() loops 10 times, printing 'work...' and pausing for 0.2 seconds on each pass.

The main thread starts the thread with work_thread.start().

The main thread then waits for 1 second and prints 'The main thread has finished'. Since work_thread needs about 2 seconds to complete, the main thread runs out of code before work_thread does, yet the program keeps running until work_thread is done.

Using threads lets a program switch between several tasks, which can improve its efficiency and responsiveness. By default, the program does not exit until all of its (non-daemon) child threads have finished.
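If the main thread should continue only after a child thread has completed, Thread.join() can be used, in the same way as for processes. A minimal sketch, assuming a hypothetical helper run() that records events in a list instead of printing them:

```python
import threading
import time


def run():
    log = []

    def work():
        for i in range(3):
            log.append('work')
            time.sleep(0.05)

    t = threading.Thread(target=work)
    t.start()
    t.join()  # block until the child thread has finished
    log.append('main done')
    return log


print(run())  # ['work', 'work', 'work', 'main done']
```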

☆ Setting a daemon thread: method 1

import time
import threading


def work():
    for i in range(10):
        print('work...')
        time.sleep(0.2)


if __name__ == '__main__':
    # Create the child thread and mark it as a daemon thread
    work_thread = threading.Thread(target=work, daemon=True)
    # Start the thread
    work_thread.start()

    # Delay 1s
    time.sleep(1)
    print('The main thread has finished')

Output (the number of 'work...' lines may vary slightly with timing):

work...
work...
work...
work...
work...
The main thread has finished

This code uses Python's time and threading modules.

First, a function work() is defined as the task for the child thread. It loops 10 times, printing "work..." each time and calling time.sleep(0.2) to simulate the time the task takes.

Next, the main program creates a child thread, work_thread, with work() as its target function and daemon=True to make it a daemon thread. The child thread is then started with work_thread.start().

The main thread then calls time.sleep(1) to wait for 1 second while the child thread runs, and finally prints "The main thread has finished".

Because the child thread is a daemon thread, it is ended when the main thread finishes, so only about half of the "work..." lines appear. Without the daemon setting, the child thread would keep running until its task is complete.

☆ Setting a daemon thread: method 2

import time
import threading


def work():
    for i in range(10):
        print('work...')
        time.sleep(0.2)


if __name__ == '__main__':
    # Create the child thread
    work_thread = threading.Thread(target=work)
    # Mark it as a daemon thread. The older spelling,
    # work_thread.setDaemon(True), is deprecated since Python 3.10.
    work_thread.daemon = True
    # Start the thread
    work_thread.start()

    # Delay 1s
    time.sleep(1)
    print('The main thread has finished')

Output (the number of 'work...' lines may vary slightly with timing):

work...
work...
work...
work...
work...
The main thread has finished

This code uses Python's threading module to create a child thread, mark it as a daemon thread, and then start it. The child thread prints 'work...' and sleeps for 0.2 seconds, repeating this 10 times.

In the main thread, the code prints 'The main thread has finished' after a 1-second delay. Because the child thread is a daemon thread, it ends when the main thread finishes.

If the child thread is not marked as a daemon thread, it keeps running after the main thread has finished, until all of its work is complete.

5. Execution order between threads

for i in range(5):
    sub_thread = threading.Thread(target=task)
    sub_thread.start()

Thinking: When we create several threads in a process, how are they executed? In order? All together? Or in some other way?

Answer: Threads execute in no particular order. Let's verify this.

☆ Get current thread information

 
 
 
 

# Get the thread object through the current_thread method
current_thread = threading.current_thread()

# Through the current_thread object, you can know the relevant information of the thread, such as the order of creation
print(current_thread)

☆ Execution order between threads

 
 
 
 

import threading
import time


def get_info():
    # Try removing this sleep first and compare the output
    time.sleep(0.5)
    current_thread = threading.current_thread()
    print(current_thread)


if __name__ == '__main__':
    # Create 10 child threads
    for i in range(10):
        sub_thread = threading.Thread(target=get_info)
        sub_thread.start()

Summary: Threads execute in no particular order; which thread runs first is decided by CPU scheduling.
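One way to observe this is to give each thread a name and record the order in which the threads finish. The sketch below is an illustration only: the 'worker-N' names and the random delay are assumptions added to make the effect visible, and the recorded order will usually differ from run to run.

```python
import random
import threading
import time

finished = []  # thread names, in the order the threads finish


def task():
    # A small random delay stands in for work of varying length.
    time.sleep(random.uniform(0, 0.1))
    finished.append(threading.current_thread().name)


threads = [threading.Thread(target=task, name=f'worker-{i}') for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(finished)  # all five names, typically not in creation order
```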

6. Threads share global variables

☆ Threads share global variables

All the threads of a program live in the same process and use the resources of that process, so threads share global variables.

Sample code:

import threading
import time


my_list = []


def write_data():
    for i in range(3):
        print('add:', i)
        my_list.append(i)
    print(my_list)


def read_data():
    print('read:', my_list)


if __name__ == '__main__':
    write_thread = threading.Thread(target=write_data)
    read_thread = threading.Thread(target=read_data)

    write_thread.start()
    time.sleep(1)
    read_thread.start()

Output:

add: 0
add: 1
add: 2
[0, 1, 2]
read: [0, 1, 2]

This is an example of multithreaded programming with Python's threading module.

The code defines two functions, write_data() and read_data(), which append data to the initially empty list my_list and read the data in the list, respectively.

It also creates two threads, write_thread and read_thread, to carry out the writing and reading tasks. In the main program, write_thread and read_thread are given write_data() and read_data() as their targets; write_thread is started with start(), and read_thread is started 1 second later.

When the program runs, write_thread appends data to my_list, and read_thread then reads my_list and prints it. Both threads work on the same list object, which is why read_thread sees the data written by write_thread. The 1-second delay in the main thread ensures that the write has finished before the read starts; without it, or some other form of synchronization, the order of the write and the read would be indeterminate and a race condition could occur.
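Because threads share global variables, several threads updating the same variable at once can interfere with each other. One common way to prevent this is a threading.Lock. The following is a minimal sketch, not part of the original tutorial; the counter variable and the add_many function are hypothetical names used for illustration.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may run this block.
        with lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```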

7. Summary: Process and thread comparison

☆ Relationship comparison

① Threads are attached to processes, and there are no threads without processes.

② A process provides one thread by default, and a process can create multiple threads.

☆ Differences

① Global variables are not shared between processes

② Global variables are shared between threads

③ Creating a process costs more resources than creating a thread

④ A process is the basic unit of operating system resource allocation; a thread is the basic unit of CPU scheduling

☆ Advantages and disadvantages

① Processes. Advantage: can use multiple cores. Disadvantage: high resource overhead.

② Threads. Advantage: low resource overhead. Disadvantage: cannot use multiple cores (in standard CPython builds, the global interpreter lock lets only one thread execute Python code at a time).

Origin blog.csdn.net/Blue92120/article/details/131222357