Python Processes, Threads, and Coroutines: An Analysis

Copyright: Shallow @. Original article; may not be reproduced without permission. https://blog.csdn.net/xili2532/article/details/89882511

Python processes

Scheduling

In traditional operating systems, the process was the basic unit of CPU scheduling. Later, general-purpose operating systems introduced the concept of threads: the thread became the basic unit of CPU scheduling, and the process remained only the basic unit of resource ownership.

Concurrency

Before threads were introduced, a process had only one flow of execution. Now a process can have multiple threads executing concurrently. Many early HTTP servers handled concurrency with threads, which was several times more efficient than forking a child process for each request. This is possible because threads provide concurrency at a much lower cost than processes.

Definition

A process is one execution of a computer program. That is, every time code is executed, it first becomes a process.
A process has states such as ready, running, interrupted, zombie, and terminated (the exact states differ between operating systems).

Characteristics

1. Every program is, first of all, a process.
2. Each running process has its own address space, memory, data stack, and other resources.
3. The operating system manages all processes automatically (no intervention from user code is required) and allocates execution time among them.
4. A process can perform other tasks by spawning new processes, but each process still has its own memory, data stack, and so on.
5. Processes can communicate with each other (passing messages and data) by using inter-process communication (IPC).

Explanation

1. Multiple processes can run on different CPUs without interfering with each other.
2. Multiple processes can also run on the same CPU, with time slices assigned automatically by the operating system.
3. Because resources cannot be shared between processes, inter-process communication has to send data, receive messages, and so on.
4. Multi-process execution is also known as "parallel" execution.
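
The point that processes do not share memory and have to pass data explicitly can be sketched roughly as follows. This is only a minimal illustration, assuming a Unix-like system where os.fork() and os.pipe() are available; the names fork_and_report and counter are made up for this example.

```python
import os

counter = 0  # lives in the parent's address space


def fork_and_report():
    """Fork a child, let it change its own copy of counter, and send the value back over a pipe."""
    global counter
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it works on its own copy of the parent's memory.
        try:
            os.close(read_fd)
            counter += 100
            os.write(write_fd, str(counter).encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # leave the child right away, never run the parent's code
    # Parent: read what the child sent, then wait for the child to exit.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        received = int(reader.read())
    os.waitpid(pid, 0)
    return received, counter


if __name__ == "__main__":
    child_value, parent_value = fork_and_report()
    print(f"child reported {child_value}, parent still has {parent_value}")
```

The child's change to counter never shows up in the parent; the only way the parent learns the new value is through the pipe, which is one simple form of IPC.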

Unix/Linux operating systems provide a fork() system call, which is quite special. An ordinary function is called once and returns once, but fork() is called once and returns twice: the operating system copies the current process (the parent process) to create a new one (the child process), and then returns in both the parent and the child.

In the child, fork() always returns 0; in the parent, it returns the process ID of the child. The reason is that a parent can fork many children, so it has to record the ID of each child, whereas a child can always get the ID of its parent by calling getppid().

Python's os module wraps common system calls, including fork, so you can easily create a child process in a Python program:

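import os

# os.fork() is only available on Unix-like systems.
print(f"Starting process: {os.getpid()}")
pid = os.fork()

if pid == 0:
    # fork() returned 0, so this is the child
    print(f"I am the child process {os.getpid()}, my parent process is {os.getppid()}")
else:
    # fork() returned the child's pid, so this is the parent
    print(f"I am the parent process {os.getpid()}, I created child process {pid}")

Result:

Starting process: 9443
I am the parent process 9443, I created child process 9444
I am the child process 9444, my parent process is 9443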

Python is cross-platform, but the code above only runs on Linux and other Unix-like systems. What about Windows (although this kind of code is rarely written there)? That is what the multiprocessing module is for. It provides a Process class that represents a process object. The following example starts a child process and waits for it to finish.


import os
from multiprocessing import Process


# Code executed by the child process
def run_child(name):
    print(f"Child process name is {name}, pid is {os.getpid()}")


if __name__ == "__main__":
    print(f"Parent process is {os.getpid()}")
    p = Process(target=run_child, args=('test',))
    print("Child process is about to start")
    p.start()
    p.join()  # wait for the child process to finish
    print("Child process has ended")

Result:

Parent process is 9409
Child process is about to start
Child process name is test, pid is 9410
Child process has ended

Python threads

Definition

A thread is a flow of code execution inside a process.
One process can run multiple threads, and these threads share the operating system resources that the main process has acquired.
When multiple threads are started in a process, each thread runs in turn. Modern operating systems also support thread preemption: a thread that is waiting to run can, through priority, signals, and so on, suspend the running thread and run first.
Multiple threads can execute at the same time. Multithreading is implemented in much the same way as multiprocessing: the operating system switches quickly between threads so that each one runs briefly in turn, which makes them appear to run simultaneously. Truly simultaneous execution of multiple threads is only possible on a multi-core CPU.

Use

1. The user writes a program that contains threads (every program is itself a process).
2. The operating system takes over the program and runs it as the current process.
3. The current process contains the threads and starts them.
4. Multiple threads execute in order unless one is preempted.
5. Multithreading is also known as "concurrent" execution.

Characteristics

1. A thread must exist inside a running process.
2. A thread uses the system resources its process has obtained; it does not need to request resources such as CPU the way a process does.
3. Threads are not given fair execution time: a thread can be preempted by other threads, whereas processes are allocated execution time according to the operating system's settings.
4. Each process can start many threads.
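
The idea that threads use the resources of their process, including its memory, can be illustrated with a small sketch. The names results and square_into are made up for this example; each thread writes only to its own slot, so no extra synchronization is assumed to be needed here.

```python
import threading

results = [None] * 4  # one list, visible to every thread in the process


def square_into(slot):
    """Store slot squared in the shared list; each thread writes only its own slot."""
    results[slot] = slot * slot


threads = [threading.Thread(target=square_into, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # [0, 1, 4, 9]
```

Unlike separate processes, the threads here need no pipe or message passing: they all write into the same list.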

Multitasking can be done by multiple processes, or by multiple threads within one process. A process is made up of threads, and every process has at least one thread.
Python's standard library provides two modules: _thread and threading. _thread is the low-level module, and threading is the high-level module that wraps _thread. In most cases we only need the high-level threading module.

To start a thread, pass a function to a Thread instance and then call start():

import threading
import time


# Code executed by the new thread:
def loop():
    name = threading.current_thread().name
    print(f"Child thread {name} is executing")
    n = 0
    while n < 5:
        n = n + 1
        print(f"Child thread {name} >>> {n}")
        time.sleep(1)
    print(f"Child thread {name} has ended")


print(f"Thread {threading.current_thread().name} is running")
t = threading.Thread(target=loop)
t.start()
t.join()  # wait for the child thread to finish
print(f"Thread {threading.current_thread().name} has ended")

Result:

Thread MainThread is running
Child thread Thread-1 is executing
Child thread Thread-1 >>> 1
Child thread Thread-1 >>> 2
Child thread Thread-1 >>> 3
Child thread Thread-1 >>> 4
Child thread Thread-1 >>> 5
Child thread Thread-1 has ended
Thread MainThread has ended

Every process starts one thread by default, which we call the main thread, and the main thread can start new threads. Python's threading module has a current_thread() function that always returns the instance of the current thread. The main thread instance is named MainThread. The name of a child thread is specified when it is created; if no name is given, Python automatically names the threads Thread-1, Thread-2, and so on (Python 3.10 and later also append the target function's name, for example Thread-1 (loop)).
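
A short sketch of thread naming follows. The name "downloader" and the helper report are made up for this example; the exact default name of the unnamed thread depends on the Python version and on how many threads have been created before it.

```python
import threading

seen = []  # names reported by the child threads, in start order


def report():
    """Record the name of whichever thread runs this function."""
    seen.append(threading.current_thread().name)


named = threading.Thread(target=report, name="downloader")
unnamed = threading.Thread(target=report)
for t in (named, unnamed):
    t.start()
    t.join()  # finish one thread before starting the next

print(threading.current_thread().name)  # MainThread when run from the main thread
print(seen)
```

The first entry is the name we chose, and the second is a name generated by Python that begins with "Thread-".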

The difference between processes and threads

The threads in a process share the same resources as the main process. Compared with processes, which are independent of each other, threads can share information and communicate more easily (they all live in one process and share its memory).

Threads generally execute concurrently, and it is this concurrency and data sharing that makes cooperation among multiple tasks possible.

Processes generally execute in parallel, which lets a program run on multiple CPUs at the same time.

The difference is that the threads in one process can only share a single "time slice" (the process runs on one CPU, starts several threads, and the threads are scheduled within the execution time of that process), whereas processes can run truly "simultaneously" (on multiple CPUs at once).

Common use cases for processes and threads

As a rule of thumb when writing concurrent programs in Python:

Use multiple processes for compute-intensive tasks.
Use multiple threads for IO-intensive tasks (such as network communication); multiple processes are used less often here.
This is because IO operations require exclusive access to a resource, for example:

Network communication: at the micro level only one party speaks at a time, although at the macro level it looks like a simultaneous conversation.
File reads and writes: only one program can operate on a file at a time (if two programs write 'a' and 'b' to the same file simultaneously, which one ends up in the file?).
Both cases need a resource that only one program may use at a time. With multiple threads, the IO resource is requested by the main process and the threads execute one at a time; even with preemption they still run one by one, which only feels like "multithreaded" concurrent execution.

With multiple processes, one process cannot use the resource at all until another finishes, so multiple processes obviously "waste" resources.
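
Here is a rough sketch of why threads suit IO-intensive work. It assumes the IO wait can be simulated with time.sleep(); fake_download is a made-up stand-in for a real network request, and ThreadPoolExecutor is just one convenient standard-library way to start the threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item):
    """Stand-in for a network request: the thread only waits and uses almost no CPU."""
    time.sleep(0.2)
    return f"{item} done"


items = ["a", "b", "c", "d"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(items)) as pool:
    results = list(pool.map(fake_download, items))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

With four simulated requests of 0.2 seconds each, the total should stay close to a single wait rather than 0.8 seconds, although the exact timing depends on the machine.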

Question:

All the Python programs we wrote earlier are processes that perform a single task, that is, they have only one thread. What if we want to perform multiple tasks at the same time?
There are two solutions:
1. Start multiple processes. Each process has only one thread, but together the processes can perform multiple tasks.
2. Start one process and start multiple threads inside it, so that the threads together can perform multiple tasks.
3. Of course, there is a third way: start multiple processes and have each process start multiple threads, so that even more tasks run at the same time. This model is more complicated and is rarely used in practice.

To sum up, there are three ways to implement multitasking:

Multi-process mode;
Multi-threaded mode;
Multi-process plus multi-threaded mode.

Compute-intensive vs. IO-intensive

The second consideration in deciding whether to use multitasking is the type of task. We can divide tasks into compute-intensive and IO-intensive.

Compute-intensive tasks are characterized by a large amount of computation that consumes CPU resources, such as computing pi or decoding high-definition video; they rely entirely on the computing power of the CPU. Such tasks can also be run with multitasking, but the more tasks there are, the more time is spent on task switching and the less efficiently the CPU executes them. Therefore, to use the CPU most efficiently, the number of compute-intensive tasks running at the same time should equal the number of CPU cores.
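
The rule of one compute-intensive task per CPU core might be sketched as follows. The functions count_primes and worker_count are made up for this example, and the workload is deliberately simple. Processes rather than threads are used because, in the standard CPython build, only one thread runs Python bytecode at a time.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Deliberately CPU-heavy: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def worker_count():
    """One worker per CPU core; fall back to 1 if the core count is unknown."""
    return os.cpu_count() or 1


if __name__ == "__main__":
    # One task per worker, so every core gets exactly one job.
    limits = [20_000] * worker_count()
    with ProcessPoolExecutor(max_workers=worker_count()) as pool:
        totals = list(pool.map(count_primes, limits))
    print(f"{worker_count()} workers, results: {totals}")
```

Starting more workers than cores would not make the computation finish sooner; the extra processes would only compete for the same cores and add switching overhead.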

Because compute-intensive tasks mainly consume CPU resources, the running efficiency of the code matters a great deal. Scripting languages such as Python run slowly and are poorly suited to compute-intensive tasks. For compute-intensive tasks, it is best to write the code in C.

The second type is the IO-intensive task. Tasks that involve network or disk IO are IO-intensive. They consume little CPU, and most of the time is spent waiting for IO operations to complete (because IO is far slower than the CPU and memory). For IO-intensive tasks, the more tasks there are, the higher the CPU efficiency, but only up to a limit. Most common tasks are IO-intensive, for example web applications.

During an IO-intensive task, 99% of the time is spent on IO and very little on the CPU, so replacing a slow scripting language such as Python with extremely fast C does almost nothing for overall performance. For IO-intensive tasks, the most suitable language is the one with the highest development efficiency (the least code): scripting languages are the first choice, and C is the worst.

Asynchronous IO

Given the huge difference in speed between the CPU and IO, a task spends most of its execution time waiting for IO operations. With a single-process, single-threaded model, other tasks cannot run concurrently in the meantime, so we need a multi-process or multi-threaded model to support concurrent execution of multiple tasks.

Modern operating systems have greatly improved IO operations, and the biggest feature is support for asynchronous IO. By making full use of the asynchronous IO support of the operating system, a single-process, single-threaded model can perform multiple tasks. This model is called the event-driven model. Nginx is a web server that supports asynchronous IO; on a single-core CPU it can efficiently support multitasking with a single-process model. On a multi-core CPU, you can run multiple processes (as many as there are CPU cores) to take full advantage of the cores. Because the total number of processes in the system is very limited, operating system scheduling is very efficient. Implementing multitasking with the asynchronous IO programming model is a major trend.

In Python, the single-threaded asynchronous programming model is called a coroutine. With coroutine support, you can write efficient event-driven multitasking programs.
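
A minimal coroutine sketch using the standard asyncio module is shown below. It assumes the IO wait can be simulated with asyncio.sleep(); fetch and the task names are made up for this example.

```python
import asyncio
import time


async def fetch(name, delay):
    """Stand-in for an IO operation; await hands control back to the event loop."""
    await asyncio.sleep(delay)
    return f"{name} finished"


async def main():
    # Run three coroutines concurrently on a single thread.
    return await asyncio.gather(
        fetch("task-1", 0.2),
        fetch("task-2", 0.2),
        fetch("task-3", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

All three waits overlap inside one thread, so the total time should be close to a single delay rather than the sum of the three.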
