Processes, threads, and coroutines — the three brothers of concurrent programming. Read this and you'll understand them.

Start by seeing whether you can answer these questions:

  • What are processes and threads
  • What is the difference between process and thread
  • Why there are processes and threads
  • What is the difference between kernel mode and user mode
  • What are the characteristics of coroutines

Questions like these accompany every stage of study and work, and they are genuinely hard to answer well. Unless you truly understand the underlying principles, it's easy to trip yourself up.

So today, let’s take a look at how these problems arise and why they are always asked. Let’s get started.

Processes and threads

A process is, roughly, a running copy of a program: when you open a media player or a notepad, each of those applications becomes a process. From the operating system's point of view, the process is the basic unit of resource allocation, while the thread — long nicknamed the "lightweight process" — is the basic unit of program execution.

So we have one basic unit for allocating resources and one basic unit for executing programs. In my early interviews I recited exactly that to interviewers, and when I later became an interviewer myself, I discovered why every candidate gives the same answer: most material online says precisely this. Rote memorization is not good enough, so let's go back to the earliest days of computing.

What was the original computer age like?

In that era, programmers loaded the finished program into flash memory and plugged it into the machine. Once powered, the chip fetched one instruction after another from the flash memory and executed it. When everything in flash memory had been executed, the computer simply shut down.

Flash era

In the early days, this was called a single-task model, also called a job (Job).

As people's needs grew and diversified — office work, chat, games — users wanted to switch back and forth between tasks on the same computer, and they wanted to do so at a time when neither threads nor processes existed.

How is that handled?

For example, a game becomes a process after it starts, but presenting a game scene requires both graphics rendering and networking, and these operations must not block each other — if one blocks, the game stutters and feels sluggish. We want them to run at the same time, so each part is designed as a thread: one process containing multiple threads.
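The "one process, multiple threads" idea can be sketched with Python's `threading` module. The function names here are hypothetical stand-ins for a game's rendering and networking work:

```python
import threading

results = []
lock = threading.Lock()

def render_frames():
    # hypothetical rendering work, running on its own thread
    with lock:
        results.append("rendered")

def poll_network():
    # hypothetical networking work, running concurrently in the same process
    with lock:
        results.append("network ok")

# one process, two threads, one shared address space
t1 = threading.Thread(target=render_frames)
t2 = threading.Thread(target=poll_network)
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(results))  # both threads appended to the same shared list
```

Because both threads live in one process, they share the `results` list directly — no inter-process communication needed.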

Since a process has multiple threads, how to deal with this resource allocation?

To start a game, you first need to store its parameters, so memory is required. Executing actions such as attacks requires computation, so CPU time is required, and some files must be stored, so file resources are needed too. Since early operating systems had no concept of threads, processes took turns executing via time-sharing technology, and communicated with each other through pipes and similar mechanisms.
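Pipe-based inter-process communication can be sketched in a few lines of Python. This is a minimal, Unix-only sketch (it relies on `os.fork`, and `os.waitstatus_to_exitcode` needs Python 3.9+):

```python
import os

# Parent and child processes communicate through a pipe: the parent
# writes a message, the child reads it and reports success via its
# exit code.
r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # child process: read the message from the pipe, then exit
    os.close(w)
    data = os.read(r, 1024)
    os._exit(0 if data == b"ping" else 1)
else:
    # parent process: send the message, then wait for the child
    os.close(r)
    os.write(w, b"ping")
    os.close(w)
    _, status = os.waitpid(pid, 0)
    print("child exit code:", os.waitstatus_to_exitcode(status))
```

Each side closes the pipe end it does not use; the kernel carries the bytes between the two otherwise isolated address spaces.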

This looks workable, but starting one game can spawn many full processes. Could we arrange something underneath a process that is allocated only CPU time, not its own full set of resources? That is how threads appeared.

How is this thread allocated?

After the thread concept was proposed, it was also called a lightweight process, because a thread is allocated only CPU computing resources.

Threads are scheduled by the operating system: after the OS creates a process, the process "comes with one thread", and the process's entry program runs on that main thread. It looks as if the OS schedules processes, but what it actually schedules are the threads inside them. A thread scheduled directly by the operating system is called a kernel-level thread.


Kernel-level threads and user-level threads

Since there are kernel-level threads, there are of course user-level threads as well. With kernel-level threads the operating system does the scheduling; with user-level threads the program itself creates and schedules sub-threads from the main thread. Typical examples are green-thread and coroutine libraries (note that Linux's pthread API, despite often being cited here, creates kernel-backed threads in the modern NPTL implementation). Now that we have mentioned kernel mode and user mode, let's look at what each kind of thread does.

User mode thread

It is created entirely in user space, and the operating system is not even aware of its existence. The advantages of user-level threads are as follows:

  • Low switching cost: user space maintains the threads itself, with no need for operating system scheduling
  • Low management overhead: creation and destruction require no system calls, avoiding the context switches that system calls cause (explained below)

What are the disadvantages of user mode threads?

  • High cost of talking to the kernel: such a thread spends most of its time in user space, so IO operations cannot easily benefit from the kernel and require frequent switches between user mode and kernel mode
  • Awkward cooperation between threads: imagine threads A and B need to communicate. Communication usually involves IO, IO involves system calls, and every system call pays the cost of switching between user mode and kernel mode
  • The operating system cannot optimize scheduling: if one user-mode thread of a process blocks, the OS cannot detect it and switch to another thread in time, so CPU time is wasted

Kernel mode thread

Kernel-mode threads execute in kernel mode and are generally created through system calls. What are their advantages?

  • Operating-system-level optimization: when a kernel thread performs IO, no extra mode switch is needed, and when one thread blocks, the kernel can immediately let another run
  • Full use of multiple cores: the kernel has enough privilege to run its threads across multiple CPU cores

What are the disadvantages of kernel-level threads?

  • Relatively high creation cost: creating one requires a system call and a switch into kernel mode
  • High switching cost: every switch must go through the kernel
  • Poor scalability: kernel threads are managed by the kernel, whose capacity is limited, so you cannot have too many of them

What is the mapping relationship between user mode threads and kernel mode threads ?

As mentioned above, both user-mode and kernel-mode threads have drawbacks: user-mode threads are cheap to create but cannot use multiple cores; kernel-mode threads can use multiple cores but are expensive to create and slow to switch. For this reason, some threads are usually kept alive in the kernel and reused. From this, the following mapping relationships between the two have emerged.

Mapping one — many-to-one

Since kernel threads are expensive to create, we can let the many user-mode threads of a process multiplex a single kernel-mode thread. But threads multiplexed this way cannot run concurrently, so this model is rarely used.

 

 

User mode thread and kernel mode thread many-to-one

Mapping two — one-to-one

Each user-mode thread is given its own kernel-mode thread: creating a user thread issues a system call that creates and binds a kernel thread. This model executes concurrently and makes full use of multiple cores — the famous Windows NT uses it — but with many threads, it puts too much pressure on the kernel.
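CPython's `threading` threads are a handy illustration of the one-to-one model: each Python thread is backed by its own kernel thread, and `threading.get_native_id()` (Python 3.8+) exposes the kernel-assigned thread ID. A small sketch:

```python
import threading

barrier = threading.Barrier(4)   # keep all four threads alive at once,
                                 # so the kernel cannot reuse their IDs
lock = threading.Lock()
native_ids = []

def record_id():
    barrier.wait()
    with lock:
        # the kernel-level thread ID backing this Python thread
        native_ids.append(threading.get_native_id())

threads = [threading.Thread(target=record_id) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(set(native_ids)))  # 4 distinct kernel thread IDs
```

Four user-visible threads, four distinct kernel IDs: one kernel thread per user thread.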

 

 

User mode thread and kernel mode thread one-to-one

Mapping three — many-to-many

That is, n user-mode threads map onto m kernel-mode threads, where m is usually no greater than n and is often set to the number of CPU cores. This many-to-many relationship keeps the number of kernel threads low while still allowing concurrency. Green-thread runtimes such as Go's goroutine scheduler adopt this model (note that Linux's standard pthread implementation, NPTL, is actually one-to-one).

 

 

User-mode threads and kernel-mode threads: many-to-many

A computer starts many processes — far more than it has CPUs — so the CPUs must be handed to them in turn, creating the illusion that many tasks run simultaneously. Have you ever wondered what the CPU must do before each of these tasks can execute?

For the CPU to execute a task, it must know where to load it from and where to start running. In other words, the system needs to set up the CPU registers and the program counter in advance.

What exactly are registers and the program counter?

Registers are small but powerful, fast memory inside the CPU; the program counter records the location of the instruction currently being executed. Together, this environment the CPU depends on is called the CPU context. Once you know what the context is, CPU switching is easy to understand:

Save the CPU context of the previous task, load the new task's context into the registers and program counter, then jump to the location the program counter points to. Depending on the kind of task, we distinguish process context from thread context.


When a process runs in user space it is said to be in user mode; when it traps into kernel space it is in kernel mode. A user-mode process enters kernel mode through a system call. Processes are scheduled by the kernel, so process switching happens in kernel mode.

What data does the context of the process contain?

Since the process switching occurs in the kernel state, the context of the process includes not only user space resources such as virtual memory, stacks, and global variables, but also the state of the kernel space such as the kernel stack and registers.

Saving and restoring context here is not free: it requires the kernel to run on the CPU to complete.

 

 

Context save

Thread context switch

Seeing this, you can surely blurt out the difference: the thread is the basic unit of scheduling, while the process is the basic unit of resource ownership. Put bluntly, what the kernel actually schedules is threads; the process merely provides resources such as virtual memory and global variables for its threads. So this understanding may serve you better:

  • If a process has only one thread, the process can be regarded as equivalent to that thread
  • If a process has multiple threads, they share resources such as virtual memory and global variables, and a context switch does not affect these shared resources
  • Each thread also has private data such as its stack and registers, which must be saved during a context switch

In summary, thread context switching falls into two cases:

  • The two threads belong to different processes: nothing is shared, so the switch involves a full process context switch
  • The two threads belong to the same process: the virtual memory is shared and stays in place, so only the threads' private data and other unshared state need to be switched

This also shows that switching between threads within one process costs far less than switching between processes — one reason multithreading gradually replaced multiprocess designs.

So how is a system call executed?

One question really does lead to another — just like an interview. When answering an interviewer, try to steer the conversation toward topics you can answer well.

If a user-mode program needs to execute a system call, it must switch to kernel mode. The process is shown in the figure below — a picture is worth a thousand words.

 

 

System call procedure

Since execution is divided into user mode and kernel mode, the two have different privilege levels. When a user-mode program initiates a system call, privileged instructions are involved, so the call is carried out via an interrupt — the Trap shown in the figure above.

After the interrupt occurs, the kernel program starts executing; when processing completes, another trap switches back to user-mode work. Since interrupts keep coming up, let's take a brief look at them.

What does an interrupt do?

Take the keyboard we touch every day as an example. When we press a key, the motherboard notifies the CPU. The CPU may be busy with another program at that moment, so it first interrupts the currently executing program, then jumps the PC (program counter) to a fixed handler address. That, in brief, is an interrupt.

However, different keys and key combinations correspond to different events, so the CPU must decide where the PC should jump based on the interrupt type. Each type is given a number, called the interrupt identification code, and the handler address the CPU jumps to for a given code is called the interrupt vector.

For example, number 8 might identify keyboard interrupt type A, and number 9 interrupt type B. When an interrupt occurs, the CPU needs to know which address the PC should point to — that address is the interrupt vector.

Suppose we define 256 interrupts, numbered 0 through 255. On a 32-bit machine, storing all the interrupt vectors takes about 1 KB of memory (256 vectors × 4 bytes each). That 1 KB region is the interrupt vector table.

Therefore, when the CPU receives an interrupt, it uses the interrupt type to index into the interrupt vector table, finds the corresponding interrupt vector, loads it into the PC, and thereby jumps to the handler.
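The lookup-and-jump mechanism can be sketched as a toy model in Python, with a dictionary standing in for the vector table and functions standing in for handler addresses (the codes 8 and 9 follow the hypothetical example above):

```python
# A toy interrupt vector table: the interrupt identification code indexes
# a table of handler "addresses" (Python functions stand in for handlers).
def keyboard_handler():
    return "handle key press"

def timer_handler():
    return "handle timer tick"

interrupt_vector_table = {8: keyboard_handler, 9: timer_handler}

def raise_interrupt(code):
    # the CPU looks up where the PC should jump, then "jumps" there
    handler = interrupt_vector_table[code]
    return handler()

print(raise_interrupt(8))
```

Real hardware does the same thing with addresses instead of function objects: the code is an index, the table entry is where execution continues.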

Now that processes and threads exist, how are they scheduled?

Computer resources are limited; too many processes would wear the machine out. We humans are the same — our stomachs are limited: skip a meal and we panic, eat too much and our legs tremble. So we need a way to handle this. Since the CPU has a limited number of cores, how about assigning each process a time slice, queuing the processes up, and letting the next one run as soon as a process's allotted time is up?

How are time slices allocated?

Suppose there are three processes: process 1 needs 2 time slices, process 2 needs 1, and process 3 needs 3. Halfway through, process 1 gets "tired" and takes a break (it is suspended), so process 2 runs — and finishes within a single slice. Process 3 then runs, and later process 1 resumes and finishes. Looping through processes according to time slices like this is time-sharing technology.

 

 

Time-sharing technology
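The round-robin flavor of time-sharing can be sketched with a queue: each process runs for one slice, and if it still needs time it goes to the back of the line. This is a simplified model (real schedulers also weigh priorities, blocking, and more):

```python
from collections import deque

def round_robin(needs):
    """Simulate round-robin scheduling.

    needs maps a process name to how many time slices it requires;
    returns the order in which slices are granted.
    """
    queue = deque(needs.items())
    order = []
    while queue:
        name, remaining = queue.popleft()
        order.append(name)              # this process runs for one slice
        if remaining > 1:
            queue.append((name, remaining - 1))  # back of the queue
    return order

# process 1 needs 2 slices, process 2 needs 1, process 3 needs 3
print(round_robin({"P1": 2, "P2": 1, "P3": 3}))
```

Each pass through the queue gives every still-runnable process one slice, which is exactly the "take turns" illusion of simultaneous execution.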

We just touched on the states a process passes through — so what are they?

The life cycle of a process is generally divided into the following three states:

  • Ready state: after the process is created, it queues up to run; at this point it is "ready"
  • Running state: when everything is in place and the process starts executing, it is "running"
  • Back to ready: if the time slice is used up, the process returns to the ready state

 

 

Ready to run

If the process is waiting for some event to complete (such as an IO operation), it enters the blocked state.

 

 

Process blocked

Why is the blocking state required?

Think about it: sometimes the computer cannot satisfy our request right away — perhaps it is waiting on the disk, perhaps on the printer. When the device finally finishes, it notifies the CPU via an interrupt; the CPU runs the interrupt handler, handing control to the operating system, which changes the blocked process back to ready and queues it again. While the process is waiting it has nothing to do, yet it cannot sit in the run queue either (it must wait for that interrupt) — so it is placed in the blocked state.

The following is a summary of the three states mentioned above

  • Ready: the process could run, but is temporarily paused while other processes occupy the CPU
  • Running: the process currently occupies the CPU
  • Blocked: the process has stopped because it is waiting for some event (an IO request, IO completion, etc.); even if handed the CPU, it could not run

In addition, a process has two more basic states:

  • Creation state (New): the state when the process has just been created but not yet admitted. Its main work is allocating and setting up the process control block and other initialization. Creation has two stages: first, building the management information the new process needs; second, moving the process into the ready state
  • Termination state (Exit): the state in which the process exits and resources other than the process control block are reclaimed. It also has two stages: first, waiting for the operating system to do its cleanup; second, releasing main memory

So there are five states in total. To make this more intuitive, the transition diagram is as follows.

 

 

The five states

  • Null → New: the process is first created
  • New → Ready: initialization completes and the process enters the ready queue
  • Ready → Running: the scheduler picks a ready process and assigns it the CPU
  • Running → Ready: the time slice is used up, so the operating system returns the process to the ready queue
  • Running → Blocked: the process must wait for an event (such as IO) and the operating system blocks it
  • Blocked → Ready: the awaited event completes and the process becomes ready again
  • Running → Exit: the process finishes its task, or is terminated by the operating system on error
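The legal transitions above can be captured in a tiny state machine — a toy model where anything not in the diagram raises an error:

```python
# Toy model of the five-state process life cycle: only the transitions
# from the diagram are permitted.
ALLOWED = {
    ("new", "ready"),        # initialization finished
    ("ready", "running"),    # scheduler dispatches the process
    ("running", "ready"),    # time slice used up
    ("running", "blocked"),  # waiting for an event (e.g. IO)
    ("blocked", "ready"),    # the awaited event completed
    ("running", "exit"),     # finished, or terminated on error
}

class Process:
    def __init__(self):
        self.state = "new"

    def transition(self, new_state):
        if (self.state, new_state) not in ALLOWED:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

p = Process()
for s in ["ready", "running", "blocked", "ready", "running", "exit"]:
    p.transition(s)
print(p.state)
```

Note, for instance, that a blocked process cannot jump straight to running: it must pass through ready and be scheduled again.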

Actually, I have been holding something back: there are two more states, ready-suspended and blocked-suspended. Let's look at how suspension differs from blocking:

  • Suspension is an action; blocking is a state of the process
  • A process is usually suspended because memory is insufficient, or at the request of the user or a parent process; a process is blocked because it is waiting for an event to occur — a resource or a response
  • The opposite of suspension is activation, which brings the process from external storage back into memory; a blocked process, by contrast, must be woken by another process or by the system
  • Suspension is passive — the process is forced out of memory to external storage — whereas entering the blocked state is, in this sense, the process's own doing (it initiated the wait)

In summary, our process diagram has now grown to seven states, as follows.

 

 

Seven states of the process

The underlying principles of processes and threads

We now understand where processes and threads came from and how their states transition, but that alone doesn't make them concrete. How they are represented in memory still feels abstract, so let's keep going.

How processes and threads are represented in memory

The design involves two tables: the process table and the thread table. The process table records where each process sits in memory, what its PID is, its current state, how much memory it has been allocated, which user owns it, and so on. Without this table, the operating system would not know which processes exist, let alone how to schedule them — it would be completely lost.

 

 

Process table

Several parts of the process table deserve special attention:

  • Resource information

The resource information records which resources the process owns: how the process's virtual memory is mapped, which files it holds open, and so on.

  • Memory layout

Memory involves too many topics to cover here; I plan to devote a separate article to it.

In Linux, the operating system uses virtual memory management to give each process an independent virtual address space. The reasons are straightforward: physical memory is limited, and letting users touch it directly is unsafe. Virtual memory is not only safer — it also lets a process use an address space larger than physical memory.

Furthermore, on a 32-bit operating system, the 4 GB process address space is split into user space (0–3 GB) and kernel space (3–4 GB). User code cannot manipulate kernel-space virtual addresses directly; it can only reach kernel space through system calls.

The operating system will tell the process how to use memory, roughly divided into which areas and what each area does. Briefly describe the role of each segment in the figure below.

  • Stack: automatically allocated and released by the system; usually holds function arguments, local variables, return addresses, and so on
  • Heap: holds dynamically allocated data, usually managed by the developer; if the developer never frees it, the operating system may reclaim it when the program ends
  • Data segment: holds global and static variables. The initialized data section (.data) stores initialized globals and statics; the uninitialized section, usually called the BSS section (.bss), stores uninitialized ones

 

 

Process memory layout

  • Description

The description information includes the unique identification number of the process, the name of the process, and the user, etc.

Besides the process table, there is also a table for threads: the thread table. It too contains an ID, called the ThreadID, and records the thread's state at different stages — blocked, running, ready. Since multiple threads share the CPU and are switched constantly, the thread table must also record the values of the program counter and registers.

Speaking of user-level and kernel-level threads, how exactly does the mapping between the two work in practice?

Imagine a pool of threads inside the kernel, offered up to user space. Each time a user-level thread runs, it hands its program counter and registers to a kernel thread; when execution finishes, the kernel thread is not destroyed but waits for the next task. This again shows the cost asymmetry: creating a process is expensive, while creating a thread is cheap.

Do so many processes share memory?

An operating system runs many processes. To let each do its job without interfering with the others, every process is given a fully isolated memory area. Even if two programs read the same memory address internally, the physical addresses behind them are different — like apartment 501 in building X versus apartment 501 in building Y: same number, not the same home. This is the address space.

Therefore, under normal circumstances, process A cannot access process B's memory — unless you plant a Trojan that maliciously manipulates process B's memory, or you use inter-process communication, which we will cover later.
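This isolation is easy to demonstrate with `os.fork` (Unix-only sketch): after the fork, the child gets its own copy of the address space, so its writes never reach the parent.

```python
import os

value = [0]            # lives in the parent's address space

pid = os.fork()
if pid == 0:
    # child process: this modifies only the child's own copy of memory
    value[0] = 42
    os._exit(0)

os.waitpid(pid, 0)     # parent waits for the child to finish
print(value[0])        # still 0 — the child's write was invisible here
```

Even though both processes "see" the variable at the same virtual address, the physical pages behind it diverge the moment the child writes.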

How do processes and threads switch?

The operating system switches back and forth among a large number of processes, and like a good borrower it always returns what it takes: before each switch, it records the current register values in memory so that next time it can return to the same spot and continue. On resumption, it reads the saved state back from memory and carries on executing.

 

 

Process switching

In order to let everyone understand this process in detail, I will split it into the following steps

  • The operating system decides a switch is needed and sends an interrupt signal to the CPU to stop the current process
  • On receiving the interrupt, the CPU stops the executing process, and the operating system arranges for its current state to be saved first
  • Having taken over the interrupt, the operating system runs a short stretch of assembly that saves the previous process's registers and state
  • Once the state is saved, the operating system runs the scheduler, which decides the next process to execute
  • Finally, the operating system dispatches that next process

 

 

Process and interrupt

How is the previous process restored after an interrupt?

As mentioned above, the operating system executes a piece of code to restore the process's state. One way to implement this is with a stack — a first-in, last-out data structure. (So yes, those fundamentals courses at university really do matter.)

After a process (or thread) is interrupted, the operating system pushes its key data (such as registers) onto a stack; when execution resumes, it pops the stack and restores the register values.
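The push-on-interrupt, pop-on-resume discipline can be sketched as a toy model, with dictionaries standing in for register snapshots:

```python
# Toy context switch: "register" snapshots are pushed onto a stack when
# a task is interrupted and popped (first-in, last-out) when it resumes.
kernel_stack = []

def save_context(registers):
    kernel_stack.append(dict(registers))   # push a snapshot

def restore_context():
    return kernel_stack.pop()              # pop the most recent snapshot

cpu = {"pc": 100, "sp": 2048}   # task A is running
save_context(cpu)               # interrupt: save task A's context
cpu = {"pc": 500, "sp": 4096}   # task B runs with fresh register values
save_context(cpu)               # nested interrupt: save task B too
cpu = restore_context()         # task B resumes first (last in, first out)
cpu = restore_context()         # then task A, exactly where it stopped
print(cpu)
```

The last-in, first-out order is exactly why a stack fits: nested interrupts unwind in the reverse order they occurred.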

Coroutine

The first time I encountered coroutines was on an autonomous-driving project. A colleague mentioned that a low-level library used coroutines, and I just stared blankly — coroutine? (In Chinese, "coroutine" sounds like the travel site Ctrip. Time to pack my bags and go home?) I puzzled over it for a long while: processes and threads were already enough to wrestle with — where did this coroutine come from? And I could already imagine the interview questions:

  • What is a coroutine
  • What is the difference between coroutine and process, thread
  • What are the advantages and disadvantages of coroutines

How could my head not go bald over this? Fine — for the sake of making a living... no, out of love for computers and an unstoppable urge to learn, let's see what this thing is.

Why do we need coroutines?

When we multitask, we usually use multiple threads for concurrency. Take the recently popular e-commerce flash sales of Moutai as an example (never mind whether the data sits in a cache or a backend store). Initially there are 10 users: whenever 10 payment messages arrive, 10 threads are spawned to query the database; with so few users, responses return immediately. The next day there are 100 users and 100 query threads — still works well, so the promotion is ramped up. With 1,000 people arriving at the same time, things start to feel strained.

 

 

Growing thread

From 1,000 to 10,000 — having read this far, you know that creating and destroying threads is resource-intensive. If each thread occupies 4 MB of memory, 10,000 threads consume about 39 GB, while the server may only have 8 GB.

At this point, you either add servers or make the code more efficient. With many threads running, some thread is inevitably waiting on IO; it blocks, the CPU switches to another thread, and the rest continue as usual. That is fine with few threads, but as the thread count grows, problems appear: many threads occupy a lot of memory, and switching among too many threads also eats a lot of system time.

 

 

Thread overhead

At this point, this problem can be solved by coroutine

Coroutines run on top of threads. When a coroutine finishes a chunk of work, it can voluntarily yield and let another coroutine run on the same thread. In other words, coroutines do not increase the number of threads; they multiplex many coroutines onto one thread by time-sharing it. The key point is that coroutine switching happens in user mode — there is no user-to-kernel mode switch — so the cost is much lower.

 


Coroutine overhead

By analogy with the example above, we only need to start 100 threads, each running 100 coroutines, to handle the same 10,000 tasks concurrently.
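Python's `asyncio` makes the "many coroutines, one thread" idea concrete. In this sketch, 100 coroutines each yield at an `await` point and all complete on a single thread, with no kernel thread switches between them:

```python
import asyncio
import threading

async def worker(i):
    # awaiting hands control back to the event loop, so other coroutines
    # on the SAME thread get to run -- a user-mode switch only
    await asyncio.sleep(0)
    return threading.get_ident()   # which OS thread ran this coroutine?

async def main():
    return await asyncio.gather(*(worker(i) for i in range(100)))

results = asyncio.run(main())
print(len(results), len(set(results)))  # 100 coroutines, 1 thread
```

Scaling this to the article's 100-threads-by-100-coroutines layout is then a matter of running one such event loop per thread.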

So what do you need to watch out for when using coroutines?

As just said, coroutines run on threads. What happens if the thread blocks waiting for IO? The operating system only knows about threads: when a coroutine makes a blocking IO call, the OS puts the whole thread into the blocked state, so the current coroutine — and every other coroutine bound to that thread — can no longer be scheduled. Painful.

Therefore, inside a coroutine you must not call operations that block the thread. Coroutines are best combined with asynchronous IO to unleash their full power.

How to deal with the operation of calling blocking IO in the coroutine?

  • The simple answer: when a blocking IO call is needed, hand it off to a separate thread and let the coroutine read the result when it finishes — similar in spirit to multithreading
  • Alternatively, wrap the system IO and turn it into asynchronous calls; this takes a lot of work, so it needs native support from the programming language or its runtime
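The first approach — pushing a blocking call onto a helper thread so the coroutines keep running — is built into `asyncio` as `asyncio.to_thread` (Python 3.9+). A minimal sketch, where `blocking_io` is a hypothetical stand-in for any blocking call:

```python
import asyncio
import time

def blocking_io():
    # stands in for any blocking call (disk read, legacy client, ...)
    time.sleep(0.05)
    return "data"

async def main():
    # the blocking call runs on a helper thread, so the event loop --
    # and every other coroutine on it -- keeps running meanwhile
    return await asyncio.to_thread(blocking_io)

print(asyncio.run(main()))
```

While the helper thread sleeps inside `blocking_io`, the event loop is free to schedule other coroutines; the `await` only resumes once the thread delivers its result.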

Finally, coroutines are not recommended for compute-intensive tasks. A coroutine only pays off when tasks spend time waiting on IO and can yield during the wait; a CPU-bound task never yields, so multiplexing coroutines onto one thread gains nothing — that kind of work is better spread across threads or processes on multiple cores.

Summary

Processes and threads involve a sprawling set of topics. This article covered what threads and processes are, the differences between them, kernel-level versus user-level threads, context switching of threads and processes, how system calls and interrupts work, and the basics of coroutines. Process scheduling will be introduced in detail in a later article.


Origin blog.csdn.net/Python6886/article/details/112828479