The process, the thread, and the CPU: working together to live a small life

The concepts of process and thread

A process is an instance of a program in execution, that is, the collection of data structures that describes how far the program's execution has progressed. From the kernel's point of view, the purpose of a process is to act as the basic unit for allocating system resources (CPU time, memory, and so on).

Each process has its own completely separate virtual address space. The operating system kernel maps it to the physical address space through address translation (the x86 architecture uses segment tables plus page tables for the mapping; page tables come in 2-level and 4-level variants, with 32-bit systems using 2-level page tables and 64-bit systems using 4-level page tables). This gives each process the illusion that it monopolizes the entire memory space.
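As a worked illustration of the 4-level case, here is a toy sketch (not how real translation happens; the MMU does this in hardware) of how a 48-bit x86-64 virtual address decomposes into four page-table indices plus a page offset, assuming 4 KiB pages. The address value is made up for the example:

```java
// Toy illustration: split a 64-bit x86 virtual address into the
// 4 page-table indices plus page offset used by 4-level paging (4 KiB pages).
public class PageWalk {
    public static void main(String[] args) {
        long vaddr = 0x00007f8a_1c2d3e4fL;        // an example user-space address
        long offset = vaddr & 0xFFF;              // bits 0-11: offset within the 4 KiB page
        long pt     = (vaddr >>> 12) & 0x1FF;     // bits 12-20: page-table index
        long pd     = (vaddr >>> 21) & 0x1FF;     // bits 21-29: page-directory index
        long pdpt   = (vaddr >>> 30) & 0x1FF;     // bits 30-38: page-directory-pointer index
        long pml4   = (vaddr >>> 39) & 0x1FF;     // bits 39-47: top-level (PML4) index
        System.out.printf("PML4=%d PDPT=%d PD=%d PT=%d offset=0x%x%n",
                pml4, pdpt, pd, pt, offset);
    }
}
```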

A thread is one execution flow of a process and the basic unit of CPU scheduling and dispatch. It is a unit smaller than a process that can run independently. A process is composed of one or more threads (user programs with many relatively independent execution flows that share most of the application's data structures), and each thread shares all the resources the process owns with the other threads of the same process. There is no isolation between threads: although each thread has its own working stack, it is still possible for thread A to access the stack space of thread B, so the behavior of one thread can affect other threads in the process. The crash of one thread may therefore trigger a chain reaction that crashes other threads and, in the end, the whole process. For this reason, core threads should be kept loosely coupled from other problem-prone threads.

This reduced coupling brings its own trade-offs. For example, if a thread throws an exception and never catches it, that thread dies, but the enclosing process (here, the JVM) does not. This cuts both ways: the upside is that other threads are unaffected; the downside is that without peripheral monitoring it is hard to notice that the thread has crashed.
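A minimal Java sketch of that point: the worker thread below dies from an uncaught exception, while the JVM process lives on. The UncaughtExceptionHandler stands in for the "peripheral monitoring" mentioned above; the thread name and message are made up for the example:

```java
public class ThreadCrashDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            throw new RuntimeException("boom");   // uncaught: only this thread dies
        }, "worker");
        // Peripheral monitoring: without a handler like this, the crash is easy to miss.
        worker.setUncaughtExceptionHandler((t, e) ->
                System.err.println(t.getName() + " crashed: " + e.getMessage()));
        worker.start();
        worker.join();
        System.out.println("main thread (and the JVM process) is still alive");
    }
}
```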

From all of the above we can see that a multi-process program tends to be more robust than a multi-threaded one, but switching between processes consumes more resources and is less efficient. And for concurrent operations that must execute simultaneously while sharing certain variables, only threads will do, not processes.

In general: a process has an independent address space, while a thread does not (threads in the same process share the process's address space).

If you are new to this material, the concepts of process and thread may still feel fuzzy after a first read. Don't worry: let me explain the relationship between the process, the thread, and the CPU.

The relationship between the number of CPU cores and threads

Let me talk about some basic concepts first.

There is no doubt that a typical computer has only one CPU. After absorbing a lot of multithreading knowledge, some readers end up picturing a pile of CPUs spinning around inside the machine. In fact, there is just one.

But from the computer user's perspective, all programs really do seem to run at the same time. So how does a single CPU keep all programs running "simultaneously" in our eyes?

The CPU's shadow clones: multi-core

Multi-core, also called chip multiprocessing (CMP), was proposed by Stanford University. The idea is to integrate the SMP (symmetric multiprocessing) design of massively parallel machines onto a single chip, with each processor executing different processes in parallel. Running programs on multiple processors at the same time is an important route to ultra-high-speed computing, and this is what we call parallel processing.

In short, the core of the idea is to combine multiple processors into one chip, so that one CPU behaves like multiple processors. Each of these processors is called a CPU core.

Core count and thread count: mainstream CPUs today are multi-core. Increasing the number of cores increases the number of threads that can run at once, because the operating system generally executes tasks through threads. Normally there is a 1:1 correspondence, meaning a quad-core CPU offers four hardware threads. After Intel introduced hyper-threading, however, cores and hardware threads entered a 1:2 relationship: a quad-core CPU presents 8 logical threads, which means at most 8 threads can truly run in parallel.
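You can check this on your own machine. In Java, Runtime.availableProcessors() reports logical processors, which on a hyper-threaded CPU is typically cores * 2:

```java
public class CoreCount {
    public static void main(String[] args) {
        // On a hyper-threaded quad-core this typically prints 8:
        // the JVM reports logical processors (cores * 2), not physical cores.
        int logical = Runtime.getRuntime().availableProcessors();
        System.out.println("logical processors: " + logical);
    }
}
```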


Therefore, whenever we talk about thread counts later, remember that they are tied to the CPU's "shadow clones": the number of cores.

How to understand resource allocation

In many classic operating system textbooks, a process is defined as an execution instance of a program. The process itself does not execute anything; it merely maintains the various resources the application needs, while threads are the real execution entities.

For a process to get any work done, it must contain at least one thread.
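A tiny Java illustration: even a program that never creates a thread of its own runs inside one, because the JVM starts the process and hands execution to the main thread:

```java
public class MainThreadDemo {
    public static void main(String[] args) {
        // Even a "single-threaded" Java program runs inside at least one thread:
        // the JVM process starts and hands execution to the "main" thread.
        System.out.println(Thread.currentThread().getName()); // prints: main
    }
}
```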

As we all know, the CPU is precious. So how do we squeeze the most computing efficiency out of it? The answer is to decouple resource allocation from task processing.

Imagine the CPU as a Java programmer who knows nothing but how to bury himself in work, and whose coding ability is formidable. Even after hopping from job to job, a short adaptation period is all he needs to get back to completing tasks at full efficiency.

So how do we make sure this capable programmer adapts quickly each time he lands in a new environment? The answer is to allocate resources in advance, and that is exactly the idea behind a process. If the operating system is a company, then a process is like a department that exists to handle one type of business. Notice that the department itself does no actual work, and the same is true of the process. Yet the department's existence still matters a great deal, for two reasons:

  1. It is the unit of resource allocation for the company (operating system). When the company acquires a batch of computers, they are allocated to a department; it is very unlikely that equipment would be purchased for one individual programmer. (This reflects how the operating system allocates resources to processes. And don't bring up the boss's favorite: the world of programs has no personal connections, only efficiency!) Once the operating system has performed the allocation, the process has its own address space.
  2. It gives the tasks (threads) in the department (process) a sense of belonging. Every thread knows which department it belongs to and can use that department's resources. In other words, a thread uses the address space of its process. If you have studied the JVM, you know the concept of the runtime data area: the entire runtime data area is a work area that the operating system allocates to the Java process, as the sketch just below shows.
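A small Java sketch of this shared address space: the counter below lives in the heap, part of the runtime data area of the Java process, so both threads read and write the very same memory:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedAddressSpace {
    // Lives in the process-wide heap (part of the JVM runtime data area),
    // so every thread of the process sees the same instance.
    private static final AtomicInteger counter = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> { for (int i = 0; i < 1000; i++) counter.incrementAndGet(); };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(counter.get()); // 2000: both threads updated shared memory
    }
}
```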

Resource allocation should usually be done ahead of time; that is, the context should be prepared in advance. Imagine a thread winning the right to execute on the CPU while its context is not ready: the CPU cannot get to work immediately, and the performance loss is significant.

How to understand processor scheduling

In the analogy above, the operating system is a company and the process is a department. A department must support at least one line of business, and each line of business represents a thread. The CPU is our Java developer. So how does the developer work through these lines of business?

In Java, the basic unit of scheduling is the thread. A Java thread maps to a Linux kernel thread (a lightweight process), which is scheduled preemptively. When creating a new thread you can set its priority to adjust the probability that it will win CPU execution rights later, but the essence is still preemptive scheduling.
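A short example of that priority knob (the thread bodies are placeholders): setPriority is only a hint to the scheduler, so a higher priority raises the odds of winning the CPU but guarantees nothing about ordering:

```java
public class PriorityDemo {
    public static void main(String[] args) {
        Thread lowPriority = new Thread(() -> System.out.println("low done"));
        Thread highPriority = new Thread(() -> System.out.println("high done"));
        // A hint to the scheduler, not a guarantee: scheduling stays preemptive,
        // and how Java priorities map to OS priorities is platform-dependent.
        lowPriority.setPriority(Thread.MIN_PRIORITY);   // 1
        highPriority.setPriority(Thread.MAX_PRIORITY);  // 10
        lowPriority.start();
        highPriority.start();
    }
}
```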

In terms of our programs, when threads (tasks) are many and CPU cores (developers) are few: from the threads' perspective, they compete for CPU cores to get their tasks processed; from the CPU's perspective, it simply picks ready threads to process, effectively at random.

Remember the scene in Stephen Chow's Flirting Scholar (Tang Bohu dian Qiuxiang)? When mealtime comes, the crowd swarms forward and strips the food bucket clean. A thread that fails to grab CPU execution rights can only stand there like Chow's character Hua An, wearing a blank stare.

The CPU time-slice round-robin mechanism (RR scheduling, the principle behind concurrency)

Note first that when a multi-core CPU switches frequently between threads, the process boundary plays no role. Suppose our operating system has 3 processes and each contains 8 threads, 24 threads in total. When the CPU hands out cores, these 24 threads are treated equally; your 8 threads do not get extra cores just because they live in the same process.

Time-slice round-robin scheduling, also known as RR scheduling, is the oldest, simplest, fairest, and most widely used scheduling algorithm. Each process is assigned a period of time called its time slice, which is the amount of time the process is allowed to run.

If the process is still running when its time slice expires, the CPU is taken away and given to another process. If the process blocks or finishes before the slice expires, the CPU switches immediately. All the scheduler has to maintain is a list of ready processes; when a process uses up its time slice, it is moved to the tail of the queue.

The only interesting question in round-robin scheduling is the length of the time slice. Switching from one process to another takes time: saving and loading register values and memory maps, updating various tables and queues. Suppose a context switch takes 5 ms and the time slice is set to 20 ms. Then after every 20 ms of useful work, the CPU spends 5 ms on switching, so 20% of CPU time is wasted on administrative overhead.

To improve CPU efficiency, we could set the time slice to 5000 ms, so that only about 0.1% of the time is wasted. But consider a time-sharing system where 10 interactive users press the Enter key at almost the same moment. If every other process uses up its full time slice (5000 ms = 5 s), the users at the back of the queue must wait tens of seconds for their turn. Most users cannot tolerate a short command taking that long to respond. The same problem arises on a personal computer running multiple programs.

The conclusion: a time slice that is too short causes excessive process switching and reduces CPU efficiency, while one that is too long hurts responsiveness to short interactive requests. Setting the time slice to around 100 ms is usually a reasonable compromise.
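To make the arithmetic above concrete, here is a toy round-robin simulation (a sketch, not how a real kernel schedules): three hypothetical tasks with made-up durations, a 20 ms slice, and a 5 ms switch cost, reproducing the roughly 20% overhead figure:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy round-robin simulation: a FIFO ready queue, a fixed time slice,
// and a fixed context-switch cost charged on every switch.
public class RoundRobinSim {
    public static void main(String[] args) {
        int slice = 20, switchCost = 5;               // ms, as in the example above
        Deque<int[]> ready = new ArrayDeque<>();      // each entry: {id, remaining ms}
        ready.add(new int[]{1, 50});
        ready.add(new int[]{2, 30});
        ready.add(new int[]{3, 70});
        long useful = 0, overhead = 0;
        while (!ready.isEmpty()) {
            int[] p = ready.pollFirst();              // take the head of the ready queue
            int ran = Math.min(slice, p[1]);          // run for one slice (or until done)
            useful += ran;
            p[1] -= ran;
            if (p[1] > 0) ready.addLast(p);           // slice used up: back of the queue
            if (!ready.isEmpty()) overhead += switchCost; // pay the switch cost
        }
        System.out.printf("useful=%dms overhead=%dms (%.0f%% of CPU time wasted)%n",
                useful, overhead, 100.0 * overhead / (useful + overhead));
    }
}
```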

Process scheduling

Reading through these concepts, a problem gradually emerges: the unit of time-slice round-robin scheduling is the process. Doesn't that conflict with the idea that the thread is the smallest unit of processor scheduling?

 

In everyday development, it never feels as if the number of CPU cores limits us (at most 8 threads can run in parallel, yet we routinely run 3000+ threads at once and think nothing of it. Why?). This is because the operating system provides the CPU time-slice round-robin mechanism.

Suppose we have two candidate schemes for handling these 3000+ threads.

  1. Hand the 8 cores directly to the 3000+ threads, that is, let all threads compete for CPU execution rights at the same time.

What problems would this implementation cause? It could lead to thread starvation: if some threads are unlucky and cannot grab CPU execution rights for a long time, the process, as users experience it, will freeze frequently.

  2. Treat the process as the unit and allocate the CPU to one process's threads at a time. Only the threads within that process compete for CPU execution rights. This greatly reduces the probability of thread starvation and lets all CPU cores concentrate on the tasks of a single process.

What are the benefits of doing this?

  1. It significantly reduces the problem of thread starvation.
  2. It makes the number of CPU cores available to a process predictable, so both CPU-intensive and IO-intensive services can derive a reasonable thread pool size from the core count, as the sketch below shows.
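The sketch below uses common sizing heuristics (rules of thumb, not laws; your workload may call for other numbers): roughly cores + 1 threads for CPU-intensive pools, so every core stays busy, and roughly cores * 2 for IO-intensive pools, since those threads spend much of their time blocked:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors(); // logical cores
        // Heuristics only:
        //   CPU-intensive: about cores + 1, keeping every core busy
        //   IO-intensive:  about cores * 2, since threads often block on IO
        ExecutorService cpuBound = Executors.newFixedThreadPool(cores + 1);
        ExecutorService ioBound  = Executors.newFixedThreadPool(cores * 2);
        cpuBound.shutdown();
        ioBound.shutdown();
    }
}
```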

It is out of these considerations that the concept of scheduling in units of processes was introduced.

We know that threads preempt CPU cores to process tasks. So how are CPU resources allocated among processes?

The system maintains a list of ready processes, which is in fact a first-in-first-out queue. A new process is appended to the tail of the queue. On each scheduling decision, the process at the head of the queue is selected and allowed to run on the CPU for one time slice. If the slice is exhausted and the process is still running, the scheduler stops it and moves it to the tail of the queue; the CPU is taken away and handed to the process now at the head. If the process blocks or finishes before its slice ends, the CPU switches immediately.

Compared with handing the 8 cores directly to the 3000+ threads, maintaining this process queue inevitably costs extra context switches. But weighed against the advantages above, the price is well worth paying.

To sum up

Finally, a quick recap. Back to our original question: how does one CPU keep all programs running simultaneously in the eyes of users?

First, at the process level, the CPU divides a long stretch of time into many time slices, and within that stretch all processes execute concurrently (switching back and forth), lined up in a queue.

Once the CPU is allocated to a process, hyper-threading lets it execute cores * 2 threads in parallel. So if a program wants to reduce frequent switching of CPU cores between the threads of one process, the number of threads in the process should be set to no more than the machine's core count * 2.

 


Origin: blog.csdn.net/weixin_47184173/article/details/115107534