Linux: Program address space and common page replacement algorithms

Research background and definition

Environment: 32-bit platform

Before understanding the program address space, we need to know:

Address: a memory address is a number assigned to each byte of memory, the byte being the unit of storage; a given address identifies one specific memory unit.

Program: a program is really just a collection of code saved in a file (on disk).

The amount of physical memory in a computer is fixed: it is whatever is installed in the memory slots on the motherboard, and the CPU can address it directly. The size of the addressable space, however, depends on the number of CPU address lines. On a 32-bit system the linear address space is 4 GB (2^32 bytes). This 4 GB is generally split 3:1: each user process gets 3 GB, and the kernel keeps the remaining 1 GB for itself.
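The arithmetic behind the 3:1 split can be checked with a short sketch. The constant and function names below are illustrative only and are not taken from any kernel source; the sketch simply assumes the user portion occupies the low 3 GB of the address space.

```python
ADDRESS_BITS = 32                  # width of a linear address on the platform discussed here
GIB = 2 ** 30                      # one gigabyte in bytes

TOTAL_BYTES = 2 ** ADDRESS_BITS    # size of the whole linear address space
USER_BYTES = TOTAL_BYTES * 3 // 4  # 3:1 split: low three quarters for the user process
KERNEL_BYTES = TOTAL_BYTES - USER_BYTES


def is_user_address(addr: int) -> bool:
    """Return True if addr lies in the user part of the address space (assumed to be the low 3 GB)."""
    return 0 <= addr < USER_BYTES


if __name__ == "__main__":
    print(TOTAL_BYTES // GIB, USER_BYTES // GIB, KERNEL_BYTES // GIB)
    print(hex(USER_BYTES))  # first address that belongs to the kernel
```

Under this assumption the boundary between user space and kernel space falls at 0xC0000000.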

Virtual Memory

When the compiler generates an executable file, it assigns an address to every instruction and every piece of data. When the program runs, the instructions and data are placed at those addresses in memory. The CPU then follows the addresses, executing instructions step by step and fetching the corresponding data for processing.
(Note: memory is occupied only once the program is running, which is why the program address space is usually called the process address space.)
Process address space: this is in fact a virtual address space, a "fake" address space that the operating system describes for each process through an mm_struct structure.
So why does the operating system make processes use virtual addresses instead of letting them access physical memory directly? What goes wrong if a process accesses physical memory directly?

1. The compiler assigns addresses to instructions and data at compile time. If the memory at one of those addresses is already occupied, the program cannot run, so address management in the compiler becomes troublesome.
2. If a process accesses physical memory directly, a wild pointer may overwrite data belonging to other processes while it runs (no memory access control is possible).
3. Loading and running a program usually requires a contiguous block of memory, so memory utilization is relatively low.
Note: virtual addresses have no storage capacity of their own; the data still has to be stored in physical memory.

Three methods of managing memory in the operating system:

1. Segmentation

Definition: Divide the address space of the user program into several segments according to their logical role (code, data, stack, and so on); segments may differ in size. Memory is allocated segment by segment, which gives discrete allocation.
A virtual address consists of a segment number plus an offset within the segment. For example, the many variables in the global data segment all share the same segment number and differ only in their offset from the start of that segment. The starting physical address of the segment identified by the segment number, plus the offset taken from the virtual address, forms the complete physical address of the memory unit.
Mapping logic: parse the segment number out of the virtual address, use it to find the segment table entry in the segment table, take the segment's starting physical address from that entry, and add the offset within the segment.
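The mapping logic for segmentation can be sketched as follows. This is a minimal illustration, not how any particular CPU lays out its addresses: the 16-bit offset width, the Segment type with its limit field, and the function name are all assumptions made for the example.

```python
from typing import Dict, NamedTuple

OFFSET_BITS = 16  # assumed: low 16 bits are the offset, the rest is the segment number


class Segment(NamedTuple):
    base: int   # starting physical address of the segment
    limit: int  # segment length in bytes (hypothetical field, used for a bounds check)


def translate_segmented(vaddr: int, segment_table: Dict[int, Segment]) -> int:
    """Translate a virtual address to a physical one via the segment table."""
    seg_no = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if seg_no not in segment_table:
        raise LookupError(f"no segment table entry for segment {seg_no}")
    segment = segment_table[seg_no]
    if offset >= segment.limit:
        raise ValueError(f"offset {offset:#x} is outside segment {seg_no}")
    return segment.base + offset  # segment start + offset within the segment


if __name__ == "__main__":
    table = {0: Segment(base=0x10000, limit=0x400),
             1: Segment(base=0x50000, limit=0x2000)}
    print(hex(translate_segmented((1 << OFFSET_BITS) | 0x10, table)))
```

The bounds check is what lets different segments receive different protection: an offset past the end of a segment is rejected instead of silently reaching into a neighbouring one.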

Advantages: segments can be written and compiled separately, different kinds of segments can be given different protection, and segments can be shared, including code sharing through dynamic linking. Address management is also easier for the compiler.
Disadvantages: fragmentation occurs, and the low memory utilization caused by storing data contiguously is not solved.

2. Paging

Definition: Divide the address space of the user program into several fixed-size areas called "pages". Physical memory is divided into blocks of the same size, and any page of the user program can be placed in any physical block, which gives discrete allocation.
How the page table maps a virtual address to a physical address:
Page table entry: page number, physical address, page-fault (present) bit, ...
Virtual address composition: page number + offset within the page
Mapping logic: take the virtual address, parse out the page number, use the page number to find the page table entry in the page table, take the physical address from that entry, and add the offset within the page.
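The mapping logic for paging can be sketched in the same way. The 4 KiB page size, the PageTableEntry fields, and the PageFault exception are assumptions chosen for the illustration; real page table entries carry more bits than this.

```python
from typing import Dict, NamedTuple

PAGE_BITS = 12             # assumed page size of 4 KiB
PAGE_SIZE = 1 << PAGE_BITS


class PageTableEntry(NamedTuple):
    frame: int     # number of the physical block holding the page
    present: bool  # False means the page is not in memory


class PageFault(Exception):
    """Raised when the page is missing from the table or not present in memory."""


def translate_paged(vaddr: int, page_table: Dict[int, PageTableEntry]) -> int:
    """Translate a virtual address to a physical one via the page table."""
    page_no = vaddr >> PAGE_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    entry = page_table.get(page_no)
    if entry is None or not entry.present:
        raise PageFault(f"page {page_no} is not in memory")
    return entry.frame * PAGE_SIZE + offset  # block start + offset within the page


if __name__ == "__main__":
    table = {0: PageTableEntry(frame=5, present=True),
             1: PageTableEntry(frame=0, present=False)}
    print(hex(translate_paged(0x0ABC, table)))
```

Because every page and every physical block has the same size, the lookup needs no bounds check: any offset within a page is valid.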
Advantages: (1) Paging produces no external fragmentation, the memory occupied by a process does not have to be contiguous, and a virtual page of a process can be kept on disk while it is not needed.
(2) Paging allows sharing at a fine granularity, namely page sharing: it only takes a matching record in the page table entry for the shared page.
Disadvantage: the page table can be very large and take up a lot of memory.

3. Segmented paging

Segmented paging combines segmentation and paging: the virtual address space is first managed in segments, and each segment is then managed in pages. This is the memory management method used in practice.
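One way to picture the combination is a two-level lookup: the segment number selects a page table, and the page number selects a physical block inside it. The sketch below assumes a virtual address made of segment number, page number, and page offset with the bit widths shown; the layout and all names are illustrative, not taken from a real architecture.

```python
from typing import Dict

PAGE_BITS = 12     # assumed: 4 KiB pages
PAGE_NO_BITS = 10  # assumed: up to 1024 pages per segment


def translate_segmented_paged(vaddr: int,
                              segment_table: Dict[int, Dict[int, int]]) -> int:
    """Translate segment number + page number + offset into a physical address.

    segment_table maps a segment number to that segment's page table,
    and each page table maps a page number to a physical block number.
    """
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    page_no = (vaddr >> PAGE_BITS) & ((1 << PAGE_NO_BITS) - 1)
    seg_no = vaddr >> (PAGE_BITS + PAGE_NO_BITS)
    page_table = segment_table[seg_no]  # first level: pick the segment
    frame = page_table[page_no]         # second level: pick the page within it
    return (frame << PAGE_BITS) | offset


if __name__ == "__main__":
    table = {2: {3: 7}}  # segment 2, page 3 lives in physical block 7
    vaddr = (2 << (PAGE_BITS + PAGE_NO_BITS)) | (3 << PAGE_BITS) | 0x45
    print(hex(translate_segmented_paged(vaddr, table)))
```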

Page faults

Definition: a page fault is triggered when the page table entry has been found through the virtual address, but the data originally stored at that address is no longer in memory.
So why would a piece of data no longer be in memory?
When physical memory runs short, the operating system picks a block of physical memory according to some algorithm and saves its contents to the swap area on disk. When a page fault is later triggered for that data, the original contents are swapped back into memory.

Solution: page replacement algorithms

First, be clear about the goal: when choosing a page to replace, the main aim is to reduce the number or probability of later page faults. The chosen page should therefore be one that will not be accessed for a long time, ideally one that will never be accessed again. Where possible, it is also best to choose a page that has not been modified, so that its contents do not have to be written back to disk during replacement, which makes handling the page fault faster. Some common page replacement algorithms follow.

1. Optimal replacement algorithm

This is a theoretical algorithm proposed by Belady in 1966. The idea is to choose, from the pages already in memory, the one that will not be used for the longest time in the future, and replace it. It gives the best performance of any algorithm, but it requires perfect knowledge of future accesses, which cannot be obtained in practice, so it cannot actually be implemented. Its only use is as a yardstick for measuring other algorithms.
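Although the optimal algorithm cannot run inside a real system, it can be simulated offline when the whole reference string is known in advance, which is how it serves as a yardstick. The sketch below is one such simulation; the function name and the list-based representation of memory are choices made for the example.

```python
from typing import List, Sequence


def opt_faults(refs: Sequence[int], frames: int) -> int:
    """Count page faults for the optimal policy on a fully known reference string."""
    memory: List[int] = []
    faults = 0
    for i, page in enumerate(refs):
        if page in memory:
            continue  # hit: nothing to do
        faults += 1
        if len(memory) < frames:
            memory.append(page)  # a free frame is still available
            continue

        def next_use(resident: int) -> float:
            # Position of the next reference to a resident page, or infinity if none.
            for j in range(i + 1, len(refs)):
                if refs[j] == resident:
                    return j
            return float("inf")

        victim = max(memory, key=next_use)  # the page needed furthest in the future
        memory[memory.index(victim)] = page
    return faults


if __name__ == "__main__":
    refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
    print(opt_faults(refs, 3))
```

On this reference string with three frames the simulation counts 9 page faults, a lower bound that the practical algorithms can be compared against.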

2. First-in-first-out (FIFO) page replacement algorithm

The FIFO algorithm is the earliest replacement algorithm. Its core idea is to always evict the page that entered memory first, that is, the page that has been resident in memory the longest. The algorithm is simple to implement: link the pages a process has loaded into memory into a queue in order of arrival, and keep a pointer, called the replacement pointer, that always points to the oldest page.
Disadvantages: the algorithm does not match how processes actually behave. Some pages are accessed frequently throughout a process's life, such as pages containing global variables or commonly used functions and routines, and FIFO does nothing to keep these pages from being evicted.
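The queue described above can be sketched with a double-ended queue whose left end plays the role of the replacement pointer. The function name and the extra set used for fast membership checks are choices made for this example.

```python
from collections import deque
from typing import Deque, Sequence, Set


def fifo_faults(refs: Sequence[int], frames: int) -> int:
    """Count page faults for FIFO replacement with the given number of frames."""
    queue: Deque[int] = deque()  # left end is the oldest resident page
    resident: Set[int] = set()   # same pages as in the queue, for quick lookup
    faults = 0
    for page in refs:
        if page in resident:
            continue  # a hit does not change the arrival order
        faults += 1
        if len(queue) == frames:
            oldest = queue.popleft()  # evict the page that arrived first
            resident.discard(oldest)
        queue.append(page)
        resident.add(page)
    return faults


if __name__ == "__main__":
    refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
    print(fifo_faults(refs, 3))
```

On this reference string with three frames the simulation counts 15 page faults. Page 0 is evicted several times even though it is referenced again shortly afterwards, which is exactly the weakness described above.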

3. LRU (Least Recently Used) replacement algorithm

The least recently used (LRU) page replacement algorithm decides based on how each page has been used since it was loaded into memory. Its core idea is to evict the page that has gone unvisited for the longest time, on the assumption that a page not visited in the recent past is unlikely to be visited in the near future. The algorithm gives each page an access field that records the time elapsed since the page was last accessed. When a page must be evicted, the resident page with the largest value is chosen.
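A compact way to simulate LRU is to keep the resident pages ordered by recency instead of storing an elapsed time per page; the least recently used page is then simply the first one in the order. This swaps the access field described above for an equivalent ordering, and the function name is an assumption of the example.

```python
from collections import OrderedDict
from typing import Sequence


def lru_faults(refs: Sequence[int], frames: int) -> int:
    """Count page faults for LRU replacement with the given number of frames."""
    recency: "OrderedDict[int, None]" = OrderedDict()  # oldest access first
    faults = 0
    for page in refs:
        if page in recency:
            recency.move_to_end(page)  # hit: the page becomes the most recently used
            continue
        faults += 1
        if len(recency) == frames:
            recency.popitem(last=False)  # evict the least recently used page
        recency[page] = None
    return faults


if __name__ == "__main__":
    refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
    print(lru_faults(refs, 3))
```

On this reference string with three frames the simulation counts 12 page faults.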

4. Least Frequently Used (LFU) replacement algorithm

Like LRU, the LFU algorithm uses the recent past to predict the near future. Here the criterion is the number of uses in the recent period: the page that has been used least often in that period is chosen for replacement. With LFU, a shift register is kept for each page in memory to record how often the page is accessed (its access count).
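The selection rule can be sketched with a plain access counter per resident page. Note that this is a simplification: it uses an integer count instead of the shift register mentioned above, counts accesses since the page was loaded, and breaks ties in favour of evicting the page loaded earliest. All names are illustrative.

```python
from typing import Dict, Sequence, Set, Tuple


def lfu_simulate(refs: Sequence[int], frames: int) -> Tuple[int, Set[int]]:
    """Return (page faults, pages resident at the end) for counter-based LFU."""
    counts: Dict[int, int] = {}  # resident page -> accesses since it was loaded
    faults = 0
    for page in refs:
        if page in counts:
            counts[page] += 1  # hit: bump the access count
            continue
        faults += 1
        if len(counts) == frames:
            # min() returns the first page with the smallest count; dicts keep
            # insertion order, so ties go to the page loaded earliest.
            victim = min(counts, key=counts.get)
            del counts[victim]
        counts[page] = 1
    return faults, set(counts)


if __name__ == "__main__":
    print(lfu_simulate([1, 1, 2, 3], 2))
```

In the small run above, page 1 has been accessed twice and page 2 only once when page 3 arrives, so page 2 is the one evicted.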

5. Simple Clock replacement algorithm (not recently used, NRU)

The simple CLOCK algorithm sets an access bit, also known as the use bit, for each page, and links all pages in memory into a circular queue through link pointers. When a page is first loaded into main memory, the access bit of its frame is set to 1; whenever the page is accessed afterwards, the access bit is set to 1 again.
To find a page to replace, the algorithm walks the circular queue and checks each access bit. The first page found with an access bit of 0 is replaced. A page with an access bit of 1 is given a second chance: its access bit is reset to 0 and the search continues with the next page.

Because the algorithm checks the usage of each page in a cycle, it is called the Clock algorithm.
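The circular scan can be sketched with a fixed list of frames and a hand index that wraps around. The function name and the list-based circular queue are choices made for this example; empty frames start with an access bit of 0, so they are filled before anything is evicted.

```python
from typing import List, Optional, Sequence, Tuple


def clock_simulate(refs: Sequence[int], frames: int) -> Tuple[int, List[Optional[int]]]:
    """Return (page faults, final frame contents) for the simple Clock algorithm."""
    slots: List[Optional[int]] = [None] * frames  # the circular queue of frames
    use_bit = [0] * frames                        # access bit of each frame
    hand = 0                                      # position of the clock hand
    faults = 0
    for page in refs:
        if page in slots:
            use_bit[slots.index(page)] = 1  # hit: mark the page as recently used
            continue
        faults += 1
        while use_bit[hand] == 1:
            use_bit[hand] = 0               # second chance: clear the bit, move on
            hand = (hand + 1) % frames
        slots[hand] = page                  # first frame with access bit 0 is replaced
        use_bit[hand] = 1
        hand = (hand + 1) % frames
    return faults, slots


if __name__ == "__main__":
    print(clock_simulate([1, 2, 3, 4, 2, 5], 3))
```

In this run, page 2 is accessed again just before page 5 arrives, so the hand clears its access bit and passes over it, and page 3 is replaced instead.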


Origin blog.csdn.net/qq_43825377/article/details/113803966