[Linux Basics] Stop saying you don't understand Linux memory management: 10 diagrams lay it out clearly

The following article comes from Back-end Technology School, written by LemonCoder.

Premise: the technical content of this article assumes a 32-bit Linux system on the x86 architecture.

Virtual address


Even on modern systems, memory is still a very precious resource in a computer. Just compare the capacity of the solid-state drive in your computer with the size of its memory.

To make full use of system memory and manage it well, Linux uses virtual memory management, which gives each process its own 4GB virtual address space that does not interfere with any other process.

A process is initially allocated memory, and operates, in terms of these "virtual addresses". Only when the process actually needs to access the memory is the mapping between the virtual address and a physical address established and a physical memory page brought in.

To use an imperfect analogy, the principle is the same as today's XX network disk. If your network disk quota is 1TB, do you really think it hands you that much space in one go? That would be naive. Space is allocated only when you put things in it: you get as much real space as you actually store, yet to you and your friends it looks as if everyone has 1TB.
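The "allocate only on first access" idea can be sketched in a few lines of Python. This is a toy model, not how the kernel is implemented: the class name LazyAddressSpace, its methods, and the 4KB page size are assumptions made for illustration, and a bytearray stands in for a physical page frame.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class LazyAddressSpace:
    """Toy model: a large virtual space whose pages get backing only when touched."""

    def __init__(self, size):
        self.size = size
        self.page_table = {}   # virtual page number -> bytearray (stand-in for a frame)
        self.page_faults = 0

    def _frame_for(self, vaddr):
        if not 0 <= vaddr < self.size:
            raise IndexError("address outside the virtual address space")
        vpn = vaddr // PAGE_SIZE
        if vpn not in self.page_table:
            # First touch of this page: simulate the page fault and back it now.
            self.page_faults += 1
            self.page_table[vpn] = bytearray(PAGE_SIZE)
        return self.page_table[vpn]

    def write(self, vaddr, value):
        self._frame_for(vaddr)[vaddr % PAGE_SIZE] = value

    def read(self, vaddr):
        return self._frame_for(vaddr)[vaddr % PAGE_SIZE]

    def resident_bytes(self):
        return len(self.page_table) * PAGE_SIZE


space = LazyAddressSpace(4 * 1024 ** 3)   # 4GB of virtual addresses
space.write(0x1000, 7)                    # first touch of this page
space.write(0x1004, 9)                    # same page, no new fault
print(space.page_faults, space.resident_bytes())
```

The process "sees" 4GB, but the model only pays for the pages that were actually touched, which is the network disk analogy in code form.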

The benefits of virtual addresses:

  • Prevent users from directly accessing physical memory addresses, which prevents some destructive operations and protects the operating system;

  • Each process is allocated 4GB of virtual memory, so user programs can use an address space larger than the actual physical memory.

The 4GB process virtual address space is divided into two parts: "user space" and "kernel space":

Figure: User space and kernel space

Physical address


From the section above, we already know that both user space and kernel space use virtual addresses. When a process actually needs to access memory, the kernel's "demand paging" mechanism raises a "page fault exception" and brings in a physical memory page.

Converting a virtual address into a physical memory address relies on the MMU (Memory Management Unit), which performs segmentation and paging (segmented paging) address translation on the virtual address. The details of segmentation and paging are not covered here; refer to any textbook on computer organization.
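As a rough illustration of the paging half of that translation, the sketch below splits a 32-bit virtual address into the three fields used by classic two-level x86 paging with 4KB pages. The article itself does not state this split; the 10/10/12 bit layout (no PAE, no large pages) and the function name split_virtual_address are assumptions for illustration.

```python
def split_virtual_address(vaddr):
    """Split a 32-bit address into (directory index, table index, page offset).

    Assumes classic two-level x86 paging with 4KB pages:
    10 bits of page directory index, 10 bits of page table index,
    12 bits of offset inside the page.
    """
    if not 0 <= vaddr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    directory_index = vaddr >> 22
    table_index = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    return directory_index, table_index, offset


print(split_virtual_address(0xC0000001))
```

The MMU uses the first field to pick a page table, the second to pick a page frame, and the third as the byte offset within that frame.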

Figure: Segmented-paging memory management address translation

The Linux kernel divides physical memory into 3 management zones:

  1.  ZONE_DMA : DMA memory zone. Contains the memory page frames between 0MB and 16MB, which old ISA-based devices can use for DMA and which are directly mapped into the kernel's address space.
  2.  ZONE_NORMAL : Normal memory zone. Contains the memory page frames between 16MB and 896MB, regular page frames that are directly mapped into the kernel's address space.
  3.  ZONE_HIGHMEM : High memory zone. Contains the memory page frames above 896MB, which are not directly mapped. These page frames can be accessed through permanent mappings and temporary mappings.
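The zone boundaries listed above can be written down as a small lookup. This is only a sketch of the arithmetic using the 16MB and 896MB limits from the list; the table ZONE_LIMITS and the function zone_of are hypothetical names, not kernel APIs.

```python
MB = 1024 * 1024

# (zone name, first byte, one past the last byte); None means "no upper bound here".
ZONE_LIMITS = [
    ("ZONE_DMA", 0, 16 * MB),
    ("ZONE_NORMAL", 16 * MB, 896 * MB),
    ("ZONE_HIGHMEM", 896 * MB, None),
]


def zone_of(phys_addr):
    """Return the name of the zone whose range contains this physical address."""
    if phys_addr < 0:
        raise ValueError("physical address must be non-negative")
    for name, start, end in ZONE_LIMITS:
        if phys_addr >= start and (end is None or phys_addr < end):
            return name
    raise AssertionError("zone table does not cover this address")


for addr in (0x00100000, 0x10000000, 0x40000000):
    print(hex(addr), zone_of(addr))
```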

                                          

Figure: Physical memory zones

User space


A user process can access "user space". Each process has its own independent user space, whose virtual addresses range from 0x00000000 to 0xBFFFFFFF, 3GB in total.

A user process normally can only access virtual addresses in user space; it can reach kernel space only by trapping into the kernel, for example through a system call.
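The 3GB/1GB split is easy to check with a few lines of arithmetic. The user range is the one quoted above; the kernel range 0xC0000000 to 0xFFFFFFFF is the one given in the kernel space section of this article. The function name space_of is made up for this sketch.

```python
GB = 1 << 30

USER_START, USER_END = 0x00000000, 0xBFFFFFFF       # inclusive bounds
KERNEL_START, KERNEL_END = 0xC0000000, 0xFFFFFFFF   # inclusive bounds


def space_of(vaddr):
    """Classify a 32-bit virtual address as belonging to user or kernel space."""
    if USER_START <= vaddr <= USER_END:
        return "user"
    if KERNEL_START <= vaddr <= KERNEL_END:
        return "kernel"
    raise ValueError("not a 32-bit virtual address")


user_size = USER_END - USER_START + 1
kernel_size = KERNEL_END - KERNEL_START + 1
print(user_size // GB, kernel_size // GB)
```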

Process and memory

The user space occupied by a process (a program in execution) is divided into 5 memory areas, following the principle that address ranges with the same access attributes are stored together. Access attributes here mean "readable, writable, executable" and so on.

  • Code segment

The code segment stores the executable file's machine instructions; it is the in-memory image of the executable program. Because the code segment must not be illegally modified at runtime, it allows only read operations and is not writable.

  • Data segment

The data segment stores the initialized global variables of the executable file; in other words, it holds the program's statically allocated variables and global variables.

  • BSS segment

The BSS segment contains the program's uninitialized global variables; in memory, the whole BSS segment is set to zero.

  • Heap

The heap holds memory that is dynamically allocated while the process runs. Its size is not fixed and can grow or shrink dynamically. When the process calls malloc or similar functions to allocate memory, the newly allocated memory is added to the heap (the heap expands); when free or similar functions release memory, the released memory is removed from the heap (the heap shrinks).

  • Stack

The stack stores the local variables that the program creates temporarily, that is, variables defined inside functions (but not variables declared static, since static means the variable is stored in the data segment). In addition, when a function is called, its parameters are pushed onto the stack of the calling process, and after the call ends the return value may also be passed back through the stack. Because the stack is last-in, first-out, it is particularly convenient for saving and restoring the calling context. In this sense, we can regard the stack as a memory area for storing and exchanging temporary data.

Of the memory areas above, the data segment, BSS segment, and heap are usually stored contiguously in memory, while the code segment and the stack are often stored separately. On the i386 architecture, the stack grows downward and the heap grows upward, so the two grow toward each other.

                                                                        

You can also use the size command on Linux to view the size of each memory area of a compiled program:

[lemon ~]# size /usr/local/sbin/sshd
   text	   data	    bss	    dec	    hex	filename
1924532	  12412	 426896	2363840	 2411c0	/usr/local/sbin/sshd
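To read that output: dec is the sum of the text, data, and bss columns, and hex is the same total in hexadecimal. The short Python sketch below parses the sample output shown above and checks both relations; the helper name parse_size_output is hypothetical, and the parser assumes a single data row and a file name without spaces.

```python
SIZE_OUTPUT = """\
   text\t   data\t    bss\t    dec\t    hex\tfilename
1924532\t  12412\t 426896\t2363840\t 2411c0\t/usr/local/sbin/sshd
"""


def parse_size_output(output):
    """Parse the two-line output of `size` into a dict keyed by column name."""
    header, row = output.strip().splitlines()
    record = dict(zip(header.split(), row.split()))
    for key in ("text", "data", "bss", "dec"):
        record[key] = int(record[key])
    record["hex"] = int(record["hex"], 16)   # same total, printed in base 16
    return record


sizes = parse_size_output(SIZE_OUTPUT)
print(sizes["text"] + sizes["data"] + sizes["bss"] == sizes["dec"])
print(sizes["hex"] == sizes["dec"])
```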

 

Kernel space


On a 32-bit x86 system, the Linux kernel address space is the high part of the virtual address space, from 0xC0000000 to 0xFFFFFFFF, 1GB in total. It contains the kernel image, the physical page tables, drivers, and everything else that runs in kernel space.

                                                    

Figure: Subdivision of kernel space

Direct mapping area

Direct mapping region: starting from the start address of kernel space, the first 896MB of the kernel address range (its maximum size) is the direct memory mapping area.

The 896MB of "linear addresses" in the direct mapping area map directly onto the first 896MB of "physical addresses", which means the linear addresses and the physical addresses assigned to them are both contiguous. For example, the physical address corresponding to linear address 0xC0000001 in kernel space is 0x00000001; the two differ by the offset PAGE_OFFSET = 0xC0000000.

In this area, linear and physical addresses are related by a linear conversion: "linear address = PAGE_OFFSET + physical address". You can also use the virt_to_phys() function to convert a linear address in kernel virtual space into a physical address.
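The conversion is plain addition and subtraction, which the following Python sketch spells out. It only mirrors the arithmetic; the names virt_to_phys_sketch and phys_to_virt_sketch are made up here and are not the kernel functions themselves, and the range checks reflect the 896MB direct mapping described above.

```python
PAGE_OFFSET = 0xC0000000
DIRECT_MAP_SIZE = 896 * 1024 * 1024   # size of the direct mapping area


def virt_to_phys_sketch(vaddr):
    """linear address -> physical address, valid only inside the direct mapping."""
    if not PAGE_OFFSET <= vaddr < PAGE_OFFSET + DIRECT_MAP_SIZE:
        raise ValueError("address is not in the direct mapping area")
    return vaddr - PAGE_OFFSET


def phys_to_virt_sketch(paddr):
    """physical address -> linear address, valid only for the first 896MB."""
    if not 0 <= paddr < DIRECT_MAP_SIZE:
        raise ValueError("physical address is not directly mapped")
    return PAGE_OFFSET + paddr


print(hex(virt_to_phys_sketch(0xC0000001)))
print(hex(phys_to_virt_sketch(0x00000001)))
```

Addresses outside the direct mapping are rejected on purpose: for them no fixed offset exists, which is exactly why the remaining kernel address range is handled differently.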

High memory linear address space

The kernel-space linear addresses from 896MB up to 1GB, a 128MB range, form the high memory linear address space. Why is it called that? Let me explain:

As mentioned earlier, kernel space is 1GB in total, and the 896MB of linear addresses starting at the beginning of kernel space are directly mapped onto a physical address range 896MB in size.

Taking it to the extreme, even if all 1GB of linear addresses in kernel space were mapped to physical addresses, the kernel could still address at most 1GB of physical memory.

Therefore, kernel space sets aside its last 128MB of addresses and divides them into the following three high memory mapping areas, so that the entire physical address range can be addressed. 64-bit systems do not have this problem, because the available linear address space is far larger than the memory that can be installed.
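The numbers in this argument can be checked directly. The sketch below derives the 128MB window and its start address from the figures in the text, and shows how much physical memory the direct mapping alone cannot cover; the constant and function names are chosen for this example only.

```python
MB = 1 << 20
GB = 1 << 30

PAGE_OFFSET = 0xC0000000          # start of kernel space
KERNEL_SPACE_SIZE = 1 * GB
DIRECT_MAP_SIZE = 896 * MB

# What is left of kernel space after the direct mapping area.
HIGH_WINDOW_START = PAGE_OFFSET + DIRECT_MAP_SIZE
HIGH_WINDOW_SIZE = KERNEL_SPACE_SIZE - DIRECT_MAP_SIZE


def unreachable_by_direct_map(installed_bytes):
    """Physical memory that the direct mapping alone cannot cover."""
    return max(0, installed_bytes - DIRECT_MAP_SIZE)


print(hex(HIGH_WINDOW_START), HIGH_WINDOW_SIZE // MB)
print(unreachable_by_direct_map(4 * GB) // MB)
```

With 4GB installed, 3200MB of physical memory lies beyond the direct mapping, and the kernel has only a 128MB window of linear addresses through which to reach it, hence the need to map it dynamically.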

Dynamic memory mapping area

vmalloc region: this region is allocated by the kernel function vmalloc. Its characteristic is that the linear address space is contiguous while the corresponding physical address space is not necessarily contiguous. The physical pages behind linear addresses allocated by vmalloc may be in low memory or in high memory.

Permanent memory mapped area

Persistent kernel mapping region: this area can access high memory. To do so, allocate a high memory page with alloc_page(__GFP_HIGHMEM), or use the kmap function to map allocated high memory into this area.

Fixed mapping area

Fixed kernel mapping region: only a 4KB guard gap separates this area from the top of the 4GB address space. Each address entry serves a specific purpose, such as ACPI_BASE.

                                    

Figure: Mapping between kernel space and physical memory

Recap


That was a lot of material. Before hurrying on to the next section, let's review it. If you have read the sections above carefully, the picture I have drawn here should match the global view of memory management you now have in mind.

Figure: Full map of kernel space and user space

Memory data structure


To manage the system's virtual memory, the kernel has to abstract memory management data structures from it. Memory management operations such as allocation and release all operate on these data structures. Here are two data structures that manage virtual memory areas.

User space memory data structure

In the earlier "Process and memory" section we said that a Linux process can be divided into 5 memory areas: code segment, data segment, BSS, heap, and stack. The kernel manages these areas by abstracting each of them into a vm_area_struct memory management object.

vm_area_struct is the basic management unit that describes a process's address space. A process usually needs several vm_area_structs to describe its user-space virtual addresses, and they are organized with both a "linked list" and a "red-black tree".

The linked list is used when all nodes need to be traversed, while the red-black tree is suited to locating a specific memory area in the address space. Both data structures are used together so that the kernel gets high performance for every kind of operation on memory areas.
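The two access patterns, "walk every area" and "find the area that contains this address", can be sketched in Python. This is not the kernel's implementation: a sorted list searched with bisect stands in for the red-black tree (both give logarithmic lookup), plain iteration stands in for the linked list, and the names Vma and AddressSpace as well as the example addresses are assumptions for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Vma:
    start: int   # first address of the area (inclusive)
    end: int     # one past the last address (exclusive)
    name: str


class AddressSpace:
    """Non-overlapping areas kept sorted by start address."""

    def __init__(self):
        self._vmas = []
        self._starts = []   # parallel list of start addresses for bisect

    def insert(self, vma):
        if vma.start >= vma.end:
            raise ValueError("empty or inverted area")
        i = bisect_right(self._starts, vma.start)
        if i > 0 and self._vmas[i - 1].end > vma.start:
            raise ValueError("overlaps the previous area")
        if i < len(self._vmas) and vma.end > self._vmas[i].start:
            raise ValueError("overlaps the next area")
        self._vmas.insert(i, vma)
        self._starts.insert(i, vma.start)

    def find(self, addr):
        """Logarithmic lookup of the area containing addr, or None."""
        i = bisect_right(self._starts, addr) - 1
        if i >= 0 and addr < self._vmas[i].end:
            return self._vmas[i]
        return None

    def __iter__(self):
        """In-order traversal of all areas, the job the linked list does."""
        return iter(self._vmas)


mm = AddressSpace()
mm.insert(Vma(0xBFFDF000, 0xC0000000, "stack"))   # hypothetical addresses
mm.insert(Vma(0x08048000, 0x08050000, "text"))
mm.insert(Vma(0x09000000, 0x09021000, "heap"))
print([v.name for v in mm], mm.find(0x09000010).name, mm.find(0x0A000000))
```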

Address management model of user space process:

                                               

Figure: vm_area_struct

Kernel space dynamic memory allocation data structure

In the kernel space section we mentioned the "dynamic memory mapping area". This area is allocated by the kernel function vmalloc; the linear address space is contiguous, but the corresponding physical address space is not necessarily contiguous. The physical pages behind linear addresses allocated by vmalloc may be in low memory or in high memory.

Addresses allocated by vmalloc are confined to the range between VMALLOC_START and VMALLOC_END. Each block of kernel virtual memory allocated by vmalloc corresponds to one vm_struct structure, and a 4KB free guard gap separates neighboring blocks of kernel virtual addresses to catch out-of-bounds accesses.

Like user-space virtual addresses, these virtual addresses have no simple mapping to physical memory; they must be translated into physical addresses or physical pages through the kernel page tables. They may not be mapped yet, in which case the physical pages are actually allocated when a page fault occurs.
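How a bounded range with guard gaps behaves can be sketched with a toy first-fit allocator. This is not the kernel's vmalloc algorithm: the class VmallocRange, its first-fit policy, and the start and end addresses used in the example are all assumptions for illustration. Only the idea from the text is modeled: each area gets a contiguous run of virtual addresses inside a fixed range, followed by a 4KB gap.

```python
PAGE_SIZE = 4096
GUARD_SIZE = 4096   # free gap kept after every area


def page_align(size):
    """Round size up to a whole number of pages."""
    return (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


class VmallocRange:
    """Toy first-fit allocator of virtual address ranges in [start, end)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.areas = []   # (address, size) pairs sorted by address

    def alloc(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        size = page_align(size)
        addr = self.start
        for index, (area_addr, area_size) in enumerate(self.areas):
            # Does the request plus its guard gap fit before this area?
            if addr + size + GUARD_SIZE <= area_addr:
                self.areas.insert(index, (addr, size))
                return addr
            addr = area_addr + area_size + GUARD_SIZE
        if addr + size + GUARD_SIZE <= self.end:
            self.areas.append((addr, size))
            return addr
        raise MemoryError("no room left in the range")

    def free(self, addr):
        for index, (area_addr, _) in enumerate(self.areas):
            if area_addr == addr:
                del self.areas[index]
                return
        raise ValueError("address was not allocated from this range")


vm = VmallocRange(0xF8800000, 0xFF800000)   # hypothetical bounds
a = vm.alloc(5000)   # rounded up to two pages
b = vm.alloc(1)      # one page, placed after a's guard gap
print(hex(a), hex(b), hex(b - a))
```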

  

Figure: Dynamic memory mapping

Conclusion


Linux memory management is a very complex system, and what this article describes is only the tip of the iceberg. It shows you the overall picture of memory management from a macro perspective; generally speaking, this knowledge is still enough for a conversation with an interviewer. Of course, I hope everyone will go on to understand the deeper principles by reading books.

This article can serve as an index-style study guide. When you want to study a particular point in depth, you can find the entry point in these sections and see where that point sits in the overall picture of memory management.

While writing this article I also drew many illustrations, which can be used as a knowledge index. Personally, I find the pictures clearer than the text. You can reply "Memory Management" in the background of my public account "Back-end Technology School" to get the high-definition originals.

As usual, thank you for reading. The purpose of this article is to share my understanding of the subject. I verify my technical articles repeatedly to make them as accurate as possible; if you find obvious flaws, please point them out so we can learn together through discussion. That's all for today's technical sharing; see you next time.




Origin blog.csdn.net/u014674293/article/details/106409398