[Linux kernel] memory management - malloc memory allocation

When reprinting, please credit the source: https://www.cnblogs.com/Ethan-Code/p/16618880.html

How malloc allocates memory

Anyone who has learned C knows that malloc performs dynamic memory allocation: when the running program calls malloc, a block of the requested size is allocated from the heap.

First, note that Linux divides a process's virtual memory into six areas: stack area, file mapping area, heap area, BSS area, data area, and code area.

These areas are only a logical division of virtual memory; physical memory has no such partitions.

malloc allocates memory from two of these areas: the file mapping area and the heap area.

  • When the requested size is less than 128K (the default threshold), malloc uses the brk() system call to allocate memory from the heap.
  • When the requested size is greater than 128K, malloc uses the mmap() system call to allocate memory from the file mapping area.
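
The rule in the two bullets above can be written as a small decision function. This is only a sketch of the rule as stated: `choose_path` and `DEFAULT_MMAP_THRESHOLD` are hypothetical names, and sending a request of exactly 128K to mmap is an assumption, since the text only covers "less than" and "greater than". In glibc the threshold is a tunable (`M_MMAP_THRESHOLD`), so the real value may differ.

```c
#include <stddef.h>

/* Default threshold described in the text: 128 KiB. */
#define DEFAULT_MMAP_THRESHOLD ((size_t)128 * 1024)

enum alloc_path { PATH_BRK, PATH_MMAP };

/* Hypothetical helper: classify a request the way the text describes.
   Treating request == threshold as mmap is an assumption of this sketch. */
static enum alloc_path choose_path(size_t request, size_t threshold)
{
    return request >= threshold ? PATH_MMAP : PATH_BRK;
}
```

With the default threshold, `choose_path(1, DEFAULT_MMAP_THRESHOLD)` yields `PATH_BRK` and a 200K request yields `PATH_MMAP`.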

The difference between the two allocation methods is discussed below.

brk allocates memory

When brk() is used to allocate memory, the kernel only moves the top-of-heap pointer (the program break) upwards, that is, it expands the heap's virtual address space. The mapping from physical memory to the new heap region is not established immediately; it is created only when the first access to the newly allocated memory triggers a page fault.

When free() releases heap memory that is not at the top of the heap, the memory is not returned to the operating system immediately. Instead, the allocator reclaims it, marks it as free, and keeps the established memory mapping. The purpose is to reuse this free memory directly in the next malloc, which reduces the number of system calls. The drawback is that memory that has been reclaimed but not released causes memory fragmentation.

Therefore a threshold of 128K is set, and the brk system call is used only for requests below 128K, which keeps the fragments from growing too large.
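
The movement of the heap top can be observed with sbrk(), the library wrapper around brk(). The sketch below is an illustration and not what malloc does internally: `grow_and_restore_break` is a hypothetical helper that moves the break, measures the distance, and moves it back. It assumes a Linux/glibc environment and a single-threaded program.

```c
#define _DEFAULT_SOURCE 1   /* exposes sbrk() on glibc */
#include <stdint.h>
#include <unistd.h>

/* Grow the program break by `bytes`, report how far it moved, then undo
   the change. Returns -1 if sbrk() fails. */
static long grow_and_restore_break(intptr_t bytes)
{
    char *before = sbrk(0);            /* current top of the brk heap */
    if (sbrk(bytes) == (void *)-1)
        return -1;
    char *after = sbrk(0);             /* top of the heap after the move */
    long moved = (long)(after - before);
    sbrk(-bytes);                      /* shrink back to the original break */
    return moved;
}
```

Only the virtual address range changes here; no physical page is attached to the new range until it is touched.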

When the requested size is small (<128K), for example malloc(1), the heap top pointer is moved by far more than 1 byte. On the one hand, 0x10 (16 bytes) of control information is placed in front of the returned memory, which free uses to determine the range to release. On the other hand, the allocator deliberately asks brk() for more memory than requested and reserves it for subsequent malloc calls, which reduces the number of system calls and improves efficiency.
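
That a 1-byte request receives more than 1 byte can be checked with `malloc_usable_size()`, a glibc extension declared in `<malloc.h>`. This is a minimal sketch assuming glibc; `usable_bytes_for` is a hypothetical helper name, and the exact value it reports depends on the platform and allocator version.

```c
#include <malloc.h>   /* malloc_usable_size(): glibc extension, not ISO C */
#include <stdlib.h>

/* Allocate `request` bytes and report how many bytes the allocator actually
   makes usable in that block. Returns 0 if the allocation fails. */
static size_t usable_bytes_for(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

Calling `usable_bytes_for(1)` returns a value noticeably larger than 1 on common systems, because the request is rounded up to the allocator's minimum chunk size.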

In a 32-bit Linux environment, a malloc of less than 128K always grows the heap by 132K. The 132K comes from 132K = 128K + 4K. The reason is that when the free memory already on the heap cannot satisfy the request, the brk system call moves the heap top pointer up by 128K at once and also reserves room for the control information, so the total is slightly more than 128K. Because memory pages are 4K aligned, the heap top pointer ends up moving by 132K.

If there is already enough free memory on the heap at malloc time, the heap top pointer is not moved. The address of a suitably sized piece of the free memory is returned directly, and the remainder stays on the heap as free memory.

In addition, if free leaves more than 128K of free memory at the top of the heap, trimming is triggered: the free memory beyond 128K is released to the operating system directly, and the remaining 128K is kept for the next malloc.
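
The 132K figure is a round-up to the page size, and the arithmetic can be checked directly. The sketch below is a simplified model of the calculation described above, not glibc's code; `align_up` and `heap_growth` are hypothetical helpers, and the 16-byte control size is taken from the text.

```c
#include <stddef.h>

#define KIB ((size_t)1024)

/* Round `n` up to the next multiple of `align` (a power of two). */
static size_t align_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Simplified model: the 128K step plus the control information,
   rounded up to a whole number of pages. */
static size_t heap_growth(size_t step, size_t control, size_t page)
{
    return align_up(step + control, page);
}
```

Here `heap_growth(128 * KIB, 16, 4 * KIB)` is 131088 bytes rounded up to 33 pages, which is 135168 bytes, exactly 132K.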
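
The trimming rule can be modeled in the same way. This is a simplified sketch of the rule as described in the text, not the allocator's actual logic: `bytes_released_by_trim` is a hypothetical helper, and rounding the released amount down to whole pages is an assumption of the sketch.

```c
#include <stddef.h>

/* Model of trimming: keep `keep` bytes of free memory at the top of the heap
   and release the rest, rounded down to whole pages (`page` must be a power
   of two). Returns the number of bytes handed back to the operating system. */
static size_t bytes_released_by_trim(size_t top_free, size_t keep, size_t page)
{
    if (top_free <= keep)
        return 0;                           /* nothing beyond the reserve */
    return (top_free - keep) & ~(page - 1); /* align down to a page boundary */
}
```

With 200K free at the top, a 128K reserve, and 4K pages, the model releases 72K; with 100K free it releases nothing.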

The "free memory" mentioned above is actually managed by the ptmalloc memory pool, which is part of glibc in user space rather than part of the kernel. Using boundary tags, the memory cached in the pool is divided into many blocks. Each block is called a chunk, and each chunk carries 16 bytes of control information describing its own size and the neighbouring chunks. ptmalloc inserts each free chunk into one of several bin linked lists according to its size. I plan to write a separate article on the underlying call path of malloc when I have time.
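
The chunk layout can be pictured with a small model. The sketch below is modeled on glibc's malloc_chunk but is not the real definition: the struct, macros, and helpers are illustrative names, and the rounding rule is a simplification. On a 64-bit system the two header fields occupy 16 bytes.

```c
#include <stddef.h>

/* Simplified chunk header: two size fields placed before the user memory. */
struct chunk_header {
    size_t prev_size;  /* size of the previous chunk, used when it is free */
    size_t size;       /* size of this chunk; low bits serve as flags */
};

#define SIZE_SZ    (sizeof(size_t))
#define ALIGN_MASK (2 * SIZE_SZ - 1)
#define MIN_CHUNK  (4 * SIZE_SZ)  /* room for the header plus two list pointers */

/* Model of how a request is padded to a chunk size: add one size field,
   round up to the alignment, and enforce the minimum chunk size. */
static size_t request_to_chunk_size(size_t request)
{
    size_t padded = (request + SIZE_SZ + ALIGN_MASK) & ~ALIGN_MASK;
    return padded < MIN_CHUNK ? MIN_CHUNK : padded;
}

/* Step back from the pointer malloc returned to the header in front of it. */
static struct chunk_header *header_of(void *user_ptr)
{
    return (struct chunk_header *)((char *)user_ptr - sizeof(struct chunk_header));
}
```

In this model a 1-byte request becomes a minimum-size chunk, and free can find the chunk size by looking just before the pointer it is given.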

You can refer to: https://blog.nowcoder.net/n/de03980b2ab746ccad65fcb38bee59c3

mmap allocates memory

Large requests are served by mmap(). However large the request is, as long as it fits in the process's available virtual address space, the kernel grants it. Because mmap() allocations have no memory pool cache, the first access to each page of every new allocation triggers a page fault, and the kernel then maps a physical page into the virtual memory of the file mapping area. Moreover, memory obtained through mmap() is returned to the operating system completely when it is freed, and the established mapping is removed.

Frequent use of mmap for dynamic memory allocation therefore has a relatively large performance cost. Without a memory pool cache, every newly obtained region triggers page faults when it is first accessed, and each allocation and release needs a system call that switches from user mode to kernel mode. Both are relatively expensive for the CPU.

This is the other reason for the 128K threshold. Allocations below 128K use brk together with the memory pool cache, which reduces the number of page faults and system calls. Allocations above 128K use mmap's anonymous mapping, which keeps large blocks from turning into large fragments on the heap.
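
What malloc does for a large request can be approximated by calling mmap() directly with an anonymous private mapping. This is a minimal sketch and not glibc's code path; `map_anonymous` and `unmap_region` are hypothetical wrapper names, and a Linux environment is assumed.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Reserve `len` bytes of zero-filled anonymous memory, or NULL on failure.
   Physical pages are attached only when each page is first touched. */
static void *map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Return the region to the operating system; the mapping disappears. */
static int unmap_region(void *p, size_t len)
{
    return munmap(p, len);
}
```

Each call pair costs two system calls plus the page faults on first access, which is the overhead the text attributes to frequent mmap use.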

Another use of the mmap system call is for file mapping.

Files are generally accessed through a series of file I/O system calls: open opens the file, read and write move data between the disk file and memory, and close finally closes the file.

With many reads and writes, these system calls become frequent. To improve performance, you can use mmap after opening the file to map the disk file directly into virtual memory, and then simply read and write the mapped memory. The first access to a page triggers a page fault, on which the kernel loads that part of the file into main memory and maps it into the virtual address space. After that, whether or not the file descriptor has been closed, the system automatically writes dirty pages of a shared mapping back to disk, which is equivalent to performing the read and write operations without switching to kernel mode through a system call each time.

You can refer to: https://blog.nowcoder.net/n/850befe7c86d446ab73c89c5f18d8dcb?from=nowcoder_improve
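
Reading a file through a mapping looks like the following sketch. It is read-only and uses a private mapping, so it does not exercise the write-back path; `count_newlines_mapped` is a hypothetical example function, and a POSIX system is assumed.

```c
#define _DEFAULT_SOURCE 1
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count '\n' bytes in a file by reading it through a memory mapping.
   Returns -1 on error. */
static long count_newlines_mapped(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i < len; i++)    /* plain memory reads, no read() calls */
        if (data[i] == '\n')
            count++;

    munmap((void *)data, len);
    return count;
}
```

After the single mmap call, the loop touches the file contents as ordinary memory, and the kernel pages the data in on demand.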

Access memory after free

What happens when freed memory is used?

  • If the freed memory is brk heap memory, it is not returned to the operating system immediately, so the program keeps running. This creates a potential UAF (Use After Free) problem.
  • If the freed memory is at the top of the heap and the free memory at the top exceeds 128K, the part beyond 128K is returned to the operating system, so accessing that part directly causes a segmentation fault.
  • If the freed memory is in the mmap mapping area, it is returned immediately and the mapping is removed, so accessing it again causes a segmentation fault.
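
The third case can be reproduced safely by doing the invalid access in a child process and inspecting how it died. This is a sketch assuming Linux; it uses mmap()/munmap() directly in place of a large malloc()/free(), and `signal_after_unmapped_access` is a hypothetical helper name. Touching freed memory is undefined behavior in C, so this illustrates what typically happens rather than something to rely on.

```c
#define _DEFAULT_SOURCE 1
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that maps memory, unmaps it, then touches it again.
   Returns the signal that killed the child, 0 if it exited normally,
   or -1 if fork/waitpid failed. */
static int signal_after_unmapped_access(size_t len)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        volatile char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            _exit(2);
        p[0] = 1;                  /* valid: the mapping exists */
        munmap((void *)p, len);    /* mapping removed, as when an mmap block is freed */
        p[0] = 2;                  /* invalid: the address is no longer mapped */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the child is expected to be killed by SIGSEGV. The brk case is harder to demonstrate because the stale access usually succeeds silently, which is what makes UAF bugs dangerous.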

Good article recommendation: https://www.cnblogs.com/sky-heaven/p/10005642.html
https://xiaolincoding.com/os/3_memory/malloc.html#malloc-%E6%98%AF%E5%A6%82%E4%BD%95%E5%88%86%E9%85%8D%E5%86%85%E5%AD%98%E7%9A%84


Origin blog.csdn.net/weixin_45636061/article/details/127184771