Linux kernel source code analysis: which process memory types are likely to cause memory leaks?

The core topics in learning the Linux kernel can be divided into three parts: processes, memory, and the protocol stack.

Let's talk about memory leaks today.

I believe you have run into scenarios like the following in your day-to-day work:

  • As background tasks on a server keep running, less and less memory is available in the system;
  • An application is suddenly killed by the OOM killer while running;
  • A process does not seem to consume much memory, yet the system still runs out of memory;
  • ……

Problems like these are often caused by memory leaks. As we all know, a memory leak is memory that was allocated but never released, so that it can never be reused; in the worst case, the pointer to that memory no longer exists, and the memory can no longer even be accessed.

The memory leaks we usually encounter may be leaks in an application or leaks in the kernel (operating system); an application-level leak, in turn, may be a leak of heap memory or of the memory mapping region (Memory Mapping Region). These different types of memory leaks manifest themselves differently and call for different solutions, so to deal with memory leaks well, we first need to understand these different memory types.

These different memory types can all be understood as parts of the process address space (Address Space). So how does the address space work?


Process address space

We can represent the address space of a process with a diagram. The left side of the figure shows how a process can modify its virtual address space; the middle shows how the virtual address space is laid out; and the right side shows the physical memory, or physical address space, that the process's virtual address space maps to.

 Let's talk about this process in detail.

The application first calls memory allocation and release functions, such as malloc(3), free(3), and calloc(3) provided by glibc; or it uses the system calls mmap(2), munmap(2), brk(2), sbrk(2), and so on directly.

If library functions are used, they are ultimately wrappers around system calls, so you can understand it this way: when an application dynamically allocates and releases memory, the request ultimately goes through system calls such as mmap(2), munmap(2), brk(2), and sbrk(2). Of course, on the way from library function to system call, the libraries themselves also perform some memory-level optimizations; for example, malloc(3) may call either mmap(2) or brk(2) depending on the request.
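As a quick illustration of the mmap(2) half of this path, here is a minimal Python sketch that creates a private anonymous mapping directly — the same kind of region glibc's malloc(3) typically falls back to for large requests (the exact threshold is a glibc implementation detail):

```python
import mmap

# A private anonymous mapping: the same primitive glibc's malloc(3)
# typically uses via mmap(2) for large allocations (MAP_PRIVATE|MAP_ANONYMOUS).
buf = mmap.mmap(-1, 512 * 1024, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
buf[:4] = b"heap"   # writing faults the page in, allocating physical memory
assert buf[:4] == b"heap"
buf.close()         # close() unmaps the region, i.e. munmap(2)
```

Forgetting the final close()/munmap(2) is exactly the kind of leak this article is about.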

These allocation- and release-related system calls then modify the process's address space: brk(2) and sbrk(2) modify the heap (heap), while mmap(2) and munmap(2) modify the memory mapping region (Memory Mapping Region).

Note that all of this operates on virtual addresses: applications deal with virtual addresses, never directly with physical addresses. A virtual address must eventually be translated into a physical address, and since Linux manages memory in pages (Page), this translation process is called paging (Paging).

We can use a table to briefly summarize the different memory types corresponding to these different allocation methods. The table also includes the Page Cache discussed in an earlier module of the course, so you can treat it as a big summary of the memory types a process can request:

There are many terms involved here, so let's briefly introduce the important ones and see which parts of this table are prone to memory leaks.

A running process needs many types of memory. In general, these types can be distinguished along two dimensions: whether the memory is file-backed (file mapping) or anonymous, and whether it is private or shared. That gives the four memory types listed below.

  • Private anonymous memory. The process heap, the stack, and mmap(MAP_ANON | MAP_PRIVATE) all allocate this type of memory. The stack is managed by the operating system, and the application does not need to worry about allocating or releasing it; the heap and private anonymous mappings, however, are managed by the application (the programmer), who is responsible for both allocation and release, so they are prone to memory leaks.
  • Shared anonymous memory. The process requests it with mmap(MAP_ANON | MAP_SHARED); tmpfs and shm fall into this category. This type of memory is also managed by the application, so memory leaks can occur here too.
  • Private file mapping. The process requests it with mmap(MAP_FILE | MAP_PRIVATE). For example, shared libraries (Shared libraries) and the executable's code segment (Text Segment) are mapped into the process address space this way. Mappings of shared libraries and executable code segments are managed by the operating system, and applications do not need to care about their allocation and release. Memory that the application requests directly through mmap(MAP_FILE | MAP_PRIVATE), however, must be managed by the application itself, and that is where memory leaks can occur.
  • Shared file mapping. The process requests it with mmap(MAP_FILE | MAP_SHARED); the File Page Cache discussed in the previous module belongs to this category. This memory also has to be allocated and released by the application, so leaks are possible here as well.
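To make the two dimensions concrete, here is a hedged Python sketch that creates all four kinds of mappings with the corresponding mmap flags (the temporary file and one-page sizes are arbitrary choices for illustration):

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE

# 1. Private anonymous: heap-like, visible only to this process.
anon_priv = mmap.mmap(-1, PAGE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
# 2. Shared anonymous: shm/tmpfs-like, shared with fork() children.
anon_shared = mmap.mmap(-1, PAGE, flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS)

# 3./4. File mappings need a backing file descriptor.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, PAGE)
file_shared = mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED)   # writes reach the page cache
file_priv = mmap.mmap(fd, PAGE, flags=mmap.MAP_PRIVATE)    # copy-on-write view of the file

file_shared[:5] = b"hello"   # shared write: visible through the file
file_priv[:5] = b"world"     # private write: never written back to the file
assert os.pread(fd, 5, 0) == b"hello"
assert file_priv[:5] == b"world"

for m in (anon_priv, anon_shared, file_priv, file_shared):
    m.close()                # munmap(2); forgetting this leaks address space
os.close(fd)
os.unlink(path)
```

Note how the MAP_PRIVATE write stays invisible to the file while the MAP_SHARED write goes through the page cache and is seen by a plain read.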

Now that we understand the different memory types in the process virtual address space, let's look at the physical memory behind them.

As mentioned above, the process virtual address space is mapped to physical memory through paging. The memory a process obtains by calling malloc() or mmap() is all virtual memory; physical memory is actually allocated only when data is written to it (for example, via memset).
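The following sketch makes this demand paging visible by reading VmRSS from /proc/self/status before and after touching a large anonymous mapping (the 64 MiB size is an arbitrary choice):

```python
import mmap
import re

def vm_rss_kb():
    # Resident set size of the current process, in kB (Linux-specific file).
    with open("/proc/self/status") as f:
        return int(re.search(r"VmRSS:\s+(\d+) kB", f.read()).group(1))

SIZE = 64 * 1024 * 1024                      # 64 MiB of virtual address space
buf = mmap.mmap(-1, SIZE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)

rss_before = vm_rss_kb()                     # the mapping alone adds no physical memory
buf.write(b"\x01" * SIZE)                    # writing faults every page in
rss_after = vm_rss_kb()

# RSS should now have grown by roughly the mapping size.
assert rss_after - rss_before > 32 * 1024    # well over 32 MiB (values are in kB)
buf.close()
```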

You may wonder: if a process only calls malloc() or mmap() without writing to those addresses, so that no physical memory is allocated to it, is there nothing to worry about regarding memory leaks?

The answer is that memory leaks still deserve attention here, because they can exhaust the process's virtual address space; in other words, virtual address space can leak too.

Next, let's use another picture to refine the paging process.

As shown in the figure above, the general flow of paging is: the CPU passes the requested virtual address to the MMU (Memory Management Unit), and the MMU first looks up the translation in the TLB (Translation Lookaside Buffer, the page-table cache). If the corresponding physical address is found there, it is accessed directly; if not, the MMU walks the page table (Page Table) to find and compute it. In the end, the virtual address accessed by the process corresponds to an actual physical address.

Once you understand how the address space works, you can plan a process's address space sensibly, or put reasonable limits on it, so that when a problem does occur, its impact is contained. You can think of planning the process address space as a line of defense against process memory problems. The most typical way to plan a process's address space on Linux is ulimit: with it you can configure the maximum virtual address space, physical memory, stack size, and so on for a process.
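ulimit is the shell interface; programmatically the same planning is done with setrlimit(2). A Python sketch capping the virtual address space at 1 GiB (the cap value is an arbitrary example, and a 64-bit system is assumed):

```python
import mmap
import resource

# Cap this process's virtual address space at 1 GiB -- the programmatic
# equivalent of `ulimit -v 1048576` in the shell.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (1 << 30, hard))

refused = False
try:
    # A 2 GiB anonymous mapping now exceeds the cap and fails with ENOMEM
    # instead of silently consuming the machine's memory.
    huge = mmap.mmap(-1, 2 << 30, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
except OSError:
    refused = True
print("2 GiB mapping refused:", refused)
```

With a limit in place, a runaway allocation fails fast at the leaking call site instead of dragging the whole system into OOM.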

That's it for the process address space; next, let's see how to observe a process's address space with tools.


Observe process memory with data

Learning to observe the process address space is a prerequisite for analyzing memory leaks. When you suspect a leak, first observe which memory keeps growing and which memory is unusually large; this lets you roughly locate the leak and then analyze it in a targeted way. Conversely, if you guess blindly at the cause before carefully observing the process address space, you are likely to waste a lot of time and may even head in the wrong direction entirely.

So what tools can we use? The tools we commonly use to observe process memory, such as pmap, ps, and top, all do the job well.

First, we can use top to observe the memory usage of all processes in the system. After starting top, press g and then enter 3 to switch to memory mode. In memory mode we can see each process's %MEM, VIRT, RES, CODE, DATA, SHR, nMaj, and nDRT. If you trace top with strace, you will find that this information is read from the files /proc/[pid]/statm and /proc/[pid]/stat:


$ strace -p `pidof top`
open("/proc/16348/statm", O_RDONLY)     = 9
read(9, "40509 1143 956 24 0 324 0\n", 1024) = 26
close(9)                                = 0
...
open("/proc/16366/stat", O_RDONLY)      = 9
read(9, "16366 (kworker/u16:1-events_unbo"..., 1024) = 182
close(9)
...

Except for nMaj (Major Page Fault: the number of page faults whose page content was not in memory and had to be read from disk) and %MEM, which is calculated from RES, the remaining memory fields are read from the statm file. The following is the correspondence between the fields in top and the fields in statm:
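You can check this correspondence yourself by reading statm directly. A sketch for the current process (field names follow proc(5)):

```python
import os

PAGE_KB = os.sysconf("SC_PAGE_SIZE") // 1024

# /proc/[pid]/statm holds seven counts, in pages:
#   size resident shared text lib data dt
# top's VIRT/RES/SHR/CODE/DATA/nDRT come from these, scaled by the page size.
with open("/proc/self/statm") as f:
    size, resident, shared, text, lib, data, dt = map(int, f.read().split())

print(f"VIRT={size * PAGE_KB} kB  RES={resident * PAGE_KB} kB  "
      f"SHR={shared * PAGE_KB} kB  CODE={text * PAGE_KB} kB")
assert size >= resident >= shared
```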

In addition, if you look carefully, you may notice that the sum of RES across all processes is sometimes larger than the system's total physical memory. That is because some of the memory counted in RES is shared among multiple processes.

Once you have an overview of each process's memory usage, if you want to dig into the memory details of a single process, you can use pmap. The following shows part of the address space of the sshd process as reported by pmap:


$  pmap -x `pidof sshd`
Address           Kbytes     RSS   Dirty Mode  Mapping 
000055e798e1d000     768     652       0 r-x-- sshd
000055e7990dc000      16      16      16 r---- sshd
000055e7990e0000       4       4       4 rw--- sshd
000055e7990e1000      40      40      40 rw---   [ anon ]
...
00007f189613a000    1800    1624       0 r-x-- libc-2.17.so
00007f18962fc000    2048       0       0 ----- libc-2.17.so
00007f18964fc000      16      16      16 r---- libc-2.17.so
00007f1896500000       8       8       8 rw--- libc-2.17.so
...
00007ffd9d30f000     132      40      40 rw---   [ stack ]
...

Each row represents one type of memory, i.e. one VMA (Virtual Memory Area); the meaning of each column is as follows.

  • Mapping: the file backing the memory in a file mapping, for example the executable sshd; or the heap [heap]; or the stack [stack]; and so on.
  • Mode: the permissions of the memory. For example, "r-x" means readable and executable, which is typically a code segment (Text Segment); "rw-" means readable and writable, typically a data segment (Data Segment); "r--" means read-only, typically the read-only portion of the data segment.
  • Address, Kbytes, RSS, Dirty: Address and Kbytes are the starting virtual address and the size of the virtual memory, respectively; RSS (Resident Set Size) is the amount of physical memory actually allocated within that virtual memory; Dirty is the amount of data in the memory that has not yet been synchronized to disk.

As you can see, pmap gives a clear view of a process's entire address space, including how much physical memory each area has been allocated, which is very helpful for forming a rough judgment of the process's memory usage. For example, if [heap] is too large, there may be a heap memory leak; if the address space contains too many VMAs (each line in maps can be understood as one VMA), the application is probably calling mmap a lot without munmap; and if you observe the address space over time and find some entries growing continuously, there is likely a problem there.

pmap also works by parsing files under /proc, specifically /proc/[pid]/maps and /proc/[pid]/smaps; the smaps file is more detailed than maps and can be understood as an extension of it. If you compare the output of /proc/[pid]/maps with that of pmap, you will find that the two are consistent.
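To see the raw data pmap consumes, you can read /proc/self/maps directly; each line is one VMA:

```python
# Each line of /proc/[pid]/maps describes one VMA:
#   address-range  perms  offset  dev  inode  pathname
with open("/proc/self/maps") as f:
    vmas = f.readlines()

stacks = [l for l in vmas if l.rstrip().endswith("[stack]")]
print(f"{len(vmas)} VMAs in this process, {len(stacks)} marked [stack]")
assert len(vmas) > 0
```

Counting these lines over time is a cheap way to spot the "many mmaps, no munmap" pattern mentioned above.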

Besides observing the memory of a single process, we can also observe how the memory allocated by processes shows up in system-wide metrics. Let's take the commonly used /proc/meminfo as an example to illustrate how the four memory types above (private anonymous, private file, shared anonymous, shared file) are reflected in system metrics.

As shown in the figure above, all private anonymous memory is reflected in the AnonPages item of /proc/meminfo, all shared memory is reflected in the Cached item, and anonymous shared memory is additionally reflected in the Shmem item.


$ cat /proc/meminfo
...
Cached:          3799380 kB
...
AnonPages:       1060684 kB
...
Shmem:              8724 kB
...

Similarly, I suggest you write some test cases and observe for yourself; that will deepen your understanding.
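As one such test case, here is a sketch that parses /proc/meminfo into a dict so the three items above can be watched over time:

```python
def meminfo_kb():
    # Parse /proc/meminfo into {field: value-in-kB} (Linux-specific format).
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, _, rest = line.partition(":")
            info[key] = int(rest.split()[0])
    return info

m = meminfo_kb()
print(f"Cached={m['Cached']} kB  AnonPages={m['AnonPages']} kB  "
      f"Shmem={m['Shmem']} kB")
assert m["MemTotal"] >= m["MemFree"]
```

Sampling this before and after your own mmap experiments shows exactly which system counter each memory type lands in.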

This is where we will stop with the basics of process memory management.


Origin blog.csdn.net/qq_28581269/article/details/119460095