Linux Memory Leak Analysis

Reference: https://blog.csdn.net/zhaowen_cug/article/details/77750973

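In real projects, memory leaks are among the hardest problems to deal with (kernel panics and the like are another). Memory leaks fall into two categories, user space and kernel space, and we will look at each in turn.
Finding and fixing a leak in user space is relatively simple, and there are plenty of methods and tools for locating the problem. Let's take a look.
1. View memory information
cat /proc/meminfo, free, cat /proc/slabinfo, and so on
2. View process status information
top, ps, cat /proc/<pid>/maps, cat /proc/<pid>/status, ls /proc/<pid>/fd, and so on
The usual first step is to run ps in a shell and check the state of the running processes. On embedded systems ps may show less information.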

[root@localhost kthread]# ps auxw|more

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

root 1 0.0 0.0 19396 1556 ? Ss May16 0:05 /sbin/init

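The VSZ and RSS columns are easy to compare here. VSZ is the virtual address space the process has allocated, while RSS is the physical memory the process actually occupies.
If a process is leaking memory, VSZ typically keeps growing and the physical memory grows along with it. When you see that pattern, first check whether every malloc has a matching free. Given the process ID, you can view more detailed Vm* information for the process. For example: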

root@hos-machine:~# cat /proc/1298/status
Name: sshd
State: S (sleeping)
Tgid: 1298
Ngid: 0
Pid: 1298
PPid: 1
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 128
Groups:
NStgid: 1298
NSpid: 1298
NSpgid: 1298
NSsid: 1298
VmPeak: 65620 kB
VmSize: 65520 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 5480 kB
VmRSS: 5452 kB
VmData: 580 kB
VmStk: 136 kB
VmExe: 764 kB
VmLib: 8316 kB
VmPTE: 148 kB
VmPMD: 12 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
Threads: 1
SigQ: 0/7814
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 0000000180014005
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Seccomp: 0
Cpus_allowed: ffffffff,ffffffff
Cpus_allowed_list: 0-63
Mems_allowed: 00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 1307
nonvoluntary_ctxt_switches: 203
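To watch for the growth pattern described above, the Vm* fields can be pulled out of the status text and sampled over time. The following is a minimal sketch; the helper names parse_status_kb and grows_steadily are made up for illustration, and the "never shrinks and ends higher" test is only a rough heuristic, not a proof of a leak.

```python
def parse_status_kb(text):
    """Return the kB-valued fields of a /proc/<pid>/status dump as a dict of ints."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        parts = value.split()
        # Keep only lines of the form "Key: <number> kB".
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def grows_steadily(samples):
    """True if the samples never decrease and the last one is larger than the first."""
    if len(samples) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_shrinks and samples[-1] > samples[0]


# A few lines taken from the sshd dump above.
sample = """Name: sshd
VmPeak: 65620 kB
VmSize: 65520 kB
VmHWM: 5480 kB
VmRSS: 5452 kB
"""
info = parse_status_kb(sample)
print(info["VmSize"], info["VmRSS"])
```

In practice one would read /proc/<pid>/status every few seconds, append the VmRSS value to a list, and pass that list to grows_steadily.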

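To see how many files this process has open (the first number printed by wc is the line count, one line per open descriptor):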
[root@localhost kthread]# ls -l /proc/184172/fd/* | wc

214 2354 17293

Incidentally, to list every file the process has open:

[root@localhost kthread]# ls -l /proc/184172/fd/* |more

lr-x------. 1 root root 64 Jul 10 16:35 /proc/184172/fd/0 -> /dev/null

l-wx------. 1 root root 64 Jul 10 16:35 /proc/184172/fd/1 -> /usr/local/mm/mm.log

lrwx------. 1 root root 64 Jul 10 16:35 /proc/184172/fd/10 -> socket:[13690382]

lrwx------. 1 root root 64 Jul 10 16:35 /proc/184172/fd/100 -> socket:[13690549]
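The same count can be taken programmatically, which is handy when the descriptor count needs to be logged over time. This is a sketch only: count_open_fds and summarize_targets are hypothetical helper names, and grouping by the "socket:", "pipe:" and "anon_inode:" prefixes is an assumption about which categories are useful.

```python
import os


def count_open_fds(fd_dir):
    """Count entries in a /proc/<pid>/fd-style directory (one entry per open descriptor)."""
    return len(os.listdir(fd_dir))


def summarize_targets(fd_dir):
    """Group descriptors by link target: 'socket', 'pipe', 'anon_inode', or 'file'."""
    counts = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed in the meantime, or entry is not a symlink
        if target.startswith(("socket:", "pipe:", "anon_inode:")):
            kind = target.split(":", 1)[0]
        else:
            kind = "file"
        counts[kind] = counts.get(kind, 0) + 1
    return counts


# Inspect the current process when running on Linux.
if os.path.isdir("/proc/self/fd"):
    print(count_open_fds("/proc/self/fd"), summarize_targets("/proc/self/fd"))
```

A socket count that keeps climbing, as the listing above hints at, usually points to connections that are never closed rather than to malloc/free mismatches.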

View the process's detailed memory mappings:

cat /proc/7393/maps

[root@localhost kthread]# cat /proc/184172/maps

00400000-00455000 r-xp 00000000 fd:05 1441820 /usr/local/mm/ms

00654000-00655000 rw-p 00054000 fd:05 1441820 /usr/local/mm/ms

00655000-00658000 rw-p 00000000 00:00 0

02205000-1a218000 rw-p 00000000 00:00 0 [heap]

3000600000-3000603000 r-xp 00000000 fd:00 143299 /lib64/libcom_err.so.2.1

3000603000-3000802000 ---p 00003000 fd:00 143299 /lib64/libcom_err.so.2.1
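Each maps line starts with a start-end address range in hexadecimal, so the size of a region is simply end minus start. The sketch below sums region sizes per name; the function names and the "[anon]" label for unnamed regions are illustrative choices, not part of the maps format.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, name)."""
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    name = fields[5].strip() if len(fields) > 5 else ""
    return int(start_hex, 16), int(end_hex, 16), fields[1], name


def region_sizes_kb(maps_text):
    """Sum mapping sizes in kB per name; unnamed regions are grouped under '[anon]'."""
    totals = {}
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        start, end, _perms, name = parse_maps_line(line)
        key = name or "[anon]"
        totals[key] = totals.get(key, 0) + (end - start) // 1024
    return totals


# The first four lines of the maps output above.
sample = """00400000-00455000 r-xp 00000000 fd:05 1441820 /usr/local/mm/ms
00654000-00655000 rw-p 00054000 fd:05 1441820 /usr/local/mm/ms
00655000-00658000 rw-p 00000000 00:00 0
02205000-1a218000 rw-p 00000000 00:00 0 [heap]
"""
sizes = region_sizes_kb(sample)
print(sizes["[heap]"])  # 393292 (kB)
```

For the sample process the [heap] region spans roughly 384 MB, which is exactly the kind of figure worth comparing between two points in time.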

Here is what each meminfo field means, quoted from Documentation/filesystems/proc.txt in the kernel source:

MemTotal: Total usable ram (i.e. physical ram minus a few reserved bits and the kernel binary code)
MemFree: The sum of LowFree+HighFree
Buffers: Relatively temporary storage for raw disk blocks shouldn't get tremendously large (20MB or so)
Cached: in-memory cache for files read from the disk (the pagecache). Doesn't include SwapCached
SwapCached: Memory that once was swapped out, is swapped back in but still also is in the swapfile (if memory is needed it doesn't need to be swapped out AGAIN because it is already in the swapfile. This saves I/O)
Active: Memory that has been used more recently and usually not reclaimed unless absolutely necessary.
Inactive: Memory which has been less recently used. It is more eligible to be reclaimed for other purposes
HighTotal, HighFree: Highmem is all memory above ~860MB of physical memory. Highmem areas are for use by userspace programs, or for the pagecache. The kernel must use tricks to access this memory, making it slower to access than lowmem.
LowTotal, LowFree: Lowmem is memory which can be used for everything that highmem can be used for, but it is also available for the kernel's use for its own data structures. Among many other things, it is where everything from the Slab is allocated. Bad things happen when you're out of lowmem.
SwapTotal: total amount of swap space available
SwapFree: Memory which has been evicted from RAM, and is temporarily on the disk
Dirty: Memory which is waiting to get written back to the disk
Writeback: Memory which is actively being written back to the disk
AnonPages: Non-file backed pages mapped into userspace page tables
AnonHugePages: Non-file backed huge pages mapped into userspace page tables
Mapped: files which have been mmaped, such as libraries
Slab: in-kernel data structures cache
SReclaimable: Part of Slab, that might be reclaimed, such as caches
SUnreclaim: Part of Slab, that cannot be reclaimed on memory pressure
PageTables: amount of memory dedicated to the lowest level of page tables.
NFS_Unstable: NFS pages sent to the server, but not yet committed to stable storage
Bounce: Memory used for block device "bounce buffers"
WritebackTmp: Memory used by FUSE for temporary writeback buffers
CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'), this is the total amount of memory currently available to be allocated on the system. This limit is only adhered to if strict overcommit accounting is enabled (mode 2 in 'vm.overcommit_memory').
    The CommitLimit is calculated with the following formula: CommitLimit = ('vm.overcommit_ratio' * Physical RAM) + Swap
    For example, on a system with 1G of physical RAM and 7G of swap with a `vm.overcommit_ratio` of 30 it would yield a CommitLimit of 7.3G.
    For more details, see the memory overcommit documentation in vm/overcommit-accounting.
Committed_AS: The amount of memory presently allocated on the system. The committed memory is a sum of all of the memory which has been allocated by processes, even if it has not been "used" by them as of yet. A process which malloc()'s 1G of memory, but only touches 300M of it will only show up as using 300M of memory even if it has the address space allocated for the entire 1G. This 1G is memory which has been "committed" to by the VM and can be used at any time by the allocating application. With strict overcommit enabled on the system (mode 2 in 'vm.overcommit_memory'), allocations which would exceed the CommitLimit (detailed above) will not be permitted. This is useful if one needs to guarantee that processes will not fail due to lack of memory once that memory has been successfully allocated.
VmallocTotal: total size of vmalloc memory area
VmallocUsed: amount of vmalloc area which is used
VmallocChunk: largest contiguous block of vmalloc area which is free
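The CommitLimit formula quoted above treats vm.overcommit_ratio as a percentage. A small worked example, with commit_limit_gb as a made-up helper name, reproduces the 1G RAM / 7G swap / ratio 30 case from the documentation:

```python
def commit_limit_gb(ram_gb, swap_gb, overcommit_ratio_percent):
    """CommitLimit = (vm.overcommit_ratio / 100 * physical RAM) + swap, all in GB."""
    return ram_gb * overcommit_ratio_percent / 100 + swap_gb


# 1G of RAM, 7G of swap, vm.overcommit_ratio = 30, as in the quoted text.
print(commit_limit_gb(1, 7, 30))
```

This follows the simplified formula as quoted; it is not meant to mirror the kernel's exact accounting.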

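In practice only a few of these fields matter: Buffers, Cached, Slab, Active and AnonPages.

Active = Active(anon) + Active(file) (and likewise for Inactive)
AnonPages: Non-file backed pages mapped into userspace page tables
The difference between Buffers and Cached is clear from the descriptions above.
Sometimes a system goes down without any memory leak at all. For example, cache and buffers may take up too much memory, or too many files may be open, and waiting for the system to reclaim them automatically can take a very long time.
The meminfo file under /proc gives a summary of current system memory usage, where available physical memory is roughly MemFree + Buffers + Cached. When MemFree runs low, the kernel uses its writeback mechanism (the pdflush thread in older kernels) to write cached and buffered memory back to backing storage, freeing it for processes to use. The cache can also be released explicitly by hand: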


drop_caches
Writing to this will cause the kernel to drop clean caches, dentries and inodes from memory, causing that memory to become free.
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches
As this is a non-destructive operation and dirty objects are not freeable, the user should run `sync` first.
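The rough "available memory" estimate used in this article, MemFree + Buffers + Cached, is easy to compute from the meminfo text. The sketch below assumes the usual "Name: value kB" layout; the function names are illustrative and the sample numbers are invented, not taken from a real machine.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def estimate_available_kb(meminfo):
    """Rough estimate from the article: MemFree + Buffers + Cached."""
    return meminfo["MemFree"] + meminfo["Buffers"] + meminfo["Cached"]


# Invented sample values, for illustration only.
sample = """MemTotal:  2048000 kB
MemFree:    100000 kB
Buffers:     50000 kB
Cached:     600000 kB
Slab:        80000 kB
"""
info = parse_meminfo(sample)
print(estimate_available_kb(info))  # 750000
```

Comparing this estimate before and after writing to drop_caches gives a quick sense of how much of the "used" memory was really just reclaimable cache.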

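User-space leaks can also be detected with mtrace, which is very simple to use and was covered in an earlier article. Other well-known tools include valgrind, dmalloc and memwatch, each with its own strengths.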

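Locating a kernel memory leak is more involved. First determine whether the kernel is actually leaking, then narrow down which operation triggers it, and then check the suspect modules. Kernel memory is allocated almost entirely through kmalloc, that is, through the slab/slub/slob allocators, so if Slab in meminfo keeps growing the problem is very likely in the kernel. The slab information can be examined in more detail:
cat /proc/slabinfo
If slabtop is available, even better. With it you can usually tell whether the kernel is leaking and which kind of object is involved.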


cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fuse_request 0 0 288 28 2 : tunables 0 0 0 : slabdata 0 0 0
fuse_inode 0 0 448 18 2 : tunables 0 0 0 : slabdata 0 0 0
fat_inode_cache 0 0 424 19 2 : tunables 0 0 0 : slabdata 0 0 0
fat_cache 0 0 24 170 1 : tunables 0 0 0 : slabdata 0 0 0
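One way to find out which object type is growing is to take two slabinfo snapshots some time apart and compare them. The sketch below estimates each cache's footprint as num_objs * objsize, using the column layout from the header line above. The helper names and the second snapshot's numbers are invented for illustration.

```python
def parse_slabinfo(text):
    """Return {cache_name: num_objs * objsize in bytes} from /proc/slabinfo text."""
    usage = {}
    for line in text.splitlines():
        # Skip blank lines, the version line and the column header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        usage[name] = num_objs * objsize
    return usage


def growing_caches(before, after):
    """List (name, growth_in_bytes) for caches that grew, largest growth first."""
    growth = [(name, after[name] - before.get(name, 0))
              for name in after if after[name] > before.get(name, 0)]
    return sorted(growth, key=lambda item: item[1], reverse=True)


header = "slabinfo - version: 2.1\n# name <active_objs> <num_objs> <objsize> ...\n"
first = header + """fuse_inode 0 0 448 18 2 : tunables 0 0 0 : slabdata 0 0 0
fat_cache 170 170 24 170 1 : tunables 0 0 0 : slabdata 1 1 0
"""
# Invented later snapshot: fuse_inode has grown to 36 objects.
second = header + """fuse_inode 36 36 448 18 2 : tunables 0 0 0 : slabdata 2 2 0
fat_cache 170 170 24 170 1 : tunables 0 0 0 : slabdata 1 1 0
"""
print(growing_caches(parse_slabinfo(first), parse_slabinfo(second)))
```

A cache that shows up at the top of this list across many consecutive snapshots is a good candidate for closer inspection.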

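The kernel configuration already includes some options for automatic memory-leak detection, such as kmemleak (CONFIG_DEBUG_KMEMLEAK). Enable them for tracing and debugging.
Nothing here goes very deep; consider it a starting point for further discussion.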
