When a kernel memory leak cannot be analyzed effectively with kmemleak, or when the leak is suspected to be in the slab allocator, the slabinfo data can be used for debugging.
Configure slub debug
SLUB debugging is only available when the relevant options are enabled in the kernel configuration. The main options are:
CONFIG_SLUB_DEBUG
CONFIG_SLUB_DEBUG_ON
Other dependent configurations:
CONFIG_SYSFS
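Before building, it can help to confirm that the options are actually set in the kernel .config. The following is only a minimal sketch: the helper name missing_options and the sample text are made up for illustration, and it only treats "=y" as enabled.

```python
import re

# Options listed in this section; adjust the tuple if your kernel needs more.
REQUIRED = ("CONFIG_SLUB_DEBUG", "CONFIG_SLUB_DEBUG_ON", "CONFIG_SYSFS")


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in a .config text."""
    enabled = set()
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=y\s*$", line)
        if match:
            enabled.add(match.group(1))
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    # Hypothetical .config excerpt, not taken from a real build.
    sample = (
        "CONFIG_SYSFS=y\n"
        "CONFIG_SLUB_DEBUG=y\n"
        "# CONFIG_SLUB_DEBUG_ON is not set\n"
    )
    print("missing:", missing_options(sample))
```

To check a real tree, pass the contents of the build's .config file to missing_options().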
slabinfo information
Once the kernel is built with these options, the /proc/slabinfo node is available on the device. It shows how much memory has been requested from each slab cache:
root@Linux:/# cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fuse_request 0 0 744 22 4 : tunables 0 0 0 : slabdata 0 0 0
fuse_inode 0 0 1152 14 4 : tunables 0 0 0 : slabdata 0 0 0
ubi_wl_entry_slab 490 506 368 22 2 : tunables 0 0 0 : slabdata 23 23 0
ubifs_inode_slab 0 0 1032 15 4 : tunables 0 0 0 : slabdata 0 0 0
bridge_fdb_cache 0 0 512 16 2 : tunables 0 0 0 : slabdata 0 0 0
ip6-frags 0 0 536 15 2 : tunables 0 0 0 : slabdata 0 0 0
fib6_nodes 0 0 512 16 2 : tunables 0 0 0 : slabdata 0 0 0
ip6_dst_cache 0 0 832 19 4 : tunables 0 0 0 : slabdata 0 0 0
ip6_mrt_cache 0 0 576 14 2 : tunables 0 0 0 : slabdata 0 0 0
RAWv6 6 22 1472 22 8 : tunables 0 0 0 : slabdata 1 1 0
UDPLITEv6 0 0 1472 22 8 : tunables 0 0 0 : slabdata 0 0 0
UDPv6 0 0 1472 22 8 : tunables 0 0 0 : slabdata 0 0 0
tw_sock_TCPv6 0 0 560 14 2 : tunables 0 0 0 : slabdata 0 0 0
request_sock_TCPv6 0 0 624 13 2 : tunables 0 0 0 : slabdata 0 0 0
TCPv6 0 0 2432 13 8 : tunables 0 0 0 : slabdata 0 0 0
nf_conntrack_expect 0 0 560 14 2 : tunables 0 0 0 : slabdata 0 0 0
nf_conntrack 0 0 640 12 2 : tunables 0 0 0 : slabdata 0 0 0
aipu_job_cache 1 15 536 15 2 : tunables 0 0 0 : slabdata 1 1 0
sd_ext_cdb 2 22 368 22 2 : tunables 0 0 0 : slabdata 1 1 0
sgpool-128 2 7 4544 7 8 : tunables 0 0 0 : slabdata 1 1 0
sgpool-64 2 13 2496 13 8 : tunables 0 0 0 : slabdata 1 1 0
sgpool-32 2 22 1472 22 8 : tunables 0 0 0 : slabdata 1 1 0
sgpool-16 2 17 960 17 4 : tunables 0 0 0 : slabdata 1 1 0
sgpool-8 2 23 704 23 4 : tunables 0 0 0 : slabdata 1 1 0
cfq_io_cq 0 0 448 18 2 : tunables 0 0 0 : slabdata 0 0 0
cfq_queue 0 0 576 14 2 : tunables 0 0 0 : slabdata 0 0 0
fat_inode_cache 0 0 1032 15 4 : tunables 0 0 0 : slabdata 0 0 0
fat_cache 0 0 368 22 2 : tunables 0 0 0 : slabdata 0 0 0
squashfs_inode_cache 312 320 1024 16 4 : tunables 0 0 0 : slabdata 20 20 0
dnotify_mark 0 0 424 19 2 : tunables 0 0 0 : slabdata 0 0 0
dnotify_struct 0 0 368 22 2 : tunables 0 0 0 : slabdata 0 0 0
dio 0 0 1152 14 4 : tunables 0 0 0 : slabdata 0 0 0
fasync_cache 0 0 384 21 2 : tunables 0 0 0 : slabdata 0 0 0
posix_timers_cache 0 0 560 14 2 : tunables 0 0 0 : slabdata 0 0 0
UNIX 13 20 1600 20 8 : tunables 0 0 0 : slabdata 1 1 0
ip4-frags 0 0 552 14 2 : tunables 0 0 0 : slabdata 0 0 0
ip_mrt_cache 0 0 576 14 2 : tunables 0 0 0 : slabdata 0 0 0
UDP-Lite 0 0 1344 12 4 : tunables 0 0 0 : slabdata 0 0 0
tcp_bind_bucket 1 18 448 18 2 : tunables 0 0 0 : slabdata 1 1 0
inet_peer_cache 0 0 576 14 2 : tunables 0 0 0 : slabdata 0 0 0
ip_fib_trie 0 0 384 21 2 : tunables 0 0 0 : slabdata 0 0 0
ip_fib_alias 0 0 392 20 2 : tunables 0 0 0 : slabdata 0 0 0
ip_dst_cache 0 0 640 12 2 : tunables 0 0 0 : slabdata 0 0 0
RAW 4 12 1280 12 4 : tunables 0 0 0 : slabdata 1 1 0
UDP 1 24 1344 12 4 : tunables 0 0 0 : slabdata 2 2 0
tw_sock_TCP 0 0 560 14 2 : tunables 0 0 0 : slabdata 0 0 0
request_sock_TCP 0 0 624 13 2 : tunables 0 0 0 : slabdata 0 0 0
TCP 1 14 2240 14 8 : tunables 0 0 0 : slabdata 1 1 0
eventpoll_pwq 11 20 408 20 2 : tunables 0 0 0 : slabdata 1 1 0
eventpoll_epi 11 28 576 14 2 : tunables 0 0 0 : slabdata 2 2 0
inotify_inode_mark 0 0 424 19 2 : tunables 0 0 0 : slabdata 0 0 0
scsi_data_buffer 0 0 360 22 2 : tunables 0 0 0 : slabdata 0 0 0
request_queue 1 14 2240 14 8 : tunables 0 0 0 : slabdata 1 1 0
blkdev_requests 0 0 696 23 4 : tunables 0 0 0 : slabdata 0 0 0
blkdev_ioc 5 18 440 18 2 : tunables 0 0 0 : slabdata 1 1 0
bio-0 4 12 640 12 2 : tunables 0 0 0 : slabdata 1 1 0
biovec-max 4 7 4544 7 8 : tunables 0 0 0 : slabdata 1 1 0
biovec-128 0 13 2496 13 8 : tunables 0 0 0 : slabdata 1 1 0
biovec-64 0 22 1472 22 8 : tunables 0 0 0 : slabdata 1 1 0
biovec-16 0 23 704 23 4 : tunables 0 0 0 : slabdata 1 1 0
uid_cache 0 0 512 16 2 : tunables 0 0 0 : slabdata 0 0 0
dmaengine-unmap-2 1 18 448 18 2 : tunables 0 0 0 : slabdata 1 1 0
sock_inode_cache 38 48 1024 16 4 : tunables 0 0 0 : slabdata 3 3 0
skbuff_fclone_cache 0 0 832 19 4 : tunables 0 0 0 : slabdata 0 0 0
skbuff_head_cache 1 24 640 12 2 : tunables 0 0 0 : slabdata 2 2 0
configfs_dir_cache 30 38 424 19 2 : tunables 0 0 0 : slabdata 2 2 0
file_lock_cache 0 15 544 15 2 : tunables 0 0 0 : slabdata 1 1 0
file_lock_ctx 4 20 392 20 2 : tunables 0 0 0 : slabdata 1 1 0
shmem_inode_cache 243 256 984 16 4 : tunables 0 0 0 : slabdata 16 16 0
proc_inode_cache 121 128 968 16 4 : tunables 0 0 0 : slabdata 8 8 0
sigqueue 0 16 496 16 2 : tunables 0 0 0 : slabdata 1 1 0
bdev_cache 2 13 1216 13 4 : tunables 0 0 0 : slabdata 1 1 0
kernfs_node_cache 16881 16881 456 17 2 : tunables 0 0 0 : slabdata 993 993 0
mnt_cache 24 38 832 19 4 : tunables 0 0 0 : slabdata 2 2 0
filp 150 240 640 12 2 : tunables 0 0 0 : slabdata 20 20 0
inode_cache 2814 2826 896 18 4 : tunables 0 0 0 : slabdata 157 157 0
dentry 3652 3660 528 15 2 : tunables 0 0 0 : slabdata 244 244 0
names_cache 0 35 4544 7 8 : tunables 0 0 0 : slabdata 5 5 0
buffer_head 5292 5292 440 18 2 : tunables 0 0 0 : slabdata 294 294 0
nsproxy 1 20 392 20 2 : tunables 0 0 0 : slabdata 1 1 0
vm_area_struct 386 520 792 20 4 : tunables 0 0 0 : slabdata 26 26 0
mm_struct 10 26 1216 13 4 : tunables 0 0 0 : slabdata 2 2 0
fs_cache 11 36 448 18 2 : tunables 0 0 0 : slabdata 2 2 0
files_cache 11 26 1216 13 4 : tunables 0 0 0 : slabdata 2 2 0
signal_cache 76 115 1408 23 8 : tunables 0 0 0 : slabdata 5 5 0
sighand_cache 76 91 2496 13 8 : tunables 0 0 0 : slabdata 7 7 0
task_struct 81 90 3456 9 8 : tunables 0 0 0 : slabdata 10 10 0
cred_jar 88 112 512 16 2 : tunables 0 0 0 : slabdata 7 7 0
anon_vma_chain 209 320 400 20 2 : tunables 0 0 0 : slabdata 16 16 0
anon_vma 131 220 408 20 2 : tunables 0 0 0 : slabdata 11 11 0
pid 80 96 512 16 2 : tunables 0 0 0 : slabdata 6 6 0
kmemleak_scan_area 4035 4048 368 22 2 : tunables 0 0 0 : slabdata 184 184 0
kmemleak_object 55398 55800 640 12 2 : tunables 0 0 0 : slabdata 4650 4650 0
radix_tree_node 216 221 912 17 4 : tunables 0 0 0 : slabdata 13 13 0
pool_workqueue 4 16 1024 16 4 : tunables 0 0 0 : slabdata 1 1 0
idr_layer_cache 191 195 2432 13 8 : tunables 0 0 0 : slabdata 15 15 0
task_group 4 18 896 18 4 : tunables 0 0 0 : slabdata 1 1 0
dma-kmalloc-8192 0 0 8704 3 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-4096 0 0 4608 7 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-2048 0 0 2560 12 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-1024 0 0 1536 21 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-512 0 0 1024 16 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-256 0 0 768 21 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-128 0 0 640 12 2 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-8192 10 12 8704 3 8 : tunables 0 0 0 : slabdata 4 4 0
kmalloc-4096 39 42 4608 7 8 : tunables 0 0 0 : slabdata 6 6 0
kmalloc-2048 116 120 2560 12 8 : tunables 0 0 0 : slabdata 10 10 0
kmalloc-1024 468 483 1536 21 8 : tunables 0 0 0 : slabdata 23 23 0
kmalloc-512 550 560 1024 16 4 : tunables 0 0 0 : slabdata 35 35 0
kmalloc-256 1390 1428 768 21 4 : tunables 0 0 0 : slabdata 68 68 0
kmalloc-128 19391 19404 640 12 2 : tunables 0 0 0 : slabdata 1617 1617 0
kmem_cache_node 113 120 640 12 2 : tunables 0 0 0 : slabdata 10 10 0
kmem_cache 113 126 768 21 4 : tunables 0 0 0 : slabdata 6 6 0
From the slabinfo output above we can see how many objects, and therefore how much memory, have been allocated from each slab cache. If there is a memory leak, comparing snapshots taken at different times shows which cache the leaked memory was allocated from.
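To see at a glance which caches hold the most memory, the dump can be parsed on the host. This is a rough sketch, not a polished tool: parse_slabinfo and top_caches are names I made up, and the footprint is approximated as num_objs * objsize (with SLUB debug enabled, objsize includes the debug metadata, which is why kmalloc-128 reports 640).

```python
def parse_slabinfo(text):
    """Map cache name -> counters taken from a /proc/slabinfo (version 2.1) dump."""
    caches = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the version banner and the '# name ...' header.
        if len(fields) < 4 or fields[0] in ("slabinfo", "#"):
            continue
        active_objs, num_objs, objsize = (int(x) for x in fields[1:4])
        caches[fields[0]] = {
            "active_objs": active_objs,
            "num_objs": num_objs,
            "objsize": objsize,
            "bytes": num_objs * objsize,  # approximate footprint
        }
    return caches


def top_caches(text, count=5):
    """Return (name, bytes) pairs for the largest caches, biggest first."""
    caches = parse_slabinfo(text)
    ranked = sorted(caches.items(), key=lambda kv: kv[1]["bytes"], reverse=True)
    return [(name, info["bytes"]) for name, info in ranked[:count]]


if __name__ == "__main__":
    # Three lines copied from the dump above.
    sample = """slabinfo - version: 2.1
kernfs_node_cache 16881 16881 456 17 2 : tunables 0 0 0 : slabdata 993 993 0
kmemleak_object 55398 55800 640 12 2 : tunables 0 0 0 : slabdata 4650 4650 0
kmalloc-128 19391 19404 640 12 2 : tunables 0 0 0 : slabdata 1617 1617 0
"""
    for name, size in top_caches(sample):
        print(f"{name:24s} {size / 1024:10.1f} KiB")
```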
slabinfo troubleshooting
Suppose we compare two snapshots and find that num_objs for kmalloc-128 has grown significantly. The first suspicion is then that a driver allocates memory from the kmalloc-128 cache through kmalloc() and never frees it.
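The comparison of two snapshots can be scripted. Below is a minimal sketch, assuming both snapshots were saved with cat /proc/slabinfo; the function names, the threshold of 1000 objects, and the "after" numbers in the demo are my own illustration, not measured data.

```python
def parse_num_objs(text):
    """Return {cache name: num_objs} from a /proc/slabinfo dump."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] in ("slabinfo", "#"):
            continue
        result[fields[0]] = int(fields[2])
    return result


def slab_growth(before_text, after_text, min_delta=1000):
    """List (cache, growth in num_objs) for caches that grew by at least min_delta."""
    before = parse_num_objs(before_text)
    after = parse_num_objs(after_text)
    growth = [(name, n - before.get(name, 0)) for name, n in after.items()]
    growth = [(name, delta) for name, delta in growth if delta >= min_delta]
    return sorted(growth, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    before = """kmalloc-128 19391 19404 640 12 2 : tunables 0 0 0 : slabdata 1617 1617 0
dentry 3652 3660 528 15 2 : tunables 0 0 0 : slabdata 244 244 0
"""
    # Hypothetical second snapshot: only kmalloc-128 has grown.
    after = """kmalloc-128 45200 45204 640 12 2 : tunables 0 0 0 : slabdata 3767 3767 0
dentry 3652 3660 528 15 2 : tunables 0 0 0 : slabdata 244 244 0
"""
    for name, delta in slab_growth(before, after):
        print(f"{name}: +{delta} objects")
```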
At this point, run cat /sys/kernel/slab/kmalloc-128/alloc_calls to see which call sites have allocated from kmalloc-128. Again, compare the output of this node before and after the leak; the call sites whose counts differ noticeably point to the problem.
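The before/after comparison of alloc_calls can also be done with a small script. This sketch assumes each line starts with an object count followed by the call-site symbol (the layout I have seen on my kernels; other versions may differ). The function names and the call sites in the demo, such as my_driver_rx, are hypothetical.

```python
import re
from collections import Counter

# Leading count, then the call-site symbol, e.g. "  2100 foo+0x40/0x120 age=...".
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+)")


def parse_alloc_calls(text):
    """Return a Counter mapping call site -> number of live objects."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(2)] += int(match.group(1))
    return counts


def growing_callers(before_text, after_text, min_delta=1):
    """List (call site, growth) for sites whose count grew by at least min_delta."""
    before = parse_alloc_calls(before_text)
    after = parse_alloc_calls(after_text)
    deltas = [(site, after[site] - before[site]) for site in after]
    deltas = [(site, d) for site, d in deltas if d >= min_delta]
    return sorted(deltas, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical snapshots; the symbols are placeholders.
    before = (
        "    100 my_driver_rx+0x40/0x120 age=10/20/30 pid=1 cpus=0\n"
        "      5 other_init+0x20/0x40 age=1/1/1 pid=1 cpus=0\n"
    )
    after = (
        "   2100 my_driver_rx+0x40/0x120 age=10/500/900 pid=1 cpus=0\n"
        "      5 other_init+0x20/0x40 age=1/1/1 pid=1 cpus=0\n"
    )
    for site, delta in growing_callers(before, after):
        print(f"{site}: +{delta}")
```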
When running cat /sys/kernel/slab/kmalloc-128/alloc_calls, the output is often incomplete because too many call sites allocate from this cache. In that case I usually apply a small patch:
diff --git a/mm/slub.c b/mm/slub.c
index bdbb20631ca8..95ee1e7ba3bc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4552,6 +4552,9 @@ static int list_locations(struct kmem_cache *s, char *buf,
 	for (i = 0; i < t.count; i++) {
 		struct location *l = &t.loc[i];
+		if (l->count < 1000)
+			continue;
+
 		if (len > PAGE_SIZE - KSYM_SYMBOL_LEN - 100) {
 			print_buf = true;
 			pr_err("%s\n", buf);
With this patch, any call site that has allocated fewer than 1000 slab objects is not printed, which makes the before/after difference easy to compare. A leak is caused by repeated calls from the same call site, so its count stands out; adjust the threshold as needed.
Once the allocating call site is known, find out which driver calls it and check the code path step by step.
Capture the trace information of the slab object
Sometimes the alloc_calls and free_calls data for the slab cache is not enough to pin down the problem. In that case, capture the slab trace while reproducing the leak and analyze it. Use the following commands to capture the trace:
echo 0 > /proc/sys/kernel/printk  # set the console log level to 0 first; otherwise the serial console is flooded with log output
echo 1 >/sys/kernel/slab/kmalloc-128/trace;cat /proc/kmsg > /tmp/slab_trace.txt
Analyze the captured slab_trace.txt file and check whether the allocs and frees of the slab objects pair up as expected. With this method, older kmsg entries may be overwritten if the log buffer is not large enough. The log buffer size is 2^CONFIG_LOG_BUF_SHIFT bytes, so it can be enlarged by raising CONFIG_LOG_BUF_SHIFT:
17 => 128 KB
16 => 64 KB
15 => 32 KB
14 => 16 KB
13 => 8 KB
12 => 4 KB
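The table follows from size = 2^shift bytes. As a small worked example, the sketch below reproduces the table and picks the smallest shift for a desired buffer size; the helper names and the 200 KB figure are my own and only for illustration.

```python
def log_buf_bytes(shift):
    """Kernel log buffer size in bytes for a given CONFIG_LOG_BUF_SHIFT."""
    return 1 << shift


def min_shift_for(size_bytes):
    """Smallest shift whose buffer holds at least size_bytes (size_bytes >= 1)."""
    return (size_bytes - 1).bit_length()


if __name__ == "__main__":
    for shift in range(17, 11, -1):
        print(f"{shift} => {log_buf_bytes(shift) // 1024} KB")
    # Hypothetical requirement: about 200 KB of trace output.
    print("shift needed for 200 KB:", min_shift_for(200 * 1024))
```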
slab_trace.txt shows where each slab object was allocated. From there, find out which kernel driver makes the call and check step by step.
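Pairing allocs with frees by hand is tedious, so I sometimes let a script do it. The sketch below assumes the trace lines look like "TRACE kmalloc-128 alloc 0x... inuse=... fp=0x..." (the format printed by the SLUB trace code on the kernels I have used; verify it against your own log). The function name and the sample addresses are made up.

```python
import re

# Cache name, operation and object address from one SLUB trace line.
TRACE_RE = re.compile(r"TRACE (\S+) (alloc|free) (0x[0-9a-fA-F]+)")


def unfreed_objects(log_text, cache="kmalloc-128"):
    """Return {object address: line number of its alloc} for objects never freed."""
    live = {}
    for lineno, line in enumerate(log_text.splitlines(), 1):
        match = TRACE_RE.search(line)
        if not match or match.group(1) != cache:
            continue
        operation, address = match.group(2), match.group(3)
        if operation == "alloc":
            live[address] = lineno
        else:
            # Frees of objects allocated before tracing started are ignored.
            live.pop(address, None)
    return live


if __name__ == "__main__":
    # Hypothetical excerpt of slab_trace.txt.
    sample = (
        "<6>[  10.000001] TRACE kmalloc-128 alloc 0xffffffc012340000 inuse=3 fp=0x0\n"
        "<6>[  10.000002] TRACE kmalloc-128 alloc 0xffffffc012340080 inuse=4 fp=0x0\n"
        "<6>[  10.000003] TRACE kmalloc-128 free 0xffffffc012340000 inuse=3 fp=0x0\n"
    )
    for address, lineno in unfreed_objects(sample).items():
        print(f"{address} allocated at line {lineno} and never freed")
```

The line number of each unfreed alloc points at the spot in slab_trace.txt where the call stack that follows it can be read.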