Operating System: Linux kernel deadlock detection (with a simple deadlock demo)

Problem Description

    In production, a Linux system occasionally freezes completely: nothing useful is printed on the screen or the serial port, the network is down, and the keyboard and mouse do not respond.
    This kind of failure may be caused by a deadlock in the Linux kernel. Because nothing useful is printed and the kernel log contains no record, the root cause cannot be located.
    
 Getting the Linux kernel to print relevant information before it locks up is therefore critical for locating the problem. One effective method is to enable the relevant "Kernel hacking" options and recompile the kernel. The configuration options that help with kernel lockups on Linux (3.14.28) are:

    Kernel hacking  --->
        Debug Lockups and Hangs  --->
            [*] Detect Hard and Soft Lockups
            [*]   Panic (Reboot) On Soft Lockups
            [*] Detect Hung Tasks
            (120) Default timeout for hung task detection (in seconds)
            [*]   Panic (Reboot) On Hung Tasks

What are soft lockup and hard lockup?

Soft lockup: preemption stays disabled for a long time, so no other task can be scheduled on that CPU.
Hard lockup: interrupts stay disabled for a long time, which is more serious because the CPU cannot even service interrupts.
The detection principle for these lockups is analyzed later in this article.

lockdep deadlock detection module

    ABBA is the simplest form of deadlock. The kernel, however, contains thousands of locks with complicated relationships, and not every developer can be expected to know the differences between spin_lock, spin_lock_irq, spin_lock_irqsave, and spin_lock_nested. Prevention is therefore better than cure: potential deadlock risks should be discovered and fixed during development, rather than waiting until a real deadlock gives users a bad experience. The lockdep deadlock detection module was created for this purpose and was introduced into the kernel in 2006 (https://lwn.net/Articles/185666/).

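Related kernel configuration options
    CONFIG_DEBUG_RT_MUTEXES=y
    Detects rt mutex deadlocks and automatically reports the deadlock context.

    CONFIG_DEBUG_SPINLOCK=y
    Detects spinlock problems such as use before initialization. Together with the NMI watchdog it can find spinlock deadlocks.

    CONFIG_DEBUG_MUTEXES=y
    Detects and reports mutex errors.

    CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
    Enables slowpath testing for wait/wound mutexes.

    CONFIG_DEBUG_LOCK_ALLOC=y
    Detects a lock (spinlock/rwlock/mutex/rwsem) that is freed while still held, a held lock that is re-initialized, or a lock that is still held when the process exits.

    CONFIG_PROVE_LOCKING=y
    Lets the kernel report detailed deadlock information before the deadlock actually happens. See /proc/lockdep_chains.

    CONFIG_LOCKDEP=y
    The master switch for all of lockdep. See /proc/lockdep and /proc/lockdep_stats.

    CONFIG_LOCK_STAT=y
    Records lock contention statistics, including wait time and hold time. See /proc/lock_stat.

    CONFIG_DEBUG_LOCKDEP=y
    Makes lockdep perform more self-checks while it runs, which adds considerable overhead.

    CONFIG_DEBUG_ATOMIC_SLEEP=y
    Sleeping in an atomic section can cause many unpredictable problems. Atomic sections include code that holds a spinlock, RCU read-side critical sections, regions with kernel preemption disabled, and interrupt handlers. This option reports such cases.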
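
The idea behind CONFIG_PROVE_LOCKING can be illustrated with a small userspace model. The sketch below is not the kernel's lockdep code. It only assumes that every lock belongs to a numbered class, records which class was taken while which other class was held, and refuses an acquisition that would close a cycle in that order graph. The names model_acquire, model_release, and MAX_CLASSES are made up for this illustration.

```c
#include <stdbool.h>

#define MAX_CLASSES 8   /* hypothetical limit for this sketch */

/* dep[a][b] == true means "class b was once acquired while class a was held". */
static bool dep[MAX_CLASSES][MAX_CLASSES];
static int held[MAX_CLASSES];   /* classes currently held, in acquisition order */
static int nheld;

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Record that lock class 'cls' is being acquired.
 * Returns false if 'cls' is out of range, or if the new edges
 * "held class -> cls" would close a cycle in the lock-order graph
 * (a potential deadlock). Returns true otherwise.
 */
bool model_acquire(int cls)
{
    if (cls < 0 || cls >= MAX_CLASSES)
        return false;
    for (int i = 0; i < nheld; i++) {
        bool seen[MAX_CLASSES] = { false };
        /* A cycle exists if cls already leads back to a held class. */
        if (reachable(cls, held[i], seen))
            return false;
    }
    for (int i = 0; i < nheld; i++)
        dep[held[i]][cls] = true;
    held[nheld++] = cls;
    return true;
}

/* Release the most recently acquired class (LIFO only, for brevity). */
void model_release(void)
{
    if (nheld > 0)
        nheld--;
}
```

Starting from an empty model, taking class 0 and then class 1 and releasing both succeeds. Taking class 1 and then class 0 afterwards makes model_acquire(0) return false, even though nothing is actually blocked, which is how a potential ABBA deadlock can be reported before it happens.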

Deadlock problem analysis

    A deadlock is a state in which multiple processes (threads) are blocked because each is waiting for resources held by the others.
    Once a deadlock has formed, the processes cannot resolve it themselves; it has to be broken from outside. More importantly, a deadlock not only affects the business of the processes involved, it also ties up system resources and affects other processes.
    For this reason the kernel includes deadlock detection mechanisms. Once a deadlocked process is found, the OS is restarted so that the problem is resolved quickly.
    Restarting is chosen because a distributed system can tolerate a single node crashing but cannot tolerate a single node computing incorrectly. Otherwise, detecting the deadlock and restarting the OS would do more harm than good.

 

What is deadlock

    A mutex protects a critical resource by making access to it mutually exclusive between threads (or processes). While one thread holds the lock without releasing it, any other thread that requests the lock must wait.
    When multiple threads compete for resources and end up waiting for each other, they are deadlocked: without outside intervention they will wait forever.
    Put simply, it is a system state in which two or more processes wait endlessly for a condition that can never become true.

 

Causes of deadlock

    ① Insufficient system resources: the system does not have enough resources to satisfy the running threads, so they deadlock while competing for them.
    ② Illegal execution order between threads: threads request and release resources in an improper order at run time.
    ③ Improper resource allocation.

 

Four conditions for deadlock

    Mutual exclusion: for a period of time, a resource can be held by only one thread. Any other thread that requests it must wait until the holder has finished with it and released it.
    No preemption: a resource cannot be taken away from a thread before the thread has finished with it; the thread can only release it voluntarily after use.
    Hold and wait: a thread already holds at least one resource and requests a new one. If the new resource is held by another thread, the requesting thread blocks and waits while keeping the resources it already has.
    Circular wait: when a deadlock occurs there must be a circular chain of threads {T0, T1, T2, ..., Tn} in which T0 is waiting for a resource held by T1, T1 is waiting for a resource held by T2, ..., and Tn is waiting for a resource held by T0.
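
The circular wait condition can be checked mechanically. The sketch below is a userspace illustration under a simplifying assumption: each thread waits for at most one other thread, so the wait-for relation is a plain array. The names in_circular_wait and NOT_WAITING are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define NOT_WAITING (-1)

/*
 * waits_for[i] is the index of the thread that holds the resource thread i
 * is blocked on, or NOT_WAITING if thread i is not blocked.
 * Returns true if following the chain from 'start' leads back to 'start',
 * that is, if 'start' is part of a circular wait.
 */
bool in_circular_wait(const int *waits_for, size_t n, size_t start)
{
    size_t cur = start;

    /* Any cycle through 'start' is found within n hops. */
    for (size_t hops = 0; hops < n; hops++) {
        int next = waits_for[cur];

        if (next < 0 || (size_t)next >= n)
            return false;           /* chain ends: nobody is waited for */
        if ((size_t)next == start)
            return true;            /* back at the start: circular wait */
        cur = (size_t)next;
    }
    return false;
}
```

For three threads with waits_for = {1, 2, 0}, every thread is in a circular wait. With waits_for = {1, 2, NOT_WAITING}, the chain ends at T2, so there is no deadlock.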

 

Common mistakes

    AA: locking the same lock twice
    ABBA: one path takes the locks in the order A then B, and another path takes them in the order B then A
    ABBCCA: an extension of ABBA, with the orders AB, BC, and CA. This kind of deadlock is hard to find by hand.
    Unlocking the same lock more than once
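
A common way to rule out ABBA is to always take a pair of locks in one fixed global order. The sketch below shows this in userspace with pthread mutexes, ordering by address. It is only an illustration of the technique, not kernel code, and lock_pair and unlock_pair are hypothetical helper names.

```c
#include <pthread.h>
#include <stdint.h>

/*
 * Lock two mutexes in a fixed global order (here: by address), so that
 * callers passing (a, b) and callers passing (b, a) acquire them in the
 * same order and cannot form an ABBA deadlock with each other.
 */
void lock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
    if ((uintptr_t)x > (uintptr_t)y) {
        pthread_mutex_t *tmp = x;
        x = y;
        y = tmp;
    }
    pthread_mutex_lock(x);
    if (y != x)                 /* same mutex passed twice: lock it once (avoids AA) */
        pthread_mutex_lock(y);
}

/* Release both mutexes; the order of release does not matter. */
void unlock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
    pthread_mutex_unlock(x);
    if (y != x)
        pthread_mutex_unlock(y);
}
```

A thread that calls lock_pair(&a, &b) and another that calls lock_pair(&b, &a) both end up taking the lower-addressed mutex first.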

 

Deadlock detection and recovery

    Recovery by resource preemption
    Recovery by rolling back to a safe state
    Recovery by killing processes
    
Reference article:
    https://blog.csdn.net/ccwzhu/article/details/81171092

One, D state deadlock detection

D state deadlock: a process stays in the TASK_UNINTERRUPTIBLE sleep state for a long time (120 seconds by default). In this state the process does not respond to asynchronous signals. A typical example is a process interacting with peripheral hardware (such as a read): the state is normally used to make sure the interaction between the process and the device is not interrupted, which could otherwise leave the device in an uncontrollable state.

The kernel detects D state deadlocks with the hung_task mechanism; the main code is in kernel/hung_task.c.

How it works:

1. A khungtaskd kernel thread with normal priority is created. In an infinite loop it runs a check every sysctl_hung_task_timeout_secs seconds and sleeps in between with schedule_timeout (which avoids spending CPU on a separate timer).

2. It uses the do_each_thread and while_each_thread macros to walk all tasks. For each task in D state it compares the task's current context-switch count with the count recorded at the previous check, that is, it checks whether the task has been scheduled recently. If the two counts are equal, the task has not been switched at all, and the relevant information is printed. The sysctl_hung_task_panic switch decides whether the system then panics and restarts.
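
The per-task check in step 2 can be sketched as a userspace model. The struct below is a simplified stand-in and not the kernel's task_struct. The field names (nvcsw, nivcsw, last_switch_count) follow what I understand kernel/hung_task.c to use on kernels of this era, which should be treated as an assumption; scan_task and task_model are hypothetical names.

```c
#include <stdbool.h>

/* Simplified stand-in for the few per-task fields the check looks at. */
struct task_model {
    bool uninterruptible;             /* task is in D state */
    unsigned long nvcsw;              /* voluntary context switches */
    unsigned long nivcsw;             /* involuntary context switches */
    unsigned long last_switch_count;  /* total seen by the previous scan */
};

/*
 * One scan of one task. Returns true if the task is in D state and has not
 * been switched since the previous scan, i.e. it would be reported as hung.
 */
bool scan_task(struct task_model *t)
{
    unsigned long switch_count = t->nvcsw + t->nivcsw;

    if (!t->uninterruptible)
        return false;
    if (switch_count == 0)
        return false;   /* never switched out yet: too new to judge */
    if (switch_count != t->last_switch_count) {
        /* The task ran since the last scan: remember the new total. */
        t->last_switch_count = switch_count;
        return false;
    }
    return true;
}
```

Because the scan runs once per timeout period, a task is reported only when two consecutive scans see the same switch count while the task is in D state.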

The corresponding proc interfaces for control from user space are:
/proc/sys/kernel/hung_task_timeout_secs, /proc/sys/kernel/hung_task_panic, etc.
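
These files each hold a single integer, so they can be read from a program as well as with cat. The helper below is a minimal sketch; read_sysctl_long is a hypothetical name, and the files only exist when the kernel was built with hung task detection.

```c
#include <stdio.h>

/*
 * Read a single integer from a sysctl-style file such as
 * /proc/sys/kernel/hung_task_timeout_secs.
 * Returns 0 and stores the value in *out on success, -1 on failure
 * (for example when the file does not exist on this kernel).
 */
int read_sysctl_long(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling read_sysctl_long("/proc/sys/kernel/hung_task_timeout_secs", &secs) would return the configured timeout in secs, or -1 when the option is not compiled in.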
    
    
Two, R state deadlock detection

R state deadlock: a process stays in the TASK_RUNNING state for a long time and monopolizes the CPU without being switched out. The original threshold was 60 seconds by default on older kernels; on the 3.x and 4.x kernels discussed here it is 2 * watchdog_thresh, 20 seconds by default, which matches the "stuck for 22s" message in the log at the end of this article. Usually the process has disabled preemption and then works for a long time; sometimes it enters an infinite loop or goes to sleep after disabling preemption, which makes the system behave abnormally.

Note: a lockup in this sense is not a deadlock in the strict sense, and the lockup detector should not be confused with lockdep.

The kernel detects R state deadlocks with the lockup detector (the soft lockup watchdog), whose entry point is the lockup_detector_init function.

1. watchdog_enable is called through the cpu_callback function to create a real-time watchdog thread with the SCHED_FIFO policy on each CPU core; an hrtimer controls the check period.

2. The hrtimer callback watchdog_timer_fn checks the time at which the watchdog was last fed, and the watchdog thread refreshes that time each time it runs. If watchdog_timer_fn finds that the last-fed time lags the current time by more than the threshold, it reports a soft lockup and, depending on the switch, panics.
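
The comparison made in step 2 can be sketched as a small pure function. This is a userspace model, not the kernel source: softlockup_duration is a hypothetical name, times are plain seconds, and the threshold of 2 * watchdog_thresh is my understanding of kernels of this era and should be treated as an assumption.

```c
/*
 * Model of the check made from the periodic timer callback.
 *   touch_ts        time (in seconds) at which the watchdog thread last ran
 *   now             current time (in seconds)
 *   watchdog_thresh value of the watchdog_thresh sysctl
 * Returns the stall duration in seconds if it exceeds the soft lockup
 * threshold, or 0 if the CPU is considered healthy.
 */
unsigned long softlockup_duration(unsigned long touch_ts, unsigned long now,
                                  unsigned long watchdog_thresh)
{
    unsigned long limit = 2 * watchdog_thresh;  /* assumed threshold */

    if (now > touch_ts && now - touch_ts > limit)
        return now - touch_ts;
    return 0;
}
```

As long as the SCHED_FIFO watchdog thread gets to run, touch_ts keeps moving forward and the function returns 0. When a task keeps preemption disabled, the thread cannot run, touch_ts stops advancing, and the timer callback eventually sees a non-zero duration.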

The proc interfaces corresponding to user mode control are:

/proc/sys/kernel/watchdog_thresh, /proc/sys/kernel/softlockup_panic, etc.

 

View lock information

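    /proc/sys/kernel/lock_stat  /* set to enable the /proc/lock_stat statistics, clear to turn lock statistics off */
    /proc/lock_stat             /* lock usage statistics */
    /proc/lockdep               /* locks that have dependencies */
    /proc/lockdep_stats         /* statistics on locks with dependencies */
    /proc/lockdep_chains        /* lock dependency chains */
    /proc/locks                 /* file locks currently held (not part of lockdep) */
    /proc/sys/kernel/prove_locking
    /proc/sys/kernel/max_lock_depth
    /sys/kernel/debug/tracing/events/lock  /* tracepoints the kernel provides to help find lock usage problems */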

Write a simple deadlock demo

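#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/spinlock.h>

static spinlock_t hack_spinA;
static spinlock_t hack_spinB;

/* Take A, then B. The locks are deliberately never released. */
static void hack_spinAB(void)
{
  pr_info("hack_lockdep:A->B\n");
  spin_lock(&hack_spinA);
  spin_lock(&hack_spinB);
}

/* Take B, then A. B is still held by hack_spinAB(), so this spins forever. */
static void hack_spinBA(void)
{
  pr_info("hack_lockdep:B->A\n");
  spin_lock(&hack_spinB);
  spin_lock(&hack_spinA);
}

/* WARNING: loading this module hangs the CPU that runs insmod. */
static int __init lockdep_test_init(void)
{
  pr_info("al: lockdep error test init\n");
  spin_lock_init(&hack_spinA);
  spin_lock_init(&hack_spinB);

  hack_spinAB();
  hack_spinBA();
  return 0;
}

module_init(lockdep_test_init);
MODULE_LICENSE("GPL");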
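
Because hack_spinAB never releases its locks, hack_spinBA cannot get past its first spin_lock(&hack_spinB). The same situation can be reproduced safely in userspace. The sketch below is only an analogue under stated assumptions: it uses pthread mutexes instead of kernel spinlocks, and pthread_mutex_trylock so that the caller gets an error code instead of hanging. The names demo_a, demo_b, demo_lock_ab, and demo_try_ba are hypothetical.

```c
#include <pthread.h>

static pthread_mutex_t demo_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t demo_b = PTHREAD_MUTEX_INITIALIZER;

/* Mirrors hack_spinAB(): takes A then B and returns without unlocking. */
static void demo_lock_ab(void)
{
    pthread_mutex_lock(&demo_a);
    pthread_mutex_lock(&demo_b);
}

/*
 * Mirrors the first step of hack_spinBA(), but with trylock.
 * Returns 0 if B could be taken, otherwise the non-zero pthread error
 * code reported because B is already locked.
 */
int demo_try_ba(void)
{
    int rc;

    demo_lock_ab();
    rc = pthread_mutex_trylock(&demo_b);   /* B is still held from above */

    /* Clean up so the function can be called again. */
    pthread_mutex_unlock(&demo_b);
    pthread_mutex_unlock(&demo_a);
    return rc;
}
```

In the kernel module there is no trylock and no cleanup, so the CPU simply spins with preemption disabled until the soft lockup watchdog reports it.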

Corresponding Makefile

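obj-m:=spin_lock_deadlock.o

# Target architecture; choose the one that matches your board
export ARCH=arm
# Cross-compilation toolchain prefix; choose the one that matches your toolchain
export CROSS_COMPILE=arm-himix200-linux-

# Kernel source directory; set this to the path on your machine
# The path below is where the author's Linux source tree lives
KERDIR := /home1/zhugeyifan/source/K5/3519av100/packages/linux_lsp/kernel/linux-4.9.37/

# Current directory
CURDIR := $(shell pwd)

# make builds the first target by default
# make -C runs make in the given directory
# $(KERDIR) is the Linux source directory, here .../linux-4.9.37/
# $(CURDIR) is the current directory
# modules is the action to perform
# Note: if the line below "all:" is highlighted in red after copying this into a Makefile, delete the spaces before make and insert a Tab instead
all:
	make -C $(KERDIR) M=$(CURDIR) modules

# make clean removes the build outputs such as .o files
# Note: if the line below "clean:" is highlighted in red after copying this into a Makefile, delete the spaces before make and insert a Tab instead
clean:
	make -C $(KERDIR) M=$(CURDIR) clean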

If the build reports an error, I suggest reading this article: Operating System-Linux compile a simple ko module

 

Generate

The build generates the corresponding ko module:

spin_lock_deadlock.ko

 

Run

Test whether the lockup detection configured above works.

Load the module with insmod. The kernel reports the following error, which means that the lockup detection has taken effect:

watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [insmod:306]
Modules linked in: spin_lock_deadlock(PO+) hi_user(O) hi_mipi_rx(O) hi3519av100_acodec(PO) hi3519av100_adec(PO) hi3519av100_aenc(PO) hi3519av100_ao(PO) hi3519av100_ai(PO) hi3519av100_aio(PO) hi3519av100_hdmi(PO) hi_sensor_spi(O) hi_sensor_i2c(O) hi_piris(O) hi_pwm(O) hi3519av100_dpu_match(PO) hi3519av100_dpu_rect(PO) hi3519av100_dsp(PO) hi3519av100_nnie(PO) hi_ipcm(O) hi3519av100_ive(PO) hi3519av100_vdec(PO) hi3519av100_vfmw(PO) hi3519av100_jpegd(PO) hi3519av100_jpege(PO) hi3519av100_h265e(PO) hi3519av100_h264e(PO) hi3519av100_vedu(PO) hi3519av100_chnl(PO) hi3519av100_venc(PO) hi3519av100_rc(PO) hifb(O) hi3519av100_vo(PO) hi3519av100_avs(PO) hi3519av100_vpss(PO) hi3519av100_vi(PO) hi3519av100_isp(PO) hi3519av100_dis(PO) hi3519av100_vgs(PO) hi3519av100_gdc(PO) hi3519av100_rgn(PO) hi3519av100_tde(PO) hi3519av100_sys(PO) hi3519av100_base(PO) hi_osal(O) sys_config(O)
CPU: 0 PID: 306 Comm: insmod Tainted: P           O    4.9.37 #402
Hardware name: Generic DT based system
task: d47ace40 task.stack: d470c000
PC is at _raw_spin_lock+0x2c/0x40
LR is at hack_spinBA+0x20/0x2c [spin_lock_deadlock]
pc : [<c06a4a04>]    lr : [<bf00204c>]    psr: 80000013
sp : d470ddf8  ip : 00000000  fp : 00000024
r10: 00000000  r9 : bf002100  r8 : 00000000
r7 : d47a3000  r6 : d47a3ec0  r5 : ffffe000  r4 : bf002300
r3 : 00000000  r2 : 00000001  r1 : 00000000  r0 : bf002304
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5383d  Table: 238dc06a  DAC: 00000051
CPU: 0 PID: 306 Comm: insmod Tainted: P           O    4.9.37 #402
Hardware name: Generic DT based system
[<c010f8b8>] (unwind_backtrace) from [<c010b4e0>] (show_stack+0x10/0x14)
[<c010b4e0>] (show_stack) from [<c0380bbc>] (dump_stack+0x84/0x98)
[<c0380bbc>] (dump_stack) from [<c0194ff8>] (watchdog_timer_fn+0x248/0x2c0)
[<c0194ff8>] (watchdog_timer_fn) from [<c016f534>] (__hrtimer_run_queues+0x128/0x1bc)
[<c016f534>] (__hrtimer_run_queues) from [<c016f71c>] (hrtimer_interrupt+0xa8/0x204)
[<c016f71c>] (hrtimer_interrupt) from [<c054b594>] (arch_timer_handler_phys+0x28/0x30)
[<c054b594>] (arch_timer_handler_phys) from [<c0163688>] (handle_percpu_devid_irq+0x74/0x134)
[<c0163688>] (handle_percpu_devid_irq) from [<c015eb30>] (generic_handle_irq+0x24/0x34)
[<c015eb30>] (generic_handle_irq) from [<c015f054>] (__handle_domain_irq+0x5c/0xb4)
[<c015f054>] (__handle_domain_irq) from [<c0101438>] (gic_handle_irq+0x48/0x8c)
[<c0101438>] (gic_handle_irq) from [<c010bfcc>] (__irq_svc+0x6c/0x90)
Exception stack(0xd470dda8 to 0xd470ddf0)
dda0:                   bf002304 00000000 00000001 00000000 bf002300 ffffe000
ddc0: d47a3ec0 d47a3000 00000000 bf002100 00000000 00000024 00000000 d470ddf8
dde0: bf00204c c06a4a04 80000013 ffffffff
[<c010bfcc>] (__irq_svc) from [<c06a4a04>] (_raw_spin_lock+0x2c/0x40)
[<c06a4a04>] (_raw_spin_lock) from [<bf00204c>] (hack_spinBA+0x20/0x2c [spin_lock_deadlock])
[<bf00204c>] (hack_spinBA [spin_lock_deadlock]) from [<d47a3ec0>] (0xd47a3ec0)
INFO: rcu_sched self-detected stall on CPU
	0-...: (59672 ticks this GP) idle=4c3/140000000000001/0 softirq=34656/34656 fqs=14660 
	 (t=60000 jiffies g=2778 c=2777 q=166)
Task dump for CPU 0:
insmod          R  running task        0   306    218 0x00000003
[<c010f8b8>] (unwind_backtrace) from [<c010b4e0>] (show_stack+0x10/0x14)
[<c010b4e0>] (show_stack) from [<c019857c>] (rcu_dump_cpu_stacks+0xa8/0xc4)
[<c019857c>] (rcu_dump_cpu_stacks) from [<c016bf54>] (rcu_check_callbacks+0x714/0x890)
[<c016bf54>] (rcu_check_callbacks) from [<c016e780>] (update_process_times+0x34/0x5c)
[<c016e780>] (update_process_times) from [<c017eaf0>] (tick_sched_timer+0x68/0x268)
[<c017eaf0>] (tick_sched_timer) from [<c016f534>] (__hrtimer_run_queues+0x128/0x1bc)
[<c016f534>] (__hrtimer_run_queues) from [<c016f71c>] (hrtimer_interrupt+0xa8/0x204)
[<c016f71c>] (hrtimer_interrupt) from [<c054b594>] (arch_timer_handler_phys+0x28/0x30)
[<c054b594>] (arch_timer_handler_phys) from [<c0163688>] (handle_percpu_devid_irq+0x74/0x134)
[<c0163688>] (handle_percpu_devid_irq) from [<c015eb30>] (generic_handle_irq+0x24/0x34)
[<c015eb30>] (generic_handle_irq) from [<c015f054>] (__handle_domain_irq+0x5c/0xb4)
[<c015f054>] (__handle_domain_irq) from [<c0101438>] (gic_handle_irq+0x48/0x8c)
[<c0101438>] (gic_handle_irq) from [<c010bfcc>] (__irq_svc+0x6c/0x90)
Exception stack(0xd470dda8 to 0xd470ddf0)
dda0:                   bf002304 00000000 00000001 00000000 bf002300 ffffe000
ddc0: d47a3ec0 d47a3000 00000000 bf002100 00000000 00000024 00000000 d470ddf8
dde0: bf00204c c06a4a04 80000013 ffffffff
[<c010bfcc>] (__irq_svc) from [<c06a4a04>] (_raw_spin_lock+0x2c/0x40)
[<c06a4a04>] (_raw_spin_lock) from [<bf00204c>] (hack_spinBA+0x20/0x2c [spin_lock_deadlock])
[<bf00204c>] (hack_spinBA [spin_lock_deadlock]) from [<d47a3ec0>] (0xd47a3ec0)

 


Reference article

    https://blog.csdn.net/ccwzhu/article/details/81171092
    The formation of deadlock
    Linux deadlock detection module Lockdep Introduction

 

Origin blog.csdn.net/Ivan804638781/article/details/100740857