irq 18: nobody cared (try booting with the "irqpoll" option)

Copyright notice: if you repost this article, please follow my WeChat official account 青儿创客基地 https://blog.csdn.net/Zhu_Zhu_2009/article/details/89350928

References

irq 29: nobody cared (try booting with the "irqpoll" option): problem description
A brief introduction to Linux kernel boot options

Problem

While debugging NVMe storage, the following appeared in the kernel log:

irq 18: nobody cared (try booting with the "irqpoll" option)
CPU: 2 PID: 1477 Comm: kworker/2:3 Tainted: G           O    4.1.35-rt41-fdk-1.0.0-20190116.1935 #35
Workqueue: events .nvme_async_probe [nvme]
Call Trace:
[c0000000fffab520] [c000000000872e6c] .dump_stack+0xac/0xec (unreliable)
[c0000000fffab5b0] [c0000000000852d0] .__report_bad_irq+0x4c/0x138
[c0000000fffab650] [c000000000085a3c] .note_interrupt+0x2f0/0x34c
[c0000000fffab700] [c000000000082054] .handle_irq_event_percpu+0x150/0x200
[c0000000fffab7d0] [c00000000008215c] .handle_irq_event+0x58/0xa4
[c0000000fffab850] [c0000000000861bc] .handle_fasteoi_irq+0xd4/0x280
[c0000000fffab8d0] [c0000000000813b0] .generic_handle_irq+0x4c/0x70
[c0000000fffab950] [c000000000005ac8] .__do_irq+0x5c/0xa8
[c0000000fffab9c0] [c000000000005bfc] .do_IRQ+0xe8/0x118
[c0000000fffaba50] [c000000000000d28] restore_check_irq_replay+0x2c/0x70
--- interrupt: 501 at .arch_local_irq_restore+0x60/0x70
    LR = .arch_local_irq_restore+0x60/0x70
[c0000000fffabd40] [c000000000068900] .vtime_common_account_irq_enter+0x34/0x60 (unreliable)
[c0000000fffabdb0] [c00000000003d43c] .__do_softirq+0xd8/0x314
[c0000000fffabeb0] [c00000000003dba8] .irq_exit+0xb8/0xe4
[c0000000fffabf20] [c000000000005ad0] .__do_irq+0x64/0xa8
[c0000000fffabf90] [c000000000013018] .call_do_irq+0x14/0x24
[c0000000f1c033b0] [c000000000005b98] .do_IRQ+0x84/0x118
[c0000000f1c03440] [c00000000001793c] exc_0x500_common+0xfc/0x100
--- interrupt: 501 at .arch_local_irq_restore+0x60/0x70
    LR = .arch_local_irq_restore+0x60/0x70
[c0000000f1c03730] [c0000000f1c03820] 0xc0000000f1c03820 (unreliable)
[c0000000f1c037a0] [c00000000086ffb4] ._raw_spin_unlock_irqrestore+0x60/0x74
[c0000000f1c03810] [c000000000084514] .__setup_irq+0x478/0x7d4
[c0000000f1c038c0] [c000000000084a58] .request_threaded_irq+0x110/0x23c
[c0000000f1c03970] [80000000009211e0] .queue_request_irq+0x4c/0x8c [nvme]
[c0000000f1c039e0] [8000000000922ab8] .nvme_dev_start.part.45+0x1ac/0x4e4 [nvme]
[c0000000f1c03ab0] [80000000009231d0] .nvme_async_probe+0xec/0x670 [nvme]
[c0000000f1c03ba0] [c000000000053424] .process_one_work+0x1f8/0x438
[c0000000f1c03c40] [c0000000000537e4] .worker_thread+0x180/0x5b0
[c0000000f1c03d30] [c0000000000595a4] .kthread+0xf0/0x110
[c0000000f1c03e30] [c000000000000998] .ret_from_kernel_thread+0x58/0xc0
handlers:
[<800000000092b3d0>] .nvme_irq [nvme]
[<800000000092b3d0>] .nvme_irq [nvme]
Disabling IRQ #18

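Searching turned up the explanation. When several devices share one interrupt number, the kernel calls each handler registered on that number in turn whenever the interrupt fires. A handler that finds the interrupt did not come from its own device returns IRQ_NONE, and the kernel moves on to the next handler. At least one of the handlers must return IRQ_HANDLED to tell the kernel that the interrupt was its own and has been serviced. If the kernel has called every handler on that interrupt number and none of them returned IRQ_HANDLED, it reports the error above, and once this has happened a certain number of times it disables the interrupt. In other words, an interrupt counts as properly completed only when a handler returns IRQ_HANDLED.
Because an FPGA also sits on the PCIe switch, I suspect the FPGA is raising spurious interrupts.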
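
The dispatch rule described above can be modeled in a few lines of userspace C. This is only a sketch, not kernel code: `struct irq_line`, `struct fake_dev`, `fake_dev_irq`, `deliver_irq` and `UNHANDLED_LIMIT` are all hypothetical names, the limit of 3 is chosen for illustration and is not the kernel's real threshold, and the sketch assumes every handler on the line is called and the results are combined.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel's handler return values. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

/* Illustrative limit only; NOT the threshold the kernel uses. */
#define UNHANDLED_LIMIT 3u

/* Hypothetical model of one shared interrupt line. */
struct irq_line {
    irq_handler_t handlers[4];
    void *dev_ids[4];
    size_t nr_handlers;
    unsigned int unhandled;   /* interrupts that no handler claimed */
    int disabled;             /* set once the line is given up on */
};

/* Hypothetical device: "pending" plays the role of a status register. */
struct fake_dev {
    int pending;
};

/* A well-behaved shared handler: claim the interrupt only if it is ours. */
irqreturn_t fake_dev_irq(int irq, void *dev_id)
{
    struct fake_dev *dev = dev_id;

    (void)irq;
    if (!dev->pending)
        return IRQ_NONE;      /* not our device, let the others look */
    dev->pending = 0;         /* acknowledge and service the device */
    return IRQ_HANDLED;
}

/* Call every handler on the line; count interrupts that nobody claimed. */
irqreturn_t deliver_irq(struct irq_line *line, int irq)
{
    irqreturn_t ret = IRQ_NONE;
    size_t i;

    if (line->disabled)
        return IRQ_NONE;

    for (i = 0; i < line->nr_handlers; i++) {
        if (line->handlers[i](irq, line->dev_ids[i]) == IRQ_HANDLED)
            ret = IRQ_HANDLED;
    }

    /* "nobody cared": too many unclaimed interrupts disables the line. */
    if (ret == IRQ_NONE && ++line->unhandled >= UNHANDLED_LIMIT)
        line->disabled = 1;

    return ret;
}
```

In this model, a device that asserts the shared line without setting anything a registered handler recognizes (which is what the FPGA is suspected of doing) makes every handler return IRQ_NONE, so `unhandled` climbs until the line is disabled, mirroring the `Disabling IRQ #18` message in the log.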

Interrupt-related kernel boot parameters

Boot parameters can be passed in through the u-boot bootargs variable.
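[KNL]
threadirqs
Forces all interrupt handlers to be threaded, except those explicitly marked IRQF_NO_THREAD.
[HW]
irqfixup
Fixes simple interrupt problems: when an interrupt goes unhandled, search all registered interrupt handlers for it. Intended as a workaround for some simple firmware bugs.
[HW]
irqpoll
Fixes more serious interrupt problems: when an interrupt goes unhandled, search all registered interrupt handlers for it, and additionally repeat that search on every timer interrupt. Intended as a workaround for some severe firmware bugs.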
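
After changing bootargs it is worth confirming that the parameter actually reached the kernel. The following is a minimal sketch that looks for a bare parameter such as `irqpoll` in the command line Linux exposes at `/proc/cmdline`. The helper names `cmdline_has_param` and `running_kernel_has_param` are hypothetical, and matching is done on whole whitespace-separated tokens only.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if `name` appears in `cmdline` as a whole
 * whitespace-separated token, otherwise 0.
 */
int cmdline_has_param(const char *cmdline, const char *name)
{
    const char *sep = " \t\n";
    size_t len = strlen(name);
    const char *p = cmdline;

    if (len == 0)
        return 0;

    while (*p) {
        size_t tok;

        p += strspn(p, sep);      /* skip leading separators */
        tok = strcspn(p, sep);    /* length of the next token */
        if (tok == len && strncmp(p, name, len) == 0)
            return 1;
        p += tok;
    }
    return 0;
}

/*
 * Check the running kernel's command line.
 * Returns 1 or 0, or -1 if /proc/cmdline cannot be read.
 */
int running_kernel_has_param(const char *name)
{
    char buf[4096];
    FILE *f = fopen("/proc/cmdline", "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof buf, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return cmdline_has_param(buf, name);
}
```

Calling `running_kernel_has_param("irqpoll")` on the target board returns 1 when the option was passed. Whole-token matching keeps an unrelated token that merely contains the string from being reported as a match.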
