On Linux, when I/O such as the network or a disk misbehaves, a process can get stuck. Even kill -9 cannot kill it, and many common debugging tools such as strace and pstack stop working. What is going on?
If we list the processes with ps at this point, the stuck process is shown in state D. man ps describes the D state as Uninterruptible Sleep.
Linux processes have two sleep states:
- Interruptible Sleep, shown as S in ps. A process in this state can be woken up by sending it a signal.
- Uninterruptible Sleep, shown as D in ps. A process in this state cannot immediately handle any signal sent to it, which is why it cannot be killed with kill.
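The state letter that ps prints can also be read directly from procfs. The following is a minimal sketch, assuming the usual /proc/<pid>/stat layout in which the state letter follows the parenthesized command name; the function names are made up for this illustration.

```python
import os

# Meaning of a few state letters, as listed in `man ps`.
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "T": "stopped",
    "Z": "zombie",
}


def parse_state(stat_line):
    """Return the one-letter state from the content of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so we split after the LAST ')'.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def find_processes_in_state(state="D", proc_root="/proc"):
    """Yield (pid, command) for every process currently in `state`."""
    if not os.path.isdir(proc_root):
        return  # no procfs here (for example, not a Linux system)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            with open(os.path.join(proc_root, entry, "stat"), errors="replace") as f:
                line = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        if parse_state(line) == state:
            comm = line[line.index("(") + 1 : line.rindex(")")]
            yield int(entry), comm


if __name__ == "__main__":
    for pid, comm in find_processes_in_state("D"):
        print(pid, comm, STATE_NAMES["D"])
```

Run as a script, it prints one line per process that is currently in uninterruptible sleep, and nothing if there is none.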
There is an answer on Stack Overflow:
kill -9 just sends a SIGKILL signal. When a process is in a special state (handling a signal, or inside a system call), it cannot handle any signal, including SIGKILL, so the process cannot be killed immediately. This is what we usually call the D state (uninterruptible sleep). Common debugging tools such as strace and pstack are generally implemented with a special signal, so they cannot be used in this state either.
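To make the "kill -9 only sends a signal" point concrete, here is a small sketch of the normal case, where the target is an ordinary sleeping child process and not one in the D state. The child command is an arbitrary choice for illustration.

```python
import signal
import subprocess
import sys

# Start a child that does nothing but sleep, so it is not stuck in the kernel.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Equivalent of `kill -9 <pid>`: deliver SIGKILL to the child.
child.send_signal(signal.SIGKILL)

# On POSIX, wait() returns the negated signal number when a signal
# terminated the process.
exit_code = child.wait(timeout=10)
print(exit_code == -signal.SIGKILL)
```

A process in the D state would receive the same signal; it simply does not act on it until it leaves uninterruptible sleep.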
So a process in the D state is generally inside a kernel-mode system call. How can we find out which system call it is and what it is waiting for? Fortunately, Linux provides procfs (the /proc directory), through which you can see the current kernel call stack of any process (reading /proc/<pid>/stack usually requires root). Next, we use a process accessing JuiceFS to simulate this. The JuiceFS client is based on FUSE, a user-space file system, which makes I/O failures easy to simulate.
First mount JuiceFS in the foreground (add the -f parameter to the ./juicefs mount command), then press Ctrl+Z to stop the process. Now access the mount point with ls /jfs, and you will find that ls is stuck.
The following command shows that ls is stuck in the vfs_fstatat call: it sends a getattr request to the FUSE device and is waiting for the response. Since we stopped the JuiceFS client process, it stays stuck:
$ cat /proc/`pgrep ls`/stack
[<ffffffff813277c7>] request_wait_answer+0x197/0x280
[<ffffffff81327d07>] __fuse_request_send+0x67/0x90
[<ffffffff81327d57>] fuse_request_send+0x27/0x30
[<ffffffff8132b0ac>] fuse_simple_request+0xcc/0x1a0
[<ffffffff8132c0f0>] fuse_do_getattr+0x120/0x330
[<ffffffff8132df28>] fuse_update_attributes+0x68/0x70
[<ffffffff8132e33d>] fuse_getattr+0x3d/0x50
[<ffffffff81220c6f>] vfs_getattr_nosec+0x2f/0x40
[<ffffffff81220ee6>] vfs_getattr+0x26/0x30
[<ffffffff81220fc8>] vfs_fstatat+0x78/0xc0
[<ffffffff8122150e>] SYSC_newstat+0x2e/0x60
[<ffffffff8122169e>] SyS_newstat+0xe/0x10
[<ffffffff8186281b>] entry_SYSCALL_64_fastpath+0x22/0xcb
[<ffffffffffffffff>] 0xffffffffffffffff
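Reading such a stack by eye works, but it can also be summarized programmatically. The sketch below assumes the frame format shown above, and it guesses the system call entry by looking for "sys_" in the symbol name, which matches SyS_newstat here and is only a heuristic; symbol naming differs between kernel versions. The function names are hypothetical.

```python
import re

# One frame looks like: [<ffffffff813277c7>] request_wait_answer+0x197/0x280
# The symbol is captured; the +offset/size suffix is optional.
FRAME_RE = re.compile(r"^\[<[0-9a-fA-F]+>\]\s+([^\s+]+)(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?")


def parse_stack(text):
    """Return the symbol names in /proc/<pid>/stack output, innermost first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        # The trailing "0xffffffffffffffff" line is a terminator, not a symbol.
        if match and not match.group(1).startswith("0x"):
            frames.append(match.group(1))
    return frames


def summarize(frames):
    """Return (innermost frame, first frame that looks like a syscall entry)."""
    syscall = next((f for f in frames if "sys_" in f.lower()), None)
    return (frames[0] if frames else None), syscall


def read_stack(pid):
    """Read the kernel stack of `pid`; this usually requires root."""
    with open("/proc/%d/stack" % pid) as f:
        return parse_stack(f.read())
```

For the stack above, summarize(parse_stack(...)) yields request_wait_answer as the place where the process sleeps and SyS_newstat as the system call that led there.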
At this point, pressing Ctrl+C does not make it exit:
root@localhost:~# ls /jfs
^C
^C^C^C^C^C
Attaching strace, however, wakes it up. It then processes the pending interrupt signals and exits:
root@localhost:~# strace -p `pgrep ls`
strace: Process 26469 attached
--- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=13290, si_uid=0} ---
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
...
tgkill(26469, 26469, SIGINT) = 0
--- SIGINT {si_signo=SIGINT, si_code=SI_TKILL, si_pid=26469, si_uid=0} ---
+++ killed by SIGINT +++
At this point, kill -9 can also kill it:
root@localhost:~# ls /jfs
^C
^C^C^C^C^C
^C^CKilled
This is because a simple system call like vfs_fstatat() does not block signals such as SIGKILL, SIGQUIT, and SIGABRT, so they can still be handled in the usual way.
Now let's simulate a more complex I/O error. Configure an unwritable storage type for JuiceFS, mount it, and use cp to try to write data into it. cp gets stuck as well:
root@localhost:~# cat /proc/`pgrep cp`/stack
[<ffffffff813277c7>] request_wait_answer+0x197/0x280
[<ffffffff81327d07>] __fuse_request_send+0x67/0x90
[<ffffffff81327d57>] fuse_request_send+0x27/0x30
[<ffffffff81331b3f>] fuse_flush+0x17f/0x200
[<ffffffff81218fd2>] filp_close+0x32/0x80
[<ffffffff8123ac53>] __close_fd+0xa3/0xd0
[<ffffffff81219043>] SyS_close+0x23/0x50
[<ffffffff8186281b>] entry_SYSCALL_64_fastpath+0x22/0xcb
[<ffffffffffffffff>] 0xffffffffffffffff
Why is it stuck in __close_fd()? Because writing data to JuiceFS is asynchronous. When cp calls write(), the data is cached in the JuiceFS client process and written to the backend storage asynchronously. After cp finishes writing, it calls close to make sure the write has completed, which corresponds to the FUSE flush operation. When the JuiceFS client receives a flush, it must make sure that all written data has been persisted to the backend storage. Since writes to the backend are failing, the client keeps retrying, so the flush operation hangs without replying to cp, and cp is stuck as well.
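The write-then-flush behavior described above can be illustrated with a toy model. This is not JuiceFS code: the class names, the put() method, and the retry limit are all assumptions made for the sketch, and the toy gives up after a few attempts so that it terminates, whereas the text describes a client that keeps retrying.

```python
class FailingBackend:
    """Hypothetical stand-in for a storage backend that rejects every write."""

    def put(self, data):
        raise IOError("backend is not writable")


class MemoryBackend:
    """Hypothetical backend that always succeeds and keeps data in memory."""

    def __init__(self):
        self.stored = []

    def put(self, data):
        self.stored.append(data)


class WriteBackFile:
    """Toy write-back cache: write() buffers, flush() persists."""

    def __init__(self, backend, max_retries=3):
        self.backend = backend
        self.max_retries = max_retries
        self.pending = []

    def write(self, data):
        # Only buffers the data, so it succeeds even if the backend is broken.
        self.pending.append(data)
        return len(data)

    def flush(self):
        # Every buffered chunk must reach the backend before flush returns.
        while self.pending:
            chunk = self.pending[0]
            for _ in range(self.max_retries):
                try:
                    self.backend.put(chunk)
                    break
                except IOError:
                    continue  # retry the same chunk
            else:
                raise IOError("flush failed after %d attempts" % self.max_retries)
            self.pending.pop(0)

    def close(self):
        # close() is where the caller finally waits for the backend.
        self.flush()


f = WriteBackFile(FailingBackend())
f.write(b"hello")  # returns immediately, nothing has been persisted yet
try:
    f.close()
except IOError as err:
    print("close failed:", err)
```

The point of the model is only the ordering: the failure stays invisible during write() and surfaces when close() triggers the flush, which is where cp is waiting in the stack above.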
At this point, cp can be interrupted with Ctrl+C or kill. JuiceFS implements interrupt handling for all kinds of file system operations, so it abandons the current operation (for example flush) and returns EINTR. This way, applications accessing JuiceFS can still be interrupted when various network failures occur.
If I now stop the JuiceFS client process so that it can no longer handle any FUSE request (including interrupt requests), cp can no longer be killed, not even with kill -9. ps shows that it is in the D state:
root 1592 0.1 0.0 20612 1116 pts/3 D+ 12:45 0:00 cp parity /jfs/aaa
Even so, cat /proc/1592/stack can still be used to see its kernel call stack:
root@localhost:~# cat /proc/1592/stack
[<ffffffff8132775d>] request_wait_answer+0x12d/0x280
[<ffffffff81327d07>] __fuse_request_send+0x67/0x90
[<ffffffff81327d57>] fuse_request_send+0x27/0x30
[<ffffffff81331b3f>] fuse_flush+0x17f/0x200
[<ffffffff81218fd2>] filp_close+0x32/0x80
[<ffffffff8123ac53>] __close_fd+0xa3/0xd0
[<ffffffff81219043>] SyS_close+0x23/0x50
[<ffffffff8186281b>] entry_SYSCALL_64_fastpath+0x22/0xcb
[<ffffffffffffffff>] 0xffffffffffffffff
The kernel call stack shows that it is stuck in a FUSE flush call. As soon as the JuiceFS client process is resumed, cp can be interrupted immediately and exits.
An operation like close, which affects data safety, is not restartable, so it cannot be interrupted arbitrarily by SIGKILL. In the FUSE implementation, for example, it can only be interrupted when the file system responds to the interrupt request.
Therefore, as long as the JuiceFS client process is healthy enough to respond to interrupts, there is no need to worry about applications accessing JuiceFS getting stuck. Alternatively, killing the JuiceFS client process also ends the current mount point and interrupts all applications accessing it.
If this was helpful, please follow our project Juicedata/JuiceFS! (0ᴗ0✿)