Why does a Rust project doing async network IO using Tokio have thousands of writes to file descriptor 5?

Gurwinder Singh :

I am profiling my code's system calls using strace and found some surprising results. The trace shows 47254 single-byte writes to file descriptor 5 during a network transfer of 200 MB of data:

write(5, "\1", 1)

What does this write mean? What is fd 5? Where could it be originating from? Is there a way to find out?

I am not very well versed in Linux fundamentals.

Output of ls -lrt /proc/24393/fd:

lrwx------ 1 95th 95th 64 Mar  1 20:56 9 -> 'socket:[97676]'
lr-x------ 1 95th 95th 64 Mar  1 20:56 7 -> /dev/random
lr-x------ 1 95th 95th 64 Mar  1 20:56 6 -> /dev/urandom
l-wx------ 1 95th 95th 64 Mar  1 20:56 5 -> 'pipe:[98345]'
lr-x------ 1 95th 95th 64 Mar  1 20:56 4 -> 'pipe:[98345]'
lrwx------ 1 95th 95th 64 Mar  1 20:56 3 -> 'anon_inode:[eventpoll]'
lrwx------ 1 95th 95th 64 Mar  1 20:56 2 -> /dev/pts/0
lrwx------ 1 95th 95th 64 Mar  1 20:56 1 -> /dev/pts/0
lrwx------ 1 95th 95th 64 Mar  1 20:56 0 -> /dev/pts/0

I checked what that pipe is (though that didn't help much):

/proc/24393/fd$ lsof | grep 98345
btrs      24393       95th    4r     FIFO               0,11       0t0            98345 pipe
btrs      24393       95th    5w     FIFO               0,11       0t0            98345 pipe
tokio-run 24393 24394 95th    4r     FIFO               0,11       0t0            98345 pipe
tokio-run 24393 24394 95th    5w     FIFO               0,11       0t0            98345 pipe
tokio-run 24393 24395 95th    4r     FIFO               0,11       0t0            98345 pipe
tokio-run 24393 24395 95th    5w     FIFO               0,11       0t0            98345 pipe
user1937198 :

These writes come from mio (the I/O layer underneath Tokio) and are used to wake up worker threads that are blocked in an epoll_wait syscall when they need to be woken by something other than a file descriptor becoming ready — for example, a message arriving on a channel. Since those threads are blocked in the OS on a syscall, unblocking them requires a syscall of some sort to tell the OS to do so: a single byte is written to the write end of a pipe (your fd 5) whose read end (your fd 4) is registered with the epoll instance (your fd 3, the anon_inode:[eventpoll]), which makes epoll_wait return. This is the classic "self-pipe" wakeup pattern.

Seeing a lot of these writes suggests you have workers that are frequently idle. The alternatives to this syscall are to keep those threads in a polling busy-wait (far more expensive in syscalls and CPU time) or to not use the workers at all until they are woken by external I/O (limiting your concurrency). I would suggest you check whether these writes actually have a measurable performance impact, or whether they are a symptom of bottlenecks elsewhere in your application.
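To make the mechanism concrete, here is a minimal sketch of the self-pipe wakeup pattern using raw epoll via the libc crate. This is an illustration of the general technique, not mio's actual implementation (and newer mio versions use an eventfd rather than a pipe on Linux); error handling is reduced to asserts for brevity.

// Cargo.toml: libc = "0.2"
use std::thread;

fn main() {
    unsafe {
        // Create a pipe: fds[0] is the read end (fd 4 in your listing),
        // fds[1] is the write end (fd 5, the one strace shows).
        let mut fds = [0i32; 2];
        assert_eq!(libc::pipe(fds.as_mut_ptr()), 0);
        let (pipe_rd, pipe_wr) = (fds[0], fds[1]);

        // Register the pipe's read end with an epoll instance
        // (the 'anon_inode:[eventpoll]' on fd 3 in your listing).
        let epfd = libc::epoll_create1(0);
        let mut ev = libc::epoll_event {
            events: libc::EPOLLIN as u32,
            u64: 0,
        };
        assert_eq!(libc::epoll_ctl(epfd, libc::EPOLL_CTL_ADD, pipe_rd, &mut ev), 0);

        // Worker thread: blocks in the OS inside epoll_wait until some
        // registered fd becomes readable.
        let worker = thread::spawn(move || unsafe {
            let mut events = [libc::epoll_event { events: 0, u64: 0 }; 1];
            let n = libc::epoll_wait(epfd, events.as_mut_ptr(), 1, -1);
            println!("worker woke up, {} event(s) ready", n);
            // Drain the wakeup byte so the pipe doesn't stay readable.
            let mut buf = [0u8; 1];
            libc::read(pipe_rd, buf.as_mut_ptr() as *mut _, 1);
        });

        // Any other thread can now wake the worker by writing one byte to
        // the pipe -- this is exactly the write(5, "\1", 1) in your strace.
        libc::write(pipe_wr, b"\x01".as_ptr() as *const _, 1);

        worker.join().unwrap();
    }
}

Running this under strace -f shows the same write(fd, "\1", 1)-style call (the fd numbers will vary). To attribute such writes in your own program, strace -k -e trace=write -p <pid> prints a userspace stack trace for each write, and strace -c -f summarizes syscall counts and time, which should tell you whether these wakeups actually cost you anything.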
