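How a small /dev/null problem almost had me eating a shoe on a livestream

Our scheduled jobs, asynchronous MQ consumers and similar programs packaged as jars all call something like System.in.read() to block the main thread and keep the process from exiting. This never caused trouble in local testing, until a colleague reported that in the production Docker environment System.in.read() did not block and execution ran straight on to the code after it. A simplified version of the code is shown below.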

public static void main(String[] args) throws IOException, InterruptedException {
    System.out.println("enter main....");
    // start the scheduled jobs
    startJobSchedule();
    
    System.out.println("before system in read....");
    System.in.read();
    System.out.println("after system in read....");
}

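I glanced at it and said it was impossible: the code would definitely block on System.in.read(), and if it ever printed "after system in read....", I would eat a shoe on a livestream. Then we tried it. System.in.read() really did return, and the statements after it ran. The shoe was served right away. Mm, delicious.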

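By reading this article you will learn about the following.

The relationship between a process and its file descriptors (fds)
Where /dev/null comes from, with a look at the kernel source for reading and writing it
What redirection really is
A first look at pipes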

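Processes and file descriptors

Let's first look at the relationship between a process and its file descriptors. Besides its heap and stack, a process by convention starts with three open file descriptors: 0 for standard input (stdin), 1 for standard output (stdout) and 2 for standard error (stderr).

Back to the opening case. System.in.read() actually reads from stdin, which is fd 0. We printed the return value of System.in.read() together with what it read, and the experiment showed a return value of -1: the read hit EOF. That is odd. Why would reading stdin return EOF?

The next step is to find out what fd 0 actually points to. The /proc/<pid>/fd directory lists every file descriptor a process has open, and ls shows the current list as follows.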

$ ls -l /proc/1/fd
total 0
lrwx------ 1 root root 64 4月   3 17:13 0 -> /dev/null
l-wx------ 1 root root 64 4月   3 17:13 1 -> pipe:[31508]
l-wx------ 1 root root 64 4月   3 17:13 2 -> pipe:[31509]
l-wx------ 1 root root 64 4月   3 17:13 3 -> /app/logs/gc.log
lr-x------ 1 root root 64 4月   3 17:13 4 -> /jdk8/jre/lib/rt.jar
lr-x------ 1 root root 64 4月   3 17:13 5 -> /app/system-in-read-1.0-SNAPSHOT.jar

Fd 0 points to /dev/null, so the next thing to look at is /dev/null itself.
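The same check can also be made from inside the JVM. The following is a minimal sketch, assuming a Linux host where /proc is mounted; the class name FdTargets and the helper targetOf are hypothetical names made up for this illustration. It resolves the symbolic links /proc/self/fd/0, /proc/self/fd/1 and /proc/self/fd/2 of the current process.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; Linux-only because it relies on /proc.
class FdTargets {
    // Returns what the given descriptor of the current JVM process points to,
    // for example "/dev/null", "/dev/pts/0" or "pipe:[31508]".
    static String targetOf(int fd) throws IOException {
        return Files.readSymbolicLink(
                Paths.get("/proc/self/fd", String.valueOf(fd))).toString();
    }

    public static void main(String[] args) throws IOException {
        // Print the targets of stdin, stdout and stderr.
        for (int fd = 0; fd <= 2; fd++) {
            System.out.println(fd + " -> " + targetOf(fd));
        }
    }
}
```

Logging this once at startup would show straight away whether stdin is a terminal, a pipe or /dev/null in a given environment.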

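The /dev/null file

What /dev/null is

/dev/null is a special device file that discards everything written to it. Some people compare it to a "black hole", which is an apt image.

Besides discarding all writes, /dev/null has a second property: reading from it returns EOF immediately. This is why the System.in.read() call above returned right away.

The output of stat for /dev/null is as follows.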

$ stat /dev/null
  File: ‘/dev/null’
  Size: 0         	Blocks: 0          IO Block: 4096   character special file
Device: 5h/5d	Inode: 6069        Links: 1     Device type: 1,3
Access: (0666/crw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:null_device_t:s0
Access: 2020-03-27 19:27:37.857000000 +0800
Modify: 2020-03-27 19:27:37.857000000 +0800
Change: 2020-03-27 19:27:37.857000000 +0800

$ who -b
         system boot  2020-03-27 19:27

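The size of /dev/null is 0, and its access, modify and change times all match the system boot time. It is not a file on disk; it exists in memory as a device node whose type is "character special file".

Everything written to this file is discarded and the write call always reports success. This special file can never fill up, and its size cannot be changed.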
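Both properties are easy to observe from Java. The following is a minimal sketch, assuming a Linux or other Unix-like system where /dev/null exists; the class DevNullDemo and its method names are hypothetical and exist only for this illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical demo class; assumes /dev/null exists (Linux or another Unix).
class DevNullDemo {
    static final File DEV_NULL = new File("/dev/null");

    // Reads a single byte. /dev/null reports EOF at once, so read() returns -1.
    static int readOneByte() throws IOException {
        try (FileInputStream in = new FileInputStream(DEV_NULL)) {
            return in.read();
        }
    }

    // Writes the payload and returns the file size afterwards.
    // The write succeeds, the bytes are discarded, and the size stays 0.
    static long writeThenSize(byte[] payload) throws IOException {
        try (FileOutputStream out = new FileOutputStream(DEV_NULL)) {
            out.write(payload);
        }
        return DEV_NULL.length();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read: " + readOneByte());
        System.out.println("size after writing 1 MiB: "
                + writeThenSize(new byte[1 << 20]));
    }
}
```

On Linux this is expected to print a read result of -1 and a size of 0, no matter how much data is written.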

Another interesting detail is that tail -f /dev/null blocks forever. An abridged strace of the command is shown below.

$ strace tail -f /dev/null

open("/dev/null", O_RDONLY)             = 3
read(3, "", 8192)                       = 0
inotify_init()                          = 4
inotify_add_watch(4, "/dev/null", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 1
read(4,

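tail -f reads /dev/null and the read call returns 0, which means it hit EOF. tail then creates an inotify instance with the inotify_init system call and uses inotify_add_watch to watch /dev/null for the IN_MODIFY, IN_ATTRIB, IN_DELETE_SELF and IN_MOVE_SELF events. The four events mean the following.

IN_MODIFY: the file was modified
IN_ATTRIB: the file's metadata changed
IN_DELETE_SELF: the watched directory or file was deleted
IN_MOVE_SELF: the watched directory or file was moved

tail then blocks in a read on the inotify descriptor, waiting for one of these events. None of them normally occurs on /dev/null, so the tail command blocks indefinitely.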

/dev/null from the source code

The kernel code that handles /dev/null is at github.com/torvalds/li… . Writes to /dev/null are handled by the write_null function, whose source is shown below.

static ssize_t write_null(struct file *file, const char __user *buf,
			  size_t count, loff_t *ppos)
{
	return count;
}

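For a write to /dev/null the kernel does no work at all. It simply returns the count that was passed in, so the caller sees every byte as written.

Reads are handled by the read_null function, shown below.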

static ssize_t read_null(struct file *file, char __user *buf,
			 size_t count, loff_t *ppos)
{
	return 0;
}

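A read from /dev/null returns 0 immediately, which means EOF.

That is all there is to /dev/null. So why did the problem never show up in local testing? Locally the jar was started from a terminal, so the process's stdin was the terminal (keyboard input), and the read blocks for as long as nothing is typed. Next, let's reproduce the problem locally.

File descriptors and redirection

Standard input, standard output and standard error always keep their positions among the descriptors (0, 1 and 2), but what they point to can change. The redirection operators > and < exist to redirect these streams. To make the standard input of the process above /dev/null, the < operator is all that is needed. The earlier code is modified to print what was read and to sleep afterwards so that the process does not exit.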

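import java.io.IOException;
import java.util.concurrent.TimeUnit;

public class Main {
    public static void main(String[] args) throws IOException, InterruptedException {
        System.out.println("enter main....");
        byte[] buf = new byte[16];
        System.out.println("before system in read....");
        // Read into buf: returns the number of bytes read, or -1 on EOF.
        int length = System.in.read(buf);
        // Print only the bytes that were actually read (none on EOF).
        System.out.println("len: " + length + "\t" + new String(buf, 0, Math.max(length, 0)));
        TimeUnit.DAYS.sleep(1); // keep the process alive so its fds can be inspected
    }
}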

Package and run it. The output is as follows.

$ java -jar system-in-read-1.0-SNAPSHOT.jar < /dev/null

enter main....
before system in read....
len: -1

This reproduces what happened in the production Docker environment: System.in.read() does not block and returns -1.

The fd list of the process looks like this:

$ ls -l  /proc/482/fd

lr-x------. 1 ya ya 64 4月   3 20:00 0 -> /dev/null
lrwx------. 1 ya ya 64 4月   3 20:00 1 -> /dev/pts/6
lrwx------. 1 ya ya 64 4月   3 20:00 2 -> /dev/pts/6
lr-x------. 1 ya ya 64 4月   3 20:00 3 -> /usr/local/jdk/jre/lib/rt.jar
lr-x------. 1 ya ya 64 4月   3 20:00 4 -> /home/ya/system-in-read-1.0-SNAPSHOT.jar

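Standard input has now been replaced by /dev/null. When System.in.read() reads standard input, the read goes through the process's file descriptor table: the kernel looks up which stream descriptor 0 points to and reads from that stream.

The example above redirected standard input. Standard output and standard error can be redirected in the same way.

1> or > redirects standard output
2> redirects standard error

They can also be combined: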

java -jar system-in-read-1.0-SNAPSHOT.jar </dev/null > stdout.out 2> stderr.out

$ ls -l /proc/2629/fd

lr-x------. 1 ya ya 64 4月   3 20:35 0 -> /dev/null
l-wx------. 1 ya ya 64 4月   3 20:35 1 -> /home/ya/stdout.out
l-wx------. 1 ya ya 64 4月   3 20:35 2 -> /home/ya/stderr.out

This time file descriptors 0, 1 and 2 have all been replaced.
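The same three redirections can also be set up programmatically when one process starts another. The sketch below is only an illustration under a few assumptions: a Unix-like system with sh and cat on the PATH, and hypothetical names (RedirectDemo, runRedirected) that do not appear in the original program. It uses the standard ProcessBuilder redirect methods as the counterparts of <, > and 2>.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; assumes a Unix-like system with sh and cat.
class RedirectDemo {
    // Starts a child with fd 0, 1 and 2 redirected and returns
    // { contents of the stdout file, contents of the stderr file }.
    static String[] runRedirected() throws IOException, InterruptedException {
        Path out = Files.createTempFile("stdout", ".out");
        Path err = Files.createTempFile("stderr", ".out");
        try {
            // cat reads stdin until EOF; the echo calls write to fd 1 and fd 2.
            ProcessBuilder pb = new ProcessBuilder(
                    "sh", "-c", "cat; echo to-stdout; echo to-stderr 1>&2");
            pb.redirectInput(new File("/dev/null")); // like: < /dev/null
            pb.redirectOutput(out.toFile());         // like: > stdout file
            pb.redirectError(err.toFile());          // like: 2> stderr file
            pb.start().waitFor();
            return new String[] {
                new String(Files.readAllBytes(out), StandardCharsets.UTF_8),
                new String(Files.readAllBytes(err), StandardCharsets.UTF_8)
            };
        } finally {
            Files.deleteIfExists(out);
            Files.deleteIfExists(err);
        }
    }

    public static void main(String[] args) throws Exception {
        String[] result = runRedirected();
        System.out.print("stdout file: " + result[0]);
        System.out.print("stderr file: " + result[1]);
    }
}
```

Because the child's stdin is /dev/null, cat sees EOF at once and contributes nothing, and each echo line ends up in its own file.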

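What does the 2>&1 often seen in shell scripts mean?

Taken apart, 2> redirects stderr and &1 refers to stdout, so together they mean: make stderr point to wherever stdout currently points. Order matters, because &1 is resolved at the moment 2>&1 is processed, so it has to come after the stdout redirection. For example, sending both standard output and standard error to one file can be written like this.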

cat foo.txt > output.txt 2>&1
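When a child process is started from Java, the closest counterpart of 2>&1 is ProcessBuilder.redirectErrorStream(true), which merges the child's stderr into its stdout. The following is a minimal sketch, assuming a Unix-like system with sh on the PATH; MergeStderrDemo and runMerged are hypothetical names used only here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; assumes a Unix-like system with sh available.
class MergeStderrDemo {
    // Runs a child that writes one line to stdout and one to stderr,
    // and returns everything that arrives on the merged stream.
    static String runMerged() throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "sh", "-c", "echo to-stdout; echo to-stderr 1>&2");
        pb.redirectErrorStream(true); // stderr goes wherever stdout goes
        Process p = pb.start();
        StringBuilder merged = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                merged.append(line).append('\n');
            }
        }
        p.waitFor();
        return merged.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(runMerged());
    }
}
```

Both lines arrive on the single stream returned by getInputStream(), just as both would land in output.txt in the shell example.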

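Next, let's look at how file descriptors relate to pipes.

Pipes

A pipe is a one-way data stream. On the command line we often use pipes to connect commands. Take the following command as an example.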

nc -l 9090 | grep "hello" | wc -l

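Running this command involves the following steps.

The terminal session starts a zsh process
zsh starts the nc -l 9090 process
zsh starts the grep process and connects the standard output of nc to the standard input of grep through a pipe
zsh starts the wc process and connects the standard output of grep to the standard input of wc through a pipe

The process tree looks like this.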

  PID TTY      STAT   TIME COMMAND
23714 ?        Ss     0:00  \_ sshd: ya [priv]
23717 ?        S      0:00  |   \_ sshd: ya@pts/5  
23718 pts/5    Ss     0:00  |       \_ -zsh
 4812 pts/5    S+     0:00  |           \_ nc -l 9090
 4813 pts/5    S+     0:00  |           \_ grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exc
 4814 pts/5    S+     0:00  |           \_ wc -l

The file descriptor lists of the nc, grep and wc processes are as follows.


$ ls -l /proc/pid_of_nc/fd                                                                     

lrwx------. 1 ya ya 64 4月   3 21:22 0 -> /dev/pts/5
l-wx------. 1 ya ya 64 4月   3 21:22 1 -> pipe:[3852257]
lrwx------. 1 ya ya 64 4月   3 21:17 2 -> /dev/pts/5


$ ls -l /proc/pid_of_grep/fd

lr-x------. 1 ya ya 64 4月   3 21:22 0 -> pipe:[3852257]
l-wx------. 1 ya ya 64 4月   3 21:22 1 -> pipe:[3852259]
lrwx------. 1 ya ya 64 4月   3 21:17 2 -> /dev/pts/5

$ ls -l /proc/pid_of_wc/fd

lr-x------. 1 ya ya 64 4月   3 21:22 0 -> pipe:[3852259]
lrwx------. 1 ya ya 64 4月   3 21:22 1 -> /dev/pts/5
lrwx------. 1 ya ya 64 4月   3 21:17 2 -> /dev/pts/5

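The relationship is as follows: the stdout of nc (fd 1) and the stdin of grep (fd 0) are the two ends of pipe 3852257, and the stdout of grep and the stdin of wc are the two ends of pipe 3852259.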
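A similar chain of processes can be built from Java. The sketch below is an illustration under stated assumptions, not what zsh does internally: it needs Java 9 or later for ProcessBuilder.startPipeline (the production setup in this article runs JDK 8), it replaces nc -l 9090 with printf so that no network peer is required, and the names PipelineDemo and countHelloLines are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; assumes Java 9+ and printf, grep, wc on the PATH.
class PipelineDemo {
    // Equivalent in spirit to: printf '...' | grep hello | wc -l
    static String countHelloLines() throws IOException, InterruptedException {
        List<ProcessBuilder> stages = Arrays.asList(
                new ProcessBuilder("printf", "hello\\nworld\\nhello again\\n"),
                new ProcessBuilder("grep", "hello"),
                new ProcessBuilder("wc", "-l"));
        // Each stage's stdout is connected to the next stage's stdin by a pipe.
        List<Process> processes = ProcessBuilder.startPipeline(stages);
        Process last = processes.get(processes.size() - 1);
        String result;
        try (InputStream out = last.getInputStream()) {
            // Only the last stage's stdout is visible to the Java parent.
            result = new String(out.readAllBytes(), StandardCharsets.UTF_8).trim();
        }
        for (Process p : processes) {
            p.waitFor();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("matching lines: " + countHelloLines());
    }
}
```

Two of the three lines produced by printf contain "hello", so the count reported by wc -l is 2.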

On Linux, pipes are created with the pipe function. A typical call looks like this.

int fd[2];
if (pipe(fd) < 0) {
    printf("%s\n", "pipe error");
    exit(1);
}

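pipe creates a pipe and returns two file descriptors: fd[0] is used to read from the pipe and fd[1] is used to write to it. The following program shows how a parent and a child process communicate through a pipe.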

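#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define BUF_SIZE 20

int main(void) {
  int fd[2];
  if (pipe(fd) < 0) {
    perror("pipe error");
    exit(1);
  }
  pid_t pid = fork();
  if (pid < 0) {
    perror("fork error");
    exit(1);
  }

  // child process: writes one message per second
  if (pid == 0) {
    const char msg[] = "hello from child\n";
    close(fd[0]); // the child does not read, so close its read end
    while (1) {
      // sizeof(msg) - 1 leaves out the terminating NUL byte
      ssize_t n = write(fd[1], msg, sizeof(msg) - 1);
      if (n < 0) {
        perror("write error");
        exit(1);
      }
      sleep(1);
    }
  }

  // parent process: reads from the pipe and prints what it gets
  char buf[BUF_SIZE];
  close(fd[1]); // the parent does not write, so close its write end
  while (1) {
    // leave room for the NUL terminator that printf("%s") needs
    ssize_t n = read(fd[0], buf, BUF_SIZE - 1);
    if (n <= 0) { // 0 means EOF (the child exited), negative means error
      printf("read error or eof\n");
      exit(1);
    }
    buf[n] = '\0';
    printf("read from parent: %s", buf);
    sleep(1);
  }
  return 0;
}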

Run the program, and the string written by the child process is read by the parent and printed to the terminal.

$ ./pipe_test
read from parent: hello from child
read from parent: hello from child
read from parent: hello from child
read from parent: hello from child
read from parent: hello from child

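Docker and stdin

To make the stdin of the process in a Docker container a terminal (keyboard input), start the container with docker run -it: -i keeps stdin open and -t allocates a pseudo-terminal. With the image started this way, listing the open file descriptors of the process again shows that stdin, stdout and stderr have all changed, as shown below.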

$ docker exec -it 5fe22fbffe81 ls -l /proc/1/fd

total 0
lrwx------ 1 root root 64 4月   5 23:20 0 -> /dev/pts/0
lrwx------ 1 root root 64 4月   5 23:20 1 -> /dev/pts/0
lrwx------ 1 root root 64 4月   5 23:20 2 -> /dev/pts/0

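The java process now blocks on the System.in.read() call as well.

Summary

Starting from a small example, this article introduced the three basic file descriptors of a process (stdin, stdout and stderr) and showed how they can be redirected. It also touched on pipes along the way. Right, I am full of shoe now. Time for bed.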
