Redirection of standard output and standard error in Linux

If a command needs to run on a server for a long time, the nohup command is often used. With nohup, the command keeps running on the server even if the remote ssh session is disconnected.

nohup is normally used together with &, which runs the command in the background, as follows:

nohup ./example.sh &

The & puts example.sh in the background on the server, and nohup ensures that example.sh keeps running unaffected on the remote server even if the current ssh connection is interrupted.

Recently I had two paired-end sequencing files that needed to be aligned to a reference genome, which can be done with bwa. Because the files are fairly large and the run takes a long time, I used nohup to guard against ssh interruptions caused by an unstable network connection:

nohup bwa mem ref.fa read1.fq.gz read2.fq.gz > read12.sam &

This aligns the two sequencing files and generates a sam file.
I typed the command, hit Enter, and that was that. I left it alone, and more than ten hours later a very large file, read12.sam, had been generated.
However, when I used the sam file to generate a bam file, I was told that the sam file contained an error! More than ten hours of computation had been wasted.
After carefully checking the sam file, I found that the program's results and the messages it printed while running had been written to the same file!

1. Standard output and standard error

Standard output (stdout) is the default destination for a command's results. For example, in bash:

$ echo 'hello'
hello

By default, the output 'hello' is displayed on your terminal.
As another example, display a text file with the cat command:

$ cat hello.txt
Hello!
This is a test!

However, if the text file does not exist in the current path, an error will be output:

$ cat No_exist.txt
cat: No_exist.txt: No such file or directory
The message "cat: No_exist.txt: No such file or directory" shown here is written to standard error (stderr), not to standard output.

2. Redirect output

By default, both standard output and standard error are displayed in the terminal. Sending standard output to a file instead of the terminal is called output redirection, and it is done with the ">" symbol.

echo 'hello' > hello.txt

This writes "hello" to the file hello.txt, which the system creates if it does not exist. If the file already exists in the path, its old contents are overwritten.
You can also use the ">>" symbol, which does not overwrite the old file but appends "hello" to the end of it.
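
The difference between ">" and ">>" can be checked with a minimal sketch like the one below. The temporary directory and the variable names (demo_dir, overwritten, appended) are only illustrative and are not part of the original example.

```shell
# Work in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d)

echo 'hello' > "$demo_dir/hello.txt"    # creates the file
echo 'world' > "$demo_dir/hello.txt"    # ">" overwrites: only "world" remains
overwritten=$(cat "$demo_dir/hello.txt")

echo 'again' >> "$demo_dir/hello.txt"   # ">>" appends a second line
appended=$(cat "$demo_dir/hello.txt")

echo "after >  : $overwritten"
echo "after >> :"
echo "$appended"
```

After the second ">" the file holds only "world"; after ">>" it holds "world" followed by "again".
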
So here is a question: can everything printed to the terminal be redirected to a file? For example, can standard error be written to a file with ">"?

$ cat No_exist.txt > output.txt
cat: No_exist.txt: No such file or directory

It turns out: no! output.txt is still an empty file; the error message does not appear in it and is still displayed in the terminal. A plain ">" redirects only standard output, so it cannot send standard error to a file.
So how can standard error be written to a file?

3. File descriptor

In bash, three integers (file descriptors) represent standard input (0), standard output (1) and standard error (2). A plain ">" is shorthand for "1>".
To write standard error to a file, put its descriptor in front of the redirection:

cat No_exist.txt 2> tt.txt

Now the standard error message "cat: No_exist.txt: No such file or directory" appears in the file tt.txt.
In the same way, standard error can be redirected to standard output with 2>&1, for example:

$ cat No_exist.txt 
cat: No_exist.txt: No such file or directory
$ cat No_exist.txt 2>&1
cat: No_exist.txt: No such file or directory

Although the two look the same on the terminal, they are different streams. The first is written to standard error, and the second to standard output. A pipe makes the difference visible.

$ cat No_exist.txt | sed 's/or/and/'
cat: No_exist.txt: No such file or directory
$ cat No_exist.txt 2>&1| sed 's/or/and/'
cat: No_exist.txt: No such file and directory

Now the difference is visible. A pipe carries only standard output, so in the first command the error message bypasses sed and "or" is not replaced. In the second command the message has become standard output, so sed replaces the first "or" with "and".
By the same reasoning, standard output can be redirected to standard error with "1>&2" (or simply ">&2").
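
Command substitution behaves like a pipe in this respect: it captures only standard output. The sketch below uses a hypothetical helper, emit_both, that writes one line to each stream; the helper and the variable names are assumptions made for illustration. /dev/null is used here simply to discard a stream.

```shell
# Hypothetical helper: one line on each stream.
emit_both() {
    echo "result line"          # standard output (fd 1)
    echo "progress line" >&2    # standard error (fd 2)
}

# Only standard output is captured; standard error is discarded.
stdout_only=$(emit_both 2>/dev/null)

# 2>&1 merges standard error into standard output, so both lines are captured.
both=$(emit_both 2>&1)

# Redirections apply left to right: fd 2 goes to /dev/null first,
# then 1>&2 sends standard output to the same place. Nothing is captured.
moved=$(emit_both 2>/dev/null 1>&2)

echo "stdout only: $stdout_only"
echo "both:"
echo "$both"
echo "after 1>&2: '$moved'"
```

The last case shows why the order of redirections matters: each one is applied in turn, using whatever the target descriptor points to at that moment.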

Looking back at the original question: why did nohup write both the results and the progress messages to the same sam file?
For simplicity, the problem is reproduced with the following script (example.sh).

#!/bin/bash

echo "this is outcome!"
sleep 1
echo "sleep for 1s" >&2
echo "this is outcome, too!"
sleep 2
echo "second sleep for 2s" >&2

The progress messages about sleeping are written to standard error through >&2, and the results are written to standard output.
Run normally:

$ ./example.sh > outcome.txt
sleep for 1s
second sleep for 2s

$ cat outcome.txt 
this is outcome!
this is outcome, too!

Standard error goes straight to the terminal, and the results are written to outcome.txt without any problem. With nohup, however, the situation changes.
The description of nohup mentions that "if standard output is a terminal, the output is appended to the 'nohup.out' file; if standard error is a terminal, it is redirected to standard output". This means that, unless you redirect standard error yourself, it ends up wherever standard output goes. In the bwa command above, standard output was redirected to read12.sam while standard error was still the terminal, so nohup sent standard error into read12.sam as well. The example script shows the same effect:

$ nohup ./example.sh 
appending output to nohup.out

$ cat nohup.out 
this is outcome!
sleep for 1s
this is outcome, too!
second sleep for 2s

The nohup.out file contains not only the results I want but also the progress messages I don't want! This explains why my sam file contained so much content that did not belong there.
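
The decision nohup makes depends on whether a descriptor is attached to a terminal. The shell can ask the same question with "[ -t N ]". The sketch below is only an illustration of that check, not nohup's actual code; fd_target is a hypothetical helper name.

```shell
# Hypothetical helper: report whether file descriptor $1 is a terminal.
fd_target() {
    if [ -t "$1" ]; then
        echo "terminal"
    else
        echo "not a terminal"
    fi
}

# Standard error redirected away from the terminal: nothing for nohup to change.
stderr_kind=$(fd_target 2 2>/dev/null)

# Inside $(...), standard output is a pipe rather than a terminal.
stdout_kind=$(fd_target 1)

echo "stderr when redirected: $stderr_kind"
echo "stdout inside \$(...): $stdout_kind"
```

Both checks report "not a terminal". Running fd_target 2 without any redirection in an interactive session would report "terminal" instead, which is the case where nohup steps in.
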
Now that the cause is known, the problem is easy to solve: redirect the results and the progress messages to different files.

$ nohup ./example.sh 2>stderr.log 1>outcome.txt

$ cat stderr.log 
sleep for 1s
second sleep for 2s

$ cat outcome.txt 
this is outcome!
this is outcome, too!

In this way, the progress messages are written to stderr.log as standard error, and the results are written to outcome.txt as standard output.
So the command from the beginning of this article can be changed to:

nohup bwa mem ref.fa read1.fq.gz read2.fq.gz 1> read12.sam 2>read12.log &
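
Before committing to a run of many hours, the same redirection can be rehearsed on a small stand-in. The sketch below assumes a hypothetical function, fake_tool, that behaves like example.sh (results on standard output, progress on standard error); the file names result.txt and run.log are likewise illustrative.

```shell
work_dir=$(mktemp -d)

# Hypothetical stand-in for a long-running tool.
fake_tool() {
    echo "result 1"
    echo "progress: step 1" >&2
    echo "result 2"
    echo "progress: step 2" >&2
}

# Same pattern as the corrected command: results and log in separate files.
fake_tool 1> "$work_dir/result.txt" 2> "$work_dir/run.log"

# Count progress lines that leaked into the result file.
# grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
stray=$(grep -c '^progress' "$work_dir/result.txt" || true)

if [ "$stray" -eq 0 ]; then
    echo "result file is clean"
else
    echo "result file contains $stray progress line(s)"
fi
```

If the result file comes out clean on the stand-in, the same redirection pattern can be used for the real job.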

Summary

By default, when standard error is a terminal, nohup redirects it to wherever standard output goes, so both streams end up in the same file. Use the file descriptors (0, 1, 2) to control where each stream is written. Develop good output-control habits: treat standard output and standard error separately.


