More and more people are using Linux these days, so today I will introduce the join command.
1. Command introduction
join merges two files horizontally on a common field: lines whose join field has the same value are combined, as a Cartesian product of the matching lines from each file, and the result is written to standard output. By default, fields are separated by spaces or tabs. Both files must already be sorted on the join field before they are joined.
The Cartesian product of two sets X and Y is the set of all ordered pairs whose first member comes from X and whose second member comes from Y. For example, if X = {a, b} and Y = {0, 1, 2}, then
X×Y = {(a, 0), (a, 1), (a, 2), (b, 0), (b, 1), (b, 2)}
Y×X = {(0, a), (0, b), (1, a), (1, b), (2, a), (2, b)}
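In join, this product shows up when the same key occurs on several lines of both files: every matching line of the first file is paired with every matching line of the second. Here is a minimal sketch, assuming GNU join; the files x.txt and y.txt and the key k are made up to mirror the sets X and Y above.

```shell
# Use one collation for all tools; x.txt, y.txt and the key "k" are hypothetical.
export LC_ALL=C
tmp=$(mktemp -d)

# Two lines with key k in the first file, three in the second (both already sorted).
printf 'k a\nk b\n' > "$tmp/x.txt"
printf 'k 0\nk 1\nk 2\n' > "$tmp/y.txt"

# Each x.txt line is paired with each y.txt line that has the same key.
pairs=$(join "$tmp/x.txt" "$tmp/y.txt")
printf '%s\n' "$pairs"

rm -rf "$tmp"
```

Two lines times three lines gives six output lines, from "k a 0" to "k b 2", in the same order as X×Y above.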
2. Command format
join [OPTIONS] FILE1 FILE2
When FILE1 or FILE2 is a hyphen (-), that input is read from standard input. The two cannot both be - at the same time.
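Reading from standard input is handy when one input still needs to be sorted. The sketch below pipes the output of sort into join as FILE2. The file name people.txt and its contents are made up for illustration, and LC_ALL=C is set on the assumption that sort and join should use the same collation order.

```shell
# Use one collation for both sort and join; people.txt is a hypothetical file.
export LC_ALL=C
tmp=$(mktemp -d)
printf 'alice 30\nbob 25\n' > "$tmp/people.txt"

# The second input arrives unsorted, so sort it and hand it to join through "-".
joined=$(printf 'bob math 75\nalice english 15\n' | sort | join "$tmp/people.txt" -)
printf '%s\n' "$joined"

rm -rf "$tmp"
```

This should print "alice 30 english 15" followed by "bob 25 math 75".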
3. Option description
-a FILENUM  In addition to the normal output, also print the unpairable lines from file FILENUM. FILENUM is 1 or 2, corresponding to FILE1 or FILE2.
-e EMPTY  Replace missing input fields with the string EMPTY in the output.
-i, --ignore-case  Ignore case when comparing fields.
-j FIELD  Equivalent to -1 FIELD -2 FIELD.
-o FORMAT  Print the result in the specified format.
-t CHAR  Use CHAR as the field separator for input and output.
-v FILENUM  Like -a FILENUM, but print only the unpairable lines and suppress the joined lines.
-1 FIELD  Join on field FIELD of FILE1. FIELD is 1 for the first column, 2 for the second column, and so on.
-2 FIELD  Join on field FIELD of FILE2. FIELD is 1 for the first column, 2 for the second column, and so on.
--check-order  Check that the input is correctly sorted, even if all input lines are pairable.
--nocheck-order  Do not check that the input is correctly sorted.
--help  Display help information and exit.
--version  Display version information and exit.
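Several of these options are often combined. The following sketch, assuming GNU join, uses -t to read comma-separated files, -a 1 to keep unmatched lines from the first file, -o to choose the output columns, and -e to fill in missing fields. The file names left.csv and right.csv, their contents, and the placeholder NA are all made up for illustration.

```shell
# left.csv, right.csv and the filler "NA" are hypothetical examples.
export LC_ALL=C
tmp=$(mktemp -d)
printf '1,apple\n2,banana\n3,cherry\n' > "$tmp/left.csv"
printf '1,red\n3,purple\n' > "$tmp/right.csv"

# -t ','        : fields are separated by commas
# -a 1          : also print lines of left.csv that have no match
# -o 0,1.2,2.2  : print the join field, then field 2 of each file
# -e NA         : print NA where a requested field is missing
report=$(join -t ',' -a 1 -e NA -o 0,1.2,2.2 "$tmp/left.csv" "$tmp/right.csv")
printf '%s\n' "$report"

rm -rf "$tmp"
```

Key 2 has no partner in right.csv, so its line should come out as "2,banana,NA".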
4. Common examples
(1) Join two files. By default, the first column is used as the join field.
# file1 contains:
lvlv dablelv 25
zhangsan San 12

# file2 contains:
lvlv english 15
lvlv math 75
zhangsan math 14
zhouxun english 45

join file1 file2
lvlv dablelv 25 english 15
lvlv dablelv 25 math 75
zhangsan San 12 math 14
(2) Using the same two files, explicitly specify that the join is performed on the first column (the name).
join -j 1 file1 file2
# or
join -1 1 -2 1 file1 file2
(3) To also display lines that have no matching field in the other file, use -a1 or -a2 to print the unmatched lines of the first or second file.
join -a2 file1 file2
lvlv dablelv 25 english 15
lvlv dablelv 25 math 75
zhangsan San 12 math 14
zhouxun english 45    # the unmatched line from file2
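To print only the unmatched lines, without the joined ones, -v can be used in place of -a. The sketch below recreates file1 and file2 from the examples above in a temporary directory, assuming GNU join.

```shell
# Recreate the sample files from the examples above in a scratch directory.
export LC_ALL=C
tmp=$(mktemp -d)
printf 'lvlv dablelv 25\nzhangsan San 12\n' > "$tmp/file1"
printf 'lvlv english 15\nlvlv math 75\nzhangsan math 14\nzhouxun english 45\n' > "$tmp/file2"

# -v 2 prints only the lines of file2 that have no partner in file1.
only_in_file2=$(join -v 2 "$tmp/file1" "$tmp/file2")
# -v 1 does the same for file1; every file1 key also appears in file2.
only_in_file1=$(join -v 1 "$tmp/file1" "$tmp/file2")

printf 'only in file2: %s\n' "$only_in_file2"
printf 'only in file1: %s\n' "$only_in_file1"

rm -rf "$tmp"
```

Here only the zhouxun line of file2 is unmatched, and nothing in file1 is unmatched.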