19. The Linux text-processing commands sort and uniq explained in detail

Contents

1. sort command syntax and parameters

2. sort command usage demonstration

3. uniq command syntax and parameters

4. uniq command usage demonstration


1. sort command syntax and parameters

Unordered data often needs to be put in order before it can be processed further; the sort command does this.

Syntax

sort [-bcdfimMnru][-o<output file>][-t<delimiter>][+<start field>-<end field>][--help][--version][file][ -k field1[,field2]]

Parameter description:

  • -b Ignore leading blank characters on each line when locating the sort key.
  • -c Check whether the file is already sorted; nothing is sorted.
  • -d Dictionary order: only letters, digits, and blanks are considered when comparing; other characters are ignored.
  • -f Ignore case by treating lowercase letters as uppercase when comparing.
  • -i Consider only printable characters (ASCII 040 to 176, octal) when comparing; other characters are ignored.
  • -m Merge several files that are already sorted, without re-sorting them.
  • -M Sort by three-letter month abbreviation (JAN before FEB, and so on).
  • -n Sort by numeric value.
  • -u Unique: of the lines that compare equal, only the first is output.
  • -o <output file> Save the sorted result to the specified file.
  • -r Sort in reverse order.
  • -t<delimiter> Use the given character as the field delimiter when sorting.
  • +<start field>-<end field> Sort on the given fields, from the start field up to, but not including, the end field. This is an obsolete form; use -k instead.
  • --help Display help.
  • --version Display version information.
  • -k field1[,field2] Sort on the key that starts at field1 and ends at field2 (or at the end of the line if field2 is omitted).
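As a rough illustration of how a few of these options combine, the sketch below sorts a small colon-delimited file in several ways. The file name scores.txt and its contents are made up for this example, and LC_ALL=C is set only so that text comparison behaves the same on every system.

```shell
#!/bin/sh
# Hypothetical sample data in the form name:score (not from the article).
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'tom:9' 'amy:100' 'bob:25' 'amy:100' > scores.txt

# -t sets the field delimiter; -k2,2 limits the key to field 2 only.
# Without -n the key is compared as text, so "100" sorts before "25".
by_text=$(sort -t: -k2,2 scores.txt)

# -n compares the key as a number, -r reverses the order,
# -u keeps one line per distinct key.
by_number_desc=$(sort -t: -k2,2 -n -r -u scores.txt)

# -o writes the result to a file instead of standard output.
sort -u -o sorted.txt scores.txt

# -c only checks: exit status 0 means the file is already sorted.
if sort -c sorted.txt; then checked=sorted; else checked=unsorted; fi

printf '%s\n' "$by_text" '---' "$by_number_desc" '---' "$checked"
```

The first result lists amy:100 twice, then bob:25 and tom:9, because the scores are compared character by character. The second lists amy:100, bob:25, tom:9 once each, ordered by numeric score from high to low.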

2. sort command usage demonstration

To sort the lines of a file with the default settings, use the following command:

[root@xiaopeng ~]# sort testfile

By default, sort compares whole lines, starting from the first character, in the collation order of the current locale (plain ASCII byte order in the C locale), and writes the result to standard output.

Displaying testfile with the cat command shows its original order:

[root@xiaopeng ~]# cat testfile      # original order of testfile
test 30  
Hello 95  
Linux 85 

After sorting with the sort command, the result is as follows:

[root@xiaopeng ~]# sort testfile # sorted result
Hello 95  
Linux 85  
test 30 
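Whether uppercase letters come before lowercase ones depends on the locale. The sketch below, which uses a made-up file words.txt, shows the byte-order behaviour of the C locale and how -f changes it. In other locales (for example en_US.UTF-8) the default order may already ignore case, so results on a given system can differ.

```shell
#!/bin/sh
# Hypothetical three-line file (not from the article).
workdir=$(mktemp -d)
printf '%s\n' apple Banana cherry > "$workdir/words.txt"

# In the C locale bytes are compared directly, so uppercase "B" (0x42)
# sorts before lowercase "a" (0x61).
c_order=$(LC_ALL=C sort "$workdir/words.txt")

# -f folds lowercase to uppercase before comparing, so case stops mattering.
folded_order=$(LC_ALL=C sort -f "$workdir/words.txt")

printf '%s\n' "$c_order" '---' "$folded_order"
```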

Using the -k option to sort on the second column gives the following result:

[root@xiaopeng ~]# sort testfile -k 2
test 30  
Linux 85 
Hello 95 
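Note that -k 2 on its own compares the key as text, from the second field through the end of the line. The result above looks numeric only because every score has two digits. The sketch below adds one assumed extra line, "Unix 100", which is not part of the original testfile, to show the difference between a text key and a numeric key.

```shell
#!/bin/sh
# testfile as shown above, plus one assumed extra line ("Unix 100")
# added only to expose the difference between text and numeric keys.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' 'test 30' 'Hello 95' 'Linux 85' 'Unix 100' > "$workdir/testfile"

# Text comparison of field 2 through end of line: "100" sorts before "30".
text_order=$(sort -k 2 "$workdir/testfile")

# Key restricted to field 2 only (2,2) and compared numerically (n).
numeric_order=$(sort -k 2,2n "$workdir/testfile")

printf '%s\n' "$text_order" '---' "$numeric_order"
```

With the text key, "Unix 100" comes first; with -k 2,2n it comes last, after "Hello 95".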

3. uniq command syntax and parameters

The uniq command removes duplicate lines, but only when they are adjacent. Running it on sorted input (sort file | uniq) gives the same result as sort -u.

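Usage: uniq [OPTION]... [FILE]

Filter adjacent matching lines from the input file or standard input, writing to the output file or standard output.

With no options, matching lines are merged into the first occurrence.

Mandatory arguments to long options are mandatory for short options too.

  -c, --count               prefix each line with the number of times it occurs

  -d, --repeated            only print duplicate lines, one for each group

  -D, --all-repeated[=delimit-method]    print all duplicate lines

  delimit-method={none(default),prepend,separate}    groups are delimited with an empty line

  -f, --skip-fields=N       skip the first N fields when comparing

  -i, --ignore-case         ignore differences in case when comparing

  -s, --skip-chars=N        skip the first N characters when comparing

  -u, --unique              only print unique lines

  -z, --zero-terminated     use '\0' as the line terminator instead of newline

  -w, --check-chars=N       compare no more than the first N characters of each line

      --help                display this help and exit

      --version             output version information and exit

A field is a run of blanks (usually spaces and/or tabs), then non-blank characters. Fields are skipped before characters.

Note: uniq does not detect repeated lines unless they are adjacent.

You may want to sort the input first, or use "sort -u" without uniq.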
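The comparison options (-f, -i, -w) are not demonstrated later in this article, so here is a minimal sketch of how they might be used. The file events.txt and its contents are invented for illustration; the options themselves are the ones listed in the help text above.

```shell
#!/bin/sh
# Hypothetical log-like data (not from the article):
# a timestamp field followed by a message.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' \
  '10:01 disk full' \
  '10:02 disk full' \
  '10:03 Disk Full' \
  '10:04 link down' > "$workdir/events.txt"

# -f 1 skips the first field, so only the message part is compared.
skip_field=$(uniq -f 1 "$workdir/events.txt")

# Adding -i also ignores case, so "Disk Full" merges with "disk full".
skip_field_nocase=$(uniq -f 1 -i "$workdir/events.txt")

# -w 4 compares only the first 4 characters ("10:0" on every line),
# so all four lines form one group; -c prints the size of that group.
first_chars=$(uniq -c -w 4 "$workdir/events.txt")

printf '%s\n' "$skip_field" '---' "$skip_field_nocase" '---' "$first_chars"
```

In each case uniq keeps the first line of a group, which is why the surviving timestamp is always the earliest one.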

Remove duplicate lines (uniq alone removes only adjacent duplicates; the two forms that sort first remove all of them):

[root@xiaopeng ~]# uniq file.txt
[root@xiaopeng ~]# sort file.txt | uniq
[root@xiaopeng ~]# sort -u file.txt

Show only the lines that occur exactly once:

[root@xiaopeng ~]# uniq -u file.txt
[root@xiaopeng ~]# sort file.txt | uniq -u

Count the number of times each line appears in the file:

[root@xiaopeng ~]# sort file.txt | uniq -c

Find duplicate lines in a file:

[root@xiaopeng ~]# sort file.txt | uniq -d
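A common way to build on the counting command is to rank lines by how often they occur, by piping the counts through a second, numeric sort. The sketch below assumes a small made-up file.txt; the awk step is only a convenience for pulling out the top entry.

```shell
#!/bin/sh
# Hypothetical input file (not from the article).
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' apple pear apple fig pear apple > "$workdir/file.txt"

# 1. sort groups identical lines together.
# 2. uniq -c collapses each group and prefixes it with its count.
# 3. sort -rn orders the groups by that count, largest first.
ranking=$(sort "$workdir/file.txt" | uniq -c | sort -rn)

# The most frequent line, taken from the second column of the first row.
top=$(printf '%s\n' "$ranking" | awk 'NR == 1 { print $2 }')

printf '%s\n' "$ranking"
printf 'most frequent: %s\n' "$top"
```

Here apple (3 occurrences) ranks first, followed by pear (2) and fig (1).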

4. uniq command usage demonstration

On its own, uniq removes only adjacent duplicate lines, so the 111 on the first line, which is not adjacent to the other 111 lines, is kept.

[root@xiaopeng ~]# uniq uniq.txt
111
223
56
111   # an adjacent duplicate 111 was removed here
567
223
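The article does not show the contents of uniq.txt. The sketch below uses one possible input that is consistent with the outputs in this section (an assumption, not the author's actual file) and compares uniq on its own with the two ways of sorting first.

```shell
#!/bin/sh
# Assumed contents of uniq.txt: the article does not show the file, so this
# is only one input that is consistent with the outputs in this section.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' 111 223 56 111 111 567 223 > "$workdir/uniq.txt"

# uniq alone merges only the two adjacent 111 lines (lines 4 and 5).
adjacent_only=$(uniq "$workdir/uniq.txt")

# After sorting, all duplicates are adjacent, so each value appears once.
after_sort=$(sort "$workdir/uniq.txt" | uniq)

# sort -u produces the same result in a single command.
sort_u=$(sort -u "$workdir/uniq.txt")

printf '%s\n' "$adjacent_only" '---' "$after_sort" '---' "$sort_u"
```

Because the values are compared as text here, 56 sorts after 223; add -n to sort if numeric order is wanted.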

Deduplication after sorting.

[root@xiaopeng ~]# sort uniq.txt | uniq
111
223
56
567

Use -d to display duplicate lines.

[root@xiaopeng ~]# sort uniq.txt | uniq  -d
111
223

Use -D to display all duplicated lines.

[root@xiaopeng ~]# sort uniq.txt | uniq  -D
111
111
111
223
223

Use -u to display unique lines.

[root@xiaopeng ~]# sort uniq.txt | uniq  -u
56
567

Use -c to count how many times each line occurs.

[root@xiaopeng ~]# sort uniq.txt | uniq  -c 
3 111
2 223
1 56
1 567

Use -d -c to count the number of occurrences of duplicate lines.

[root@xiaopeng ~]# sort uniq.txt | uniq  -d -c
3 111
2 223

Origin blog.csdn.net/weixin_46659843/article/details/124092976