43. The uniq command

uniq removes adjacent duplicate lines; uniq -c removes adjacent duplicates and also counts them.

1. Introduction to uniq:

uniq checks a given ASCII text file, or standard input, for repeated lines. It is commonly used in system troubleshooting and log analysis.

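2. Command format:

uniq [options] [file1] [file2]

uniq removes repeated lines from the already sorted text file file1 and writes the result to standard output, or to file2 if it is given. It is often used as a filter in a pipeline.

uniq only compares adjacent lines, with or without options, so duplicates that are not next to each other survive. Sort the input with sort first if every duplicate should be removed.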
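A minimal sketch of the two points above, assuming a POSIX shell and a throwaway directory from mktemp -d. The file names in.txt, sorted.txt and out.txt are made up for illustration. It shows that unsorted input keeps its non-adjacent duplicates, and that a second file argument receives the result instead of standard output.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is overwritten.
tmpdir=$(mktemp -d)
printf '%s\n' b a a b > "$tmpdir/in.txt"

# Unsorted input: only the adjacent pair "a a" collapses; the second "b" survives.
unsorted_result=$(uniq "$tmpdir/in.txt")

# Sorted input: every duplicate is adjacent, so each line appears once.
sorted_result=$(sort "$tmpdir/in.txt" | uniq)

# Two-argument form: the result is written to out.txt, not to standard output.
sort -o "$tmpdir/sorted.txt" "$tmpdir/in.txt"
uniq "$tmpdir/sorted.txt" "$tmpdir/out.txt"
file_result=$(cat "$tmpdir/out.txt")

printf 'unsorted:\n%s\nsorted:\n%s\n' "$unsorted_result" "$sorted_result"
rm -rf "$tmpdir"
```

In a pipeline the file arguments are normally omitted, so uniq reads standard input and writes standard output.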

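-c # Count: prefix each output line with the number of times it occurred. This is the most commonly used option; lines are compared as a whole.

-i # Ignore case when comparing lines.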
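The two options can be tried on a small sample. This is only a sketch: the sample lines (the same word in mixed case, already adjacent) are hypothetical, and awk is used purely to strip the padding that uniq -c puts in front of each count.

```shell
#!/bin/sh
# Hypothetical sample: the same host name in mixed case, already adjacent.
sample=$(printf '%s\n' Host host HOST other)

# Plain uniq is case-sensitive, so all four lines count as distinct.
plain_count=$(printf '%s\n' "$sample" | uniq | wc -l | tr -d ' ')

# -i ignores case when comparing; the first line of each run is the one kept.
folded=$(printf '%s\n' "$sample" | uniq -i)

# -c prefixes each output line with the size of its run.
# awk re-prints the two fields to drop the leading padding before the count.
counted=$(printf '%s\n' "$sample" | uniq -ci | awk '{print $1, $2}')

printf 'distinct without -i: %s\n' "$plain_count"
printf '%s\n' "$counted"
```

Combining the flags as -ci is the same as writing -c -i.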

3. Examples:

(1) Contents of the test file:

[root@backup scripts]# cat uniq.txt
10.0.0.9
10.0.0.8
10.0.0.7
10.0.0.7
10.0.0.8
10.0.0.8
10.0.0.9

(2) Run uniq without options; only adjacent duplicate lines are removed:

[root@backup scripts]# uniq uniq.txt
10.0.0.9
10.0.0.8
10.0.0.7
10.0.0.8
10.0.0.9

(3) Use sort to bring duplicate lines next to each other, then use uniq to remove them:

[root@backup scripts]# sort uniq.txt
10.0.0.7
10.0.0.7
10.0.0.8
10.0.0.8
10.0.0.8
10.0.0.9
10.0.0.9

[root@backup scripts]# sort uniq.txt | uniq
10.0.0.7
10.0.0.8
10.0.0.9

(4) sort uniq.txt | uniq is equivalent to sort -u uniq.txt:

[root@backup scripts]# sort -u uniq.txt
10.0.0.7
10.0.0.8
10.0.0.9
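The equivalence can be checked directly. The sketch below assumes a POSIX shell and uses made-up sample data built from the three addresses in this section; it captures the output of both forms and compares them.

```shell
#!/bin/sh
# Made-up sample: the three addresses from this section, repeated out of order.
data=$(printf '%s\n' 10.0.0.9 10.0.0.8 10.0.0.7 10.0.0.7 10.0.0.8 10.0.0.8 10.0.0.9)

via_pipe=$(printf '%s\n' "$data" | sort | uniq)
via_flag=$(printf '%s\n' "$data" | sort -u)

if [ "$via_pipe" = "$via_flag" ]; then
    same=yes
else
    same=no
fi
echo "identical output: $same"
```

sort -u saves one process, but it cannot count. As soon as the number of occurrences is needed, the sort | uniq -c pipeline is still required.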

(5) Count the duplicates and sort by count:

[root@backup scripts]# sort uniq.txt | uniq -c | sort -r # -r belongs to the final sort, after uniq -c
3 10.0.0.8
2 10.0.0.9
2 10.0.0.7
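A plain sort -r compares whole lines as text. A hedged variant, assuming a POSIX sort and made-up sample data, sorts the count column numerically and breaks ties on the address column explicitly. LC_ALL=C is set only to keep the ordering predictable across systems.

```shell
#!/bin/sh
# LC_ALL=C keeps sort's byte-wise ordering predictable across systems.
LC_ALL=C
export LC_ALL

# Made-up sample: three addresses with 2, 3 and 2 occurrences.
data=$(printf '%s\n' 10.0.0.9 10.0.0.8 10.0.0.7 10.0.0.7 10.0.0.8 10.0.0.8 10.0.0.9)

# -k1,1nr: compare the count column numerically, largest first.
# -k2,2:   break ties on the address column, in ascending byte order.
# awk re-prints the fields to drop the padding uniq -c puts before the count.
ranked=$(printf '%s\n' "$data" | sort | uniq -c | sort -k1,1nr -k2,2 | awk '{print $1, $2}')

printf '%s\n' "$ranked"
```

With explicit keys the tie between the two addresses that occur twice is resolved in ascending order, so 10.0.0.7 is listed before 10.0.0.9.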

1) Test data:

[root@backup scripts]# vim www.txt
http://mp3.lc.com
http://post.lc.com
http://post.lc.com
http://www.lc.com
http://www.lc.com
http://www.lc.com

2) Sort first so that duplicate entries become adjacent, then count them:

[root@backup scripts]# awk -F "/" '{print $3}' www.txt | sort | uniq -c
1 mp3.lc.com
2 post.lc.com
3 www.lc.com

3) Rank by count:

# Method 1: awk

[root@backup scripts]# awk -F "/" '{print $3}' www.txt | sort | uniq -c | sort -rn | head -10
3 www.lc.com
2 post.lc.com
1 mp3.lc.com

# Method 2: cut

[root@backup scripts]# cut -d "/" -f3 www.txt | sort | uniq -c | sort -rn | head -10
3 www.lc.com
2 post.lc.com
1 mp3.lc.com
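Both methods above rely on sort | uniq -c to do the counting. As an alternative sketch, not the method used in this section, awk can count on its own with an associative array. The URL list below is hypothetical; paths were added to show that only the host part is counted, and the array name hits is arbitrary.

```shell
#!/bin/sh
# Hypothetical URL list; the paths differ, only the host should be counted.
urls=$(printf '%s\n' \
    http://www.lc.com/index.html \
    http://post.lc.com/1.html \
    http://www.lc.com/2.html \
    http://mp3.lc.com/ \
    http://post.lc.com/3.html \
    http://www.lc.com/)

# With "/" as the separator, field 3 is the host in "http://host/path".
# hits[] keeps one counter per host, so no sort is needed before counting;
# the final sort only orders the totals.
top=$(printf '%s\n' "$urls" |
    awk -F/ '{hits[$3]++} END {for (h in hits) print hits[h], h}' |
    sort -rn | head -10)

printf '%s\n' "$top"
```

This variant reads the input once and sorts only the per-host totals, which can matter on large logs. The sort | uniq -c form is shorter to type and easier to remember.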
