Usage: uniq [OPTION]... [INPUT [OUTPUT]]
Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output).
With no options, matching lines are merged to the first occurrence.
Mandatory arguments to long options are mandatory for short options too.
-c, --count           prefix each line with the number of occurrences
-d, --repeated        only print duplicate lines, one for each group
-D, --all-repeated[=delimit-method]  print all duplicate lines;
                      delimit-method={none(default),prepend,separate},
                      delimiting is done with blank lines
-f, --skip-fields=N   avoid comparing the first N fields
-i, --ignore-case     ignore differences in case when comparing
-s, --skip-chars=N    avoid comparing the first N characters
-u, --unique          only print lines that are not repeated
-z, --zero-terminated end lines with a NUL ('\0') byte, not a newline
-w, --check-chars=N   compare no more than the first N characters of each line
    --help            display this help and exit
    --version         output version information and exit
A field is a run of blanks (usually spaces and/or tabs) followed by non-blank characters; fields are skipped before characters.
Tip: uniq does not detect repeated lines unless they are adjacent.
You may want to sort the input first, or use sort -u instead of uniq altogether.
Examples:
With no arguments, uniq only merges adjacent identical lines:
[root@bqh-118 ~]# cat qc.log
192.168.43.117
192.168.43.119
192.168.43.118
192.168.43.118
192.168.43.117
192.168.43.117
192.168.43.119
192.168.43.110
[root@bqh-118 ~]# uniq qc.log
192.168.43.117
192.168.43.119
192.168.43.118
192.168.43.117
192.168.43.119
192.168.43.110
Sort the file so that duplicate lines become adjacent:
[root@bqh-118 ~]# sort qc.log
192.168.43.110
192.168.43.117
192.168.43.117
192.168.43.117
192.168.43.118
192.168.43.118
192.168.43.119
192.168.43.119
Now pipe the sorted output through uniq to remove the duplicates:
[root@bqh-118 ~]# sort qc.log |uniq
192.168.43.110
192.168.43.117
192.168.43.118
192.168.43.119
[root@bqh-118 ~]# sort -u qc.log
192.168.43.110
192.168.43.117
192.168.43.118
192.168.43.119
Of course, as the second command above shows, sort -u file achieves the same de-duplication in a single step.
Count the occurrences of each line (-c):
[root@bqh-118 ~]# sort qc.log |uniq -c
1 192.168.43.110
3 192.168.43.117
2 192.168.43.118
2 192.168.43.119
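To rank lines by frequency, a common follow-up is to sort the counted output numerically in reverse; a quick sketch:
sort qc.log | uniq -c | sort -rn
With this, the most frequent address, 192.168.43.117 with 3 occurrences, comes first.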
Print only the duplicated lines, one per group (-d):
[root@bqh-118 ~]# sort qc.log |uniq -d
192.168.43.117
192.168.43.118
192.168.43.119
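The opposite of -d is -u, which prints only the lines that occur exactly once. A minimal sketch on the same data:
sort qc.log | uniq -u    # prints only 192.168.43.110, the one non-repeated line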
Print every occurrence of the duplicated lines (-D):
[root@bqh-118 ~]# sort qc.log |uniq -D
192.168.43.117
192.168.43.117
192.168.43.117
192.168.43.118
192.168.43.118
192.168.43.119
192.168.43.119
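As the help text above notes, --all-repeated accepts a delimit method to keep the groups readable. A sketch using separate, which should print the same seven lines with a blank line between each group of duplicates:
sort qc.log | uniq --all-repeated=separate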
Remove duplicate entries case-insensitively (-i):
[root@bqh-118 ~]# cat qc1.log
apple
APple
BANAN
banan
grape
orange
Orange
bqh jyw
bqh1 jyw
[root@bqh-118 ~]# uniq -i qc1.log
apple
BANAN
grape
orange
bqh jyw
bqh1 jyw
Skip the first field when comparing (-f1). After the skip, every single-word line is empty, so all of them compare equal and collapse into the first one, apple:
[root@bqh-118 ~]# uniq -f1 qc1.log
apple
bqh jyw
bqh1 jyw
Skip the first character of each line when comparing (-s1). Here orange and Orange become identical after the skip and are merged:
[root@bqh-118 ~]# uniq -s1 qc1.log
apple
APple
BANAN
banan
grape
orange
bqh jyw
bqh1 jyw
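-w is the mirror image of -s: it compares no more than the first N characters. A sketch on the same file, comparing only the first character, where only bqh1 jyw should disappear because it matches bqh jyw on that character:
uniq -w1 qc1.log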
Case study: from the contents of qc2.log, extract the domain name of each URL, count how many times each one appears, and sort by the count.
[root@bqh-118 ~]# cat qc2.log
http://www.baidu.com
http://www.xiaobai.com
http://www.etiantian.org
http://www.jyw.com
http://www.jyw.com
http://www.xiaobai.com
http://www.etiantian.org
http://www.jyw.com
http://www.baidu.com
http://www.baidu.com
http://www.jyw.com
http://www.etiantian.org
Method one: awk method
[root@bqh-118 ~]# awk -F / '{print $3}' qc2.log|sort|uniq -c|sort -r
4 www.jyw.com
3 www.etiantian.org
3 www.baidu.com
2 www.xiaobai.com
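One caveat: sort -r compares the counted lines as plain text, which works here only because every count is a single digit. sort -rn compares the leading counts numerically and is the safer choice:
awk -F / '{print $3}' qc2.log | sort | uniq -c | sort -rn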
Method two: cut method
[root@bqh-118 ~]# cut -d / -f3 qc2.log |sort -r|uniq -c
2 www.xiaobai.com
4 www.jyw.com
3 www.etiantian.org
3 www.baidu.com
[root@bqh-118 ~]# cut -d / -f3 qc2.log |sort -r|uniq -c|sort -r
4 www.jyw.com
3 www.etiantian.org
3 www.baidu.com
2 www.xiaobai.com
Of course, there are other ways to do this; the above are just the common ones. One more is sketched below.
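For instance, awk can do the counting itself with an associative array, so uniq is not needed at all (a sketch, assuming any POSIX awk):
awk -F / '{count[$3]++} END {for (d in count) print count[d], d}' qc2.log | sort -rn
Here count[$3]++ tallies each domain as it is read, the END block prints every domain with its total, and sort -rn orders the result by frequency.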