This article introduces the uniq command. uniq belongs to the family of Linux commands that are typically used in pipelines, and its main function is to remove duplicate lines.
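Before introducing the uniq command, let's create the file /tmp/uniq.txt, which the following examples use (the commands below refer to it as 1.txt). Its contents are as follows:
alpha css web
cat linux command
error php function
hello world
onmpw web site
onmpw web site
wello web site
recruise page site
error php function
repeat no data
onmpw web site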
By default, uniq only detects duplicates that are adjacent to each other. In /tmp/uniq.txt, "onmpw web site" appears three times, but one occurrence is not adjacent to the other two, so uniq on its own removes only one of them. The same rule applies to "error php function", whose occurrences are not adjacent either.
Because of this behavior, uniq is normally used together with the sort command.
# sort 1.txt | uniq
alpha css web
cat linux command
error php function
hello world
onmpw web site
recruise page site
repeat no data
wello web site
Now all of the duplicates have been removed.
With that small warm-up done, let's briefly go through the options of the uniq command.
-c prefix each line with the number of times it occurs
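The adjacency rule described above can be checked with a minimal sketch. The three sample lines are taken from the article's data, and the variable names are only illustrative; LC_ALL=C is an assumption added here to make the sort order predictable.

```shell
# Three sample lines; the two "onmpw web site" lines are not adjacent.
sample=$(printf '%s\n' 'onmpw web site' 'hello world' 'onmpw web site')

# uniq alone keeps both copies, because "hello world" separates them.
unsorted=$(printf '%s\n' "$sample" | uniq)

# Sorting first makes the copies adjacent, so uniq can collapse them.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort | uniq)

printf 'uniq alone:\n%s\n\nsort | uniq:\n%s\n' "$unsorted" "$sorted"
```

The first result still has three lines, while the second has only two.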
sort 1.txt | uniq -c
1 alpha css web
1 cat linux command
2 error php function
1 hello world
3 onmpw web site
1 recruise page site
1 repeat no data
1 wello web site
We can see that "error php function" appears twice and "onmpw web site" appears three times. The remaining lines are not duplicated, so their count is 1.
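A common follow-up to -c, shown here only as a sketch, is to pipe the counts into sort -rn so that the most frequent line comes first. The sample lines are a subset of the article's data; the variable names and LC_ALL=C are assumptions made for this illustration.

```shell
# Count each distinct line, then order the result by count, highest first.
counts=$(printf '%s\n' 'onmpw web site' 'error php function' 'onmpw web site' \
    'hello world' 'error php function' 'onmpw web site' |
    LC_ALL=C sort | uniq -c | LC_ALL=C sort -rn)

# uniq -c pads the count with leading spaces; strip them for readability.
ranked=$(printf '%s\n' "$counts" | sed 's/^ *//')

printf '%s\n' "$ranked"
```

Here "onmpw web site" is listed first with a count of 3.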
-i ignore case
Add the line "Error PHP function" to 1.txt:
cat 1.txt
alpha css web
cat linux command
error php function
hello world
onmpw web site
onmpw web site
wello web site
Error PHP function
recruise page site
error php function
repeat no data
onmpw web site
sort 1.txt | uniq -c
1 alpha css web
1 cat linux command
2 error php function
1 Error PHP function
1 hello world
3 onmpw web site
1 recruise page site
1 repeat no data
1 wello web site
As the results show, uniq is case-sensitive by default. The -i option makes it ignore case:
sort 1.txt | uniq -c -i
1 alpha css web
1 cat linux command
3 error php function
1 hello world
3 onmpw web site
1 recruise page site
1 repeat no data
1 wello web site
Now case is ignored, and "Error PHP function" is counted together with "error php function".
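The effect of -i can be isolated with two lines that differ only in case. This is a small sketch; the variable names are illustrative.

```shell
# Two adjacent lines that differ only in letter case.
lines=$(printf '%s\n' 'error php function' 'Error PHP function')

# Without -i the lines are considered different, so both are kept.
case_sensitive=$(printf '%s\n' "$lines" | uniq)

# With -i they are treated as duplicates and only the first one is kept.
case_insensitive=$(printf '%s\n' "$lines" | uniq -i)

printf 'default:\n%s\n\nwith -i:\n%s\n' "$case_sensitive" "$case_insensitive"
```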
-u output only lines that are not duplicated
sort 1.txt | uniq -iu
alpha css web
cat linux command
hello world
recruise page site
repeat no data
wello web site
As expected, "error php function" and "onmpw web site" are not output.
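As a minimal sketch of -u on a smaller input (sample lines from the article's data, variable name illustrative), note that a repeated line is dropped entirely rather than reduced to one copy:

```shell
# "onmpw web site" occurs twice in a row; the other lines occur once.
only_unique=$(printf '%s\n' 'hello world' 'onmpw web site' 'onmpw web site' \
    'repeat no data' | uniq -u)

# Only the lines that had no adjacent duplicate are printed.
printf '%s\n' "$only_unique"
```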
-w N compare only the first N characters of each line when checking for duplicates.
sort 1.txt | uniq -iw 2
alpha css web
cat linux command
error php function
hello world
onmpw web site
recruise page site
wello web site
Here uniq compares only the first two characters. "recruise" and "repeat" both start with "re", so these two lines are treated as duplicates and only the first of them is kept.
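The following sketch shows the same effect on just the two lines involved. It assumes GNU uniq, since -w is a GNU extension; the variable names are illustrative.

```shell
# Both lines start with "re" but differ afterwards.
lines=$(printf '%s\n' 'recruise page site' 'repeat no data')

# Comparing whole lines: both are kept.
full_compare=$(printf '%s\n' "$lines" | uniq)

# Comparing only the first 2 characters: the second line counts as a duplicate.
prefix_compare=$(printf '%s\n' "$lines" | uniq -w 2)

printf 'full:\n%s\n\n-w 2:\n%s\n' "$full_compare" "$prefix_compare"
```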
-f N skip the first N fields and start comparing from field N + 1. Fields are separated by spaces or tabs.
sort 1.txt | uniq -icf 2
1 alpha css web
1 cat linux command
3 error php function
1 hello world
4 onmpw web site
1 repeat no data
1 wello web site
In the results, the first two fields are skipped and the comparison starts at the third field. "recruise page site" and "onmpw web site" have the same third field, so they are treated as the same data. However, "wello web site" matches "onmpw web site" not only in the third field but also in the second, so why is it not counted among the "onmpw web site" duplicates? The answer goes back to the earlier point: uniq only detects duplicates in adjacent lines, and "repeat no data" sits between them.
To solve this problem, we again need to adjust the sort order. Remember the -k option of the sort command? That is what we will use.
sort -k 2 1.txt | uniq -icf 2
1 alpha css web
1 cat linux command
1 repeat no data
1 recruise page site
3 error php function
4 onmpw web site
1 hello world
As we can see, "wello web site" is now counted together with the "onmpw web site" lines.
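The interaction between the sort key and -f can be sketched on three lines. This example uses sort -k 3 rather than -k 2, so that the sort key starts at the same field where uniq -f 2 begins comparing; that choice, the variable names, and LC_ALL=C are assumptions made for this illustration.

```shell
lines=$(printf '%s\n' 'wello web site' 'repeat no data' 'onmpw web site')

# Plain sort orders by the first field, which leaves "repeat no data"
# between the two "web site" lines, so uniq -f 2 cannot merge them.
plain=$(printf '%s\n' "$lines" | LC_ALL=C sort | uniq -cf 2 | sed 's/^ *//')

# Sorting on the third field puts lines with the same third field next
# to each other, so uniq -f 2 counts them as one group.
grouped=$(printf '%s\n' "$lines" | LC_ALL=C sort -k 3 | uniq -cf 2 | sed 's/^ *//')

printf 'sort:\n%s\n\nsort -k 3:\n%s\n' "$plain" "$grouped"
```

The sed step only strips the padding that uniq -c puts in front of each count.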
-s N skip the first N characters. No example is given here, because this option is used in almost the same way as -f N. The only difference is that -f N skips the first N fields, whereas -s N skips the first N characters.
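For completeness, here is a minimal sketch of -s. The numbered sample lines are hypothetical and are not part of the article's data file; each one starts with a digit and a space, which -s 2 skips before comparing.

```shell
# Hypothetical input: a one-digit prefix and a space precede the real text.
deduped=$(printf '%s\n' '1 onmpw web site' '2 onmpw web site' '3 hello world' |
    uniq -s 2)

# The first two lines are equal once their first 2 characters are skipped,
# so only the first of them is printed, followed by the third line.
printf '%s\n' "$deduped"
```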
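-d output only the duplicated lines, printing the first line of each group of duplicates.
sort 1.txt | uniq -idw 2
error php function
onmpw web site
recruise page site
Only these three lines are output. Why is "recruise page site" among them? Note the effect of -w 2: its first two characters match those of "repeat no data", so the two lines count as duplicates and the first of them is printed.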
-D output all duplicated lines
sort 1.txt | uniq -iDw 2
error php function
error php function
Error PHP function
onmpw web site
onmpw web site
onmpw web site
recruise page site
repeat no data
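The difference between -d and -D can be seen side by side in this small sketch. It assumes GNU uniq, since -D is a GNU extension; the variable names are illustrative.

```shell
# One unique line followed by a line that occurs twice.
lines=$(printf '%s\n' 'hello world' 'onmpw web site' 'onmpw web site')

# -d prints one copy of each duplicated line.
one_per_group=$(printf '%s\n' "$lines" | uniq -d)

# -D prints every copy of each duplicated line.
all_copies=$(printf '%s\n' "$lines" | uniq -D)

printf '%s\n' '-d:' "$one_per_group" '' '-D:' "$all_copies"
```

In both cases "hello world" is omitted because it has no duplicate.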
That covers all of the commonly used options of the uniq command. For more detailed information about uniq, run the command info uniq.
I hope this article is helpful.