In-depth understanding of awk '!a[$0]++' de-duplication

While reviewing shell scripts with my colleagues recently, I found that the de-duplication idiom awk '!a[$0]++' is not easy to understand at first glance. I have worked through it here, hoping to help everyone understand it:

  • "!" is logical NOT.
  • a[$0] uses the whole input line, $0, as the subscript (key) of the associative array a.
  • a[$0]++ increments that array element, equivalent to a[$0] += 1. Note the difference between a++ and ++a: a++ yields the current value first and then adds 1.
  • In awk, when the pattern evaluates to true (non-zero), the action is executed. Here the action is omitted, so the default action, print $0, runs.
  • An unassigned awk variable or array element is uninitialized: it acts as the empty string in string context and as 0 in numeric context. So the first time a[$0]++ is evaluated for a given line, it yields 0 and leaves 1 stored in the array.
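
The points above can be checked directly in awk. The following is a minimal sketch: the names x, post, pre, and the key "k" are made up for illustration, and a BEGIN block is used only so that no input file is needed.

```shell
# An unset awk variable is "" in string context and 0 in numeric context.
unset_demo=$(awk 'BEGIN { print "[" x "]", x + 0 }')
echo "$unset_demo"    # prints: [] 0

# Post-increment yields the old value; pre-increment yields the new one.
incr_demo=$(awk 'BEGIN { post = a["k"]++; pre = ++b["k"]; print post, pre, a["k"], b["k"] }')
echo "$incr_demo"     # prints: 0 1 1 1
```

Both arrays end up holding 1, but only the post-increment form hands back 0 on first use, which is what the idiom relies on.
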
[root@VM_39_7_centos ~]# a=0
[root@VM_39_7_centos ~]# echo  "$a"
0
[root@VM_39_7_centos ~]# echo  $((a++))
0
[root@VM_39_7_centos ~]# echo $a
1
[root@VM_39_7_centos ~]# awk '{print a[$0],!a[$0]++,a[$0],!a[$0],$0}' file
 1 1 0 111
 1 1 0 222
 1 1 0 555
 1 1 0 333
1 0 2 0 111
1 0 2 0 222
 1 1 0 444
2 0 3 0 222
1 0 2 0 555
[root@VM_39_7_centos ~]# more file
111
222
555
333
111
222
444
222
555
[root@VM_39_7_centos ~]#
[root@VM_39_7_centos ~]# awk '!a[$0]++' file
111
222
555
333
444
[root@VM_39_7_centos ~]# 

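Explanation: The first time a line is seen, a[$0]++ yields 0 (and then stores 1), so !a[$0]++ is 1 (true) and the line is printed. When the same line is encountered a second time, a[$0] is already 1, so a[$0]++ yields 1 (and stores 2), !a[$0]++ is 0 (false), and the line is not printed. The same holds for every later occurrence. In the five-column output above, the first column is blank for a first occurrence because the uninitialized a[$0] prints as an empty string.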
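
To make that evaluation order concrete, the one-liner can be spelled out in long form. This is a sketch of equivalent behavior, not a description of how awk implements it internally. The array name seen and the temporary file are illustrative choices, and the sample data matches the file used above.

```shell
# Recreate the sample input in a temporary file (path chosen by mktemp).
tmpfile=$(mktemp)
printf '%s\n' 111 222 555 333 111 222 444 222 555 > "$tmpfile"

# Long form: print a line only while its counter is still 0, then count it.
long_form=$(awk '{ if (seen[$0] == 0) print $0; seen[$0]++ }' "$tmpfile")

# The idiom itself, for comparison.
short_form=$(awk '!a[$0]++' "$tmpfile")

echo "$long_form"
rm -f "$tmpfile"
```

Both forms print 111, 222, 555, 333, 444 in order of first appearance. Note that the array keeps one entry per distinct line, so memory use grows with the number of unique lines.
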

Origin blog.csdn.net/qq_31555951/article/details/106616163