Use of awk command

awk command

format

awk [-F field-separator] 'commands' filename
The field separator defaults to whitespace (spaces or tabs).

To use a colon as the separator, you must specify the -F option, for example: awk -F: 'commands' filename

-F fs sets fs as the field separator for input lines; the default separator is whitespace (a run of spaces or tabs)

-v var=val assigns the value val to the variable var before the program starts running

The complete format of the awk command: awk [-F field-separator] 'matching rule {execute command}' filename

The whole program is enclosed in single quotation marks (''), and the execution command part is enclosed in braces ({})

When the awk program runs, if no execution command is specified, each matching line is printed by default; if no matching rule is specified, every line in the text is matched by default
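
The two defaults described above can be tried on a small inline sample. This is only a sketch: the name:uid data is made up for illustration and does not come from a real system.

```shell
# Hypothetical sample data in the form name:uid
data='root:0
alice:1000
bob:1001'

# Matching rule only, no command: matching lines are printed unchanged.
pattern_only=$(printf '%s\n' "$data" | awk -F: '$2 >= 1000')

# Command only, no matching rule: the command runs for every line.
action_only=$(printf '%s\n' "$data" | awk -F: '{print $1}')

printf '%s\n' "$pattern_only" "$action_only"
```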

Matching rules

The awk reserved word BEGIN is executed at the beginning of the awk program and before reading any data. The action after BEGIN is executed only once at the beginning of the program

The awk reserved word END is executed when the awk program has processed all data and is about to end. The action after END is executed only once at the end of the program

Relational operators:
> greater than
< less than
>= greater than or equal to
<= less than or equal to
== equal to. Tests whether two values are equal; to assign a value to a variable, use "="
!= not equal to
Match expressions:
~ (match) value ~ /regexp/ returns true if value matches /regexp/
!~ (not match) value !~ /regexp/ returns true if value does not match /regexp/
Regular expressions:
/regexp/ a regular expression written between the slashes is matched against the whole line; for example, /root/ matches lines containing root
Logical operators:
&& logical AND
|| logical OR

[root@haha ~]# awk -F: '$7~ /bash$/ {print $1}' /etc/passwd
root
haha
## With : as the separator, read /etc/passwd; if the seventh field matches a string ending in bash, print the first field of that line
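
Because the contents of /etc/passwd differ from host to host, the matching operators can also be tried on a fixed sample. The three passwd-style lines below are invented for illustration.

```shell
# Hypothetical passwd-style sample, so the result does not depend on the host.
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
haha:x:1000:1000::/home/haha:/bin/bash'

# Field 7 ends in "bash" AND the UID (field 3) is not 0.
login_users=$(printf '%s\n' "$passwd_sample" | awk -F: '$7 ~ /bash$/ && $3 != 0 {print $1}')

# Field 7 does NOT end in "bash".
nologin_users=$(printf '%s\n' "$passwd_sample" | awk -F: '$7 !~ /bash$/ {print $1}')

echo "$login_users"
echo "$nologin_users"
```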

Data field variable

$0 represents the entire text line
$1 represents the first data field in the text line
$2 represents the second data field in the text line
$n represents the nth data field in the text line

Display the first and sixth fields of each line in the passwd file
awk -F: '{print $1"\t"$6}' /etc/passwd

Executing commands

The commands that awk executes are specified in braces {}. Most actions print something (i.e. the print statement)
Actions:
formatted output (print);
flow control statements (if, while, for, etc.);

BEGIN and END in awk

awk [options] 'BEGIN{ print "start" } matching rule {commands} END{ print "end" }' filename

BEGIN:
BEGIN runs at the start of the awk program, before any data is read. Once awk starts reading data from the file, the BEGIN condition no longer holds, so the action attached to BEGIN is executed exactly once. The BEGIN block is typically used to initialize variables and print a header

[root@haha ~]# awk -F: 'BEGIN {print "username and loginshell"} $3>=500 {print $1 "\t" $7}' /etc/passwd
# Two actions are defined here
   # The first action uses the BEGIN condition, so it prints "username and loginshell" before any file data is read (it runs only once)
   # The second action prints the first and seventh fields of each line that satisfies the condition

END:
END is also an awk reserved word, and it is the counterpart of BEGIN. END runs after the awk program has processed all the data and is about to exit. The action attached to END is executed only once, at the end of the program.

[root@haha ~]# awk -F: 'BEGIN {print "print username and loginshell"} $3>=500 {print $1 "\t" $7} END {print "The end"}' /etc/passwd
## "The end" is printed after the output; it is not part of the file itself, and it is printed only once
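
As a rough sketch of how the three parts cooperate, the following runs a BEGIN block, a per-line rule, and an END block over an invented passwd-style sample. The counter n and the sample lines are assumptions made for this example.

```shell
# Hypothetical passwd-style sample.
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
haha:x:1000:1000::/home/haha:/bin/bash'

report=$(printf '%s\n' "$passwd_sample" | awk -F: '
    BEGIN     { print "username\tloginshell" }  # runs once, before any input
    $3 >= 500 { print $1 "\t" $7; n++ }         # runs for each matching line
    END       { print "matched: " n }           # runs once, after all input
')

printf '%s\n' "$report"
```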

awk variables

awk custom variables

awk -v varname=value (variable names are case sensitive)

[root@haha ~]# awk -v var="count users" -v count=0 -F: 'BEGIN{print var} {count++} END{print "user count is:",count}' /etc/passwd

Variables can also be defined directly in the program. Separate the assignments with semicolons; print the variables as before, separated by commas, without dollar signs (here they are defined in BEGIN)

[root@haha ~]# awk 'BEGIN{name="haha";age="26";sex="man";print name,age,sex}'
haha 26 man

Command line definition

[root@haha ~]# var="ha ha"
[root@haha ~]# awk -v var1="$var" 'BEGIN{print var1}'
ha ha
## A shell variable that is not passed into awk (here with -v) cannot be printed by the awk program
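
The difference between passing a shell variable with -v and not passing it can be checked directly. In this sketch, greeting and msg are illustrative names only.

```shell
greeting="ha ha"   # an ordinary shell variable (hypothetical value)

# Passed in with -v: awk sees the value as the awk variable msg.
with_v=$(awk -v msg="$greeting" 'BEGIN { print msg }')

# Not passed in: inside awk, greeting is an unset awk variable, so it is empty.
without_v=$(awk 'BEGIN { print "[" greeting "]" }')

echo "$with_v"
echo "$without_v"
```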

Awk built-in variables (predefined variables)

$n The nth field of the current record (current line). For example, $1 is the first field and $2 is the second field.
$0 The full text of the current line.
FILENAME The name of the current input file
FS The field separator (the default is whitespace)
NF The number of fields in the current line
NR The number of records read so far, i.e. the current line number
FNR The line number counted separately within each file

[root@haha ~]# cat test.txt
haha 25 man
hehe 25 man
heihei 25 man
[root@haha ~]# cat test1.txt
xixi 25 man
hengheng 25 man

NF: print the number of columns in each row:

[root@haha ~]# awk '{print NF}' test.txt
3
3
3

NF: reference the first and last columns

[root@haha ~]# awk '{print $(NF-2),$NF}' test.txt
haha man
hehe man
heihei man

NR: number each line

[root@haha ~]# awk '{print NR")"$0}' test.txt
1)haha 25 man
2)hehe 25 man
3)heihei 25 man

FNR: awk can scan several files in one run; NR numbers the lines continuously across all files, while FNR restarts the numbering for each file

[root@haha ~]# awk '{print NR")"$0}' test.txt test1.txt
1)haha 25 man
2)hehe 25 man
3)heihei 25 man
4)xixi 25 man
5)hengheng 25 man
[root@haha ~]# awk '{print FNR")"$0}' test.txt test1.txt
1)haha 25 man
2)hehe 25 man
3)heihei 25 man
1)xixi 25 man
2)hengheng 25 man
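
To see FILENAME, NR, and FNR side by side, a sketch like the following can be run. The file names a.txt and b.txt are throwaway names created in a temporary directory for this example.

```shell
# Create two small throwaway input files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'haha 25 man\nhehe 25 man\n' > "$tmpdir/a.txt"
printf 'xixi 25 man\n' > "$tmpdir/b.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
numbered=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR, $1}' a.txt b.txt)

echo "$numbered"
rm -rf "$tmpdir"
```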

FS: field separator
Common forms: -F: -F, -F"[ /,]"

[root@haha ~]# echo "a b c d" | awk '{print $2}'
b
[root@haha ~]# echo "a,b,c,d" | awk -F, '{print $2}'
b
[root@haha ~]# echo "a,b,c,d" | awk -F"," '{print $2}'
b
[root@haha ~]# echo "a b,c%d" | awk -F"[ ,%]" '{print $2}'
b
[root@haha ~]# echo "a   b,,c%%%d" | awk -F"[ ,%]+" '{print $2}'
b
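
The effect of the trailing + in a separator pattern is easiest to see by counting fields. The sketch below also sets FS inside a BEGIN block, which is an alternative to -F that the examples above do not use.

```shell
line='a   b,,c%%%d'

# Without "+", every single separator character splits, so empty fields appear.
single=$(echo "$line" | awk -F'[ ,%]' '{print NF}')

# With "+", a run of separator characters counts as one separator.
runs=$(echo "$line" | awk -F'[ ,%]+' '{print NF}')

# Setting FS in a BEGIN block has the same effect as passing -F.
third=$(echo "$line" | awk 'BEGIN { FS = "[ ,%]+" } {print $3}')

echo "$single $runs $third"
```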

Example:

[root@haha ~]# ls -l | awk 'BEGIN{size=0}{size+=$5}END{print "Total is:"size/1024/1024"MB"}'
## Total the bytes used by the files in a directory and show the result in MB

[root@haha ~]# ls -l /etc/ | awk 'BEGIN{sum=0}/^[^d]/{print $9  "\t"  $5;sum+=$5}END{print "Total size:"sum/1024/1024"MB"}'
## Show the size of each file in a directory and the total, excluding subdirectories

[root@haha ~]# ifconfig ens33 | grep "inet\>" | awk '{print $2}'
192.168.13.14
## Extract the IP address

[root@haha ~]# echo "haha heihei hehe xixi" | awk '{print $3}'
hehe
[root@haha ~]# echo "haha heihei hehe xixi" | awk '{print $(NF-1)}'
hehe

Relational operations

[root@haha ~]# awk -F: '$3<1000{print $1"<======>"$NF}' /etc/passwd
## Print the username and login shell of every user in the passwd file whose UID is less than 1000

[root@haha ~]# awk -F: '$3>=1000 && $NF=="/bin/bash"{print $1,$NF}' /etc/passwd
## Print the ordinary users on the system that can log in normally

[root@haha ~]# free -m | grep -i "mem" | awk '{printf "%.2f\n",$3/$2*100}'
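## Compute the current memory usage as a percentage
## printf prints its output in the specified format.
## printf format: printf "format", items to print
## %.2f is the format: .2 keeps two digits after the decimal point, and f stands for a floating-point number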
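
A small sketch of printf formats on fixed input, so the result does not depend on the output of free. The numbers and the %-8s / %5d column widths are chosen only for illustration.

```shell
# Hypothetical figures: 3 units used out of 8.
pct=$(echo "8 3" | awk '{printf "%.2f\n", $2/$1*100}')

# %-8s left-aligns a string in 8 columns; %5d right-aligns an integer in 5 columns.
row=$(echo "haha 25" | awk '{printf "%-8s|%5d|\n", $1, $2}')

echo "$pct"
echo "$row"
```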

awk control statement

if statement

if (condition) print
if(condition){print}else{print}
if(condition){print}else if(condition){print}else{print}
## "else if" plays the role of elif in other languages
When to use if: in awk, to make a conditional judgment on an entire line or on a field extracted from it

[root@haha ~]# awk -F: '{if($3<1000) {print "system_user="$1} else {print "ordinary_user="$1}}' /etc/passwd
## If the UID is less than 1000, print system_user=$1; otherwise print ordinary_user=$1

[root@haha ~]# awk -F: '{if($3==0){print "Administrator:"$1}else if($3<1000){print "system_user:"$1}else{print "ordinary_user:"$1}}' /etc/passwd
## If the UID is 0, print the administrator (root); if the UID is less than 1000, print a system user; otherwise print an ordinary user

[root@haha ~]# df -hT | awk -F"[ %]+" '/^\/dev/{if($6>25){print $1"\t"$6"%"}}'
## Check partition usage and print the partitions that are more than 25% full
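
The same three-way classification, written across several lines and run on an invented name:x:uid sample, might look like this. The variable kind is introduced only for this sketch.

```shell
labels=$(printf '%s\n' 'root:x:0' 'daemon:x:2' 'haha:x:1000' | awk -F: '{
    if ($3 == 0)        { kind = "Administrator" }
    else if ($3 < 1000) { kind = "system_user" }
    else                { kind = "ordinary_user" }
    print kind ":" $1
}')

echo "$labels"
```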

while loop

Statement: while (condition) {statement} The loop body runs while the condition is true, and the loop exits when the condition becomes false.
When accumulating totals you can use the += operator, which adds the value on its right to the variable on its left and stores the result in that variable; put the field to be summed on the right, for example total+=$3
When to use while: in awk, to process the fields of a line one by one

[root@haha ~]# awk '{total=total+$5}END{print total}' test.txt
[root@haha ~]# awk '{total+=$5}END{print total}' test.txt
## total = total + the value in the fifth column (the sum of a column)
## END prints the total

[root@haha ~]# awk '{total=0;i=1;while(i<=NF){total+=$i;i++}; print total}' test.txt
## Set total to 0 and i to 1; while i is less than or equal to the number of fields, add field i to total and increment i; when the loop ends, print the total (the sum of a row)

[root@haha ~]# awk 'NR>1{total=0;i=5;while(i<=NF-2){total+=$i;i++};print $1,total}' test.txt
## Matching rule NR>1; commands: initialize total and i; while i is less than or equal to NF-2, add the data in column i to total; then print the first field and the total
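
Since the sample test.txt shown earlier has only one numeric column, a purely numeric inline input makes the row-sum loop easier to follow. The input values here are invented.

```shell
row_sums=$(printf '1 2 3\n10 20 30 40\n' | awk '{
    total = 0
    i = 1
    while (i <= NF) {   # visit fields $1 .. $NF in turn
        total += $i
        i++
    }
    print total
}')

echo "$row_sums"
```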

for loop

Syntax: for (variable assignment; condition; iteration) {command}
Syntax: for (var in array) {statement}

[root@haha ~]# awk '{total=0;for(i=1;i<=NF;i++){total+=$i}; print total}' test.txt
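## Set total to 0 and i to 1; while i is less than or equal to the number of fields, add the data in field i to total and increment i; after the for loop ends, print the total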
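
A for loop can do more than sum. As one possible illustration, this sketch finds the largest field on each line; the input numbers and the variable max are made up for the example.

```shell
row_max=$(printf '3 9 4\n7 2 5\n' | awk '{
    max = $1
    for (i = 2; i <= NF; i++) {   # compare every remaining field with the current maximum
        if ($i > max) max = $i
    }
    print max
}')

echo "$row_max"
```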

Odd and even lines

Print odd or even lines

awk 'NR%2==0{print NR,$0}' /root/test.txt
If the line number modulo 2 leaves a remainder of 0, the line is even-numbered.
If the line number modulo 2 leaves a remainder of 1, the line is odd-numbered.
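
Both cases can be checked on a three-line inline input instead of /root/test.txt, which may not exist on every machine. The sample lines are hypothetical.

```shell
lines=$(printf 'haha\nhehe\nheihei\n')

# Remainder 0: even-numbered lines.
even=$(printf '%s\n' "$lines" | awk 'NR % 2 == 0 {print NR, $0}')

# Remainder 1: odd-numbered lines.
odd=$(printf '%s\n' "$lines" | awk 'NR % 2 == 1 {print NR, $0}')

echo "$even"
echo "$odd"
```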

Array

An array is a collection of elements.
Format:
zex[1]="haha"
zex[2]="heihei"

zex: the name of the array
[1], [2]: the subscripts of the array elements, which can be read as the first element and the second element of the array
"haha", "heihei": the element contents

Array variable assignment format

var[index]=element
Description:
var ====> the name of the array
index ====> the subscript of the associative array
element ====> the element value of the array

Numbers as array subscripts:
array[1]="haha"
array[2]="heihei"

Strings as array subscripts:
array["first"]="www"
array["last"]="name"
To reference an array element you must give its subscript; the subscript is what retrieves the corresponding element value

awk writing format

[root@haha ~]# awk 'BEGIN{
> test["a"]="haha"
> test["b"]="heihei"
> test["c"]="hehe"
> print test["b"]
> }'
Result:
heihei
It can also be written as:
# awk 'BEGIN{test["a"]="haha";test["b"]="heihei";test["c"]="hehe";print test["b"]}'

Traverse the elements in the array

for(var in array){print array[var]}

[root@haha ~]# awk 'BEGIN{
> test["a"]="haha"
> test["b"]="hehe"
> for(i in test){
> print "index:"i,"----------value:"test[i]
> }
> }'
index:a ----------value:haha
index:b ----------value:hehe
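
One point worth knowing when traversing: awk does not guarantee the order in which for (i in array) visits the subscripts. A common workaround, sketched here with made-up data, is to pipe the output through sort.

```shell
pairs=$(awk 'BEGIN {
    test["a"] = "haha"
    test["b"] = "hehe"
    test["c"] = "heihei"
    for (i in test) print i "=" test[i]   # visiting order is unspecified
}' | sort)                                # sort makes the output order predictable

echo "$pairs"
```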

Delete array elements

Use delete array[subscript] inside the braces to remove an element

[root@haha ~]# awk 'BEGIN{test["a"]="haha";test["b"]="hehe";test["c"]="heihei";for(i in test){print i,test[i]};print "+++++++++++++++";delete test["b"];for(i in test){print i,test[i]}}'
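
To confirm that an element is really gone, the in operator can test for a subscript without creating it. This is a minimal sketch; the names n, k, and state are illustrative.

```shell
result=$(awk 'BEGIN {
    test["a"] = "haha"; test["b"] = "hehe"
    delete test["b"]

    n = 0
    for (k in test) n++   # count the elements that remain

    if ("b" in test) state = "present"; else state = "deleted"
    print "b is " state ", remaining elements: " n
}')

echo "$result"
```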

Array application

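array["index"]++ adds 1 to the element stored under this index each time it runs (the initial value defaults to 0)

[root@haha ~]# netstat -antpu | awk '/^tcp\>/{test[$6]++}END{for(i in test){print i,test[i]}}'
## Count how many times each TCP connection state (such as LISTEN and ESTABLISHED) appears

[root@haha ~]# awk -F"[/]+" '{print $2}' test.txt | sort | uniq -c
## Count how many times each string appears

[root@haha ~]# awk '{test[$1]++}END{for(i in test){print i,test[i]}}' test.txt
## Count how many times each string appears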
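
The counting idiom can be tried without netstat by feeding a fixed list of values. The connection-state names below are sample input only, and the array name seen is chosen for this sketch.

```shell
counts=$(printf '%s\n' LISTEN ESTABLISHED LISTEN TIME_WAIT LISTEN | awk '
    { seen[$1]++ }                           # one counter per distinct value
    END { for (s in seen) print s, seen[s] }
' | sort)

echo "$counts"
```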

Origin blog.csdn.net/weixin_52441468/article/details/112642630