The Three Musketeers of Shell System Programming - AWK

Table of contents

One: Introduction to AWK

1. Introduction to AWK tools

2. The basic format of AWK

3. Working principle of AWK

4. Common built-in variables (can be used directly)

Two: AWK example

1. Output text by line

2. Output text by field

3. Invoke Shell commands through pipes and double quotes

4. date command output time

5. View the memory usage ratio

6. Check the CPU usage ratio

7. Get odd and even rows

8. OFS output separator

9. Array

10. Duplicate check

11. Filter password failed


One: Introduction to AWK

1. Introduction to AWK tools

● AWK is a programming language designed for text processing and a powerful text analysis tool. It processes input line by line and is typically used for scanning, filtering, and statistical summaries.

● Complex text operations can be carried out non-interactively; the data can come from standard input, a pipe, or files.

● Whereas sed usually acts on a whole line, awk tends to split each line into fields and process those, which makes it well suited to small, column-oriented text data.

2. The basic format of AWK

1. awk [options] 'pattern {action}' file1 file2 ...
2. awk -f script-file file1 file2 ...
Format: the awk keyword, options, the command part '{xxx}', file name(s)
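
The two invocation forms can be compared side by side. The following is a minimal sketch; the input file sample.txt and the script file first_field.awk are made-up names, created in a temporary directory only for this illustration.

```shell
# Work in a throwaway directory so no real files are touched
tmpdir=$(mktemp -d)
printf 'alpha 1\nbeta 2\ngamma 3\n' > "$tmpdir/sample.txt"    # hypothetical input

# Form 1: the program is given inline, inside single quotes
inline_out=$(awk 'NR>=2 {print $1}' "$tmpdir/sample.txt")

# Form 2: the same program is stored in a script file and loaded with -f
printf 'NR>=2 {print $1}\n' > "$tmpdir/first_field.awk"       # hypothetical script
script_out=$(awk -f "$tmpdir/first_field.awk" "$tmpdir/sample.txt")

echo "$inline_out"
echo "$script_out"
rm -rf "$tmpdir"
```

Both forms run the same program, so they should print the same lines. The -f form tends to be easier to maintain once a program grows past one line.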

3. Working principle of AWK

● As mentioned earlier, sed is usually used to process whole lines, while awk splits each line into multiple "fields" and then processes them. The default field separator is a space or a tab. awk displays field data with the print statement.

● awk expressions can use the logical operators "&&" (and), "||" (or), and "!" (not). Simple arithmetic is also available: +, -, *, /, %, and ^ stand for addition, subtraction, multiplication, division, remainder, and exponentiation, respectively.

● The awk program is enclosed in single quotes, and the action to perform on the data goes inside curly braces { }. awk can process the files named after the program, and it can also read the standard output of a preceding command through a pipe.
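
The points above (default field splitting, logical operators, arithmetic) can be tried on a few lines of input. This is a small sketch; the three-column sample data is invented for the example.

```shell
# Hypothetical sample data: name, quantity, unit price
data='apple 3 2
pear 10 4
plum 7 5'

# Fields are split on whitespace by default; && combines two conditions,
# and * multiplies the second and third fields
totals=$(printf '%s\n' "$data" | awk '$2 > 2 && $3 >= 4 {print $1, $2 * $3}')
echo "$totals"

# % gives the remainder and ^ raises to a power
math=$(awk 'BEGIN {print 7 % 3, 2 ^ 10}')
echo "$math"
```

Only the pear and plum lines satisfy both conditions, so only those two totals are printed.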

4. Common built-in variables (can be used directly)

Variable    Meaning
FS          Field (column) separator for each line of text; defaults to a space or tab. Has the same effect as the "-F" option.
NF          Number of fields in the line currently being processed.
NR          Line number (ordinal) of the line currently being processed.
$0          Entire content of the line currently being processed.
$1          Content of the first column.
$2          Content of the second column.
$n          The nth field of the line currently being processed.
FILENAME    Name of the file being processed.
RS          Record (line) separator; defaults to "\n".

Note: When awk reads data, it cuts the input into records according to RS and processes one record at a time. The default value of RS is "\n".
In short: RS separates the data records, and with the default "\n" each line is one record.
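
To see these variables in action, here is a brief sketch. The file demo.txt and the semicolon-separated input are made up for the example.

```shell
tmpdir=$(mktemp -d)
printf 'a b c\nd e\n' > "$tmpdir/demo.txt"    # hypothetical two-line input

# NR is the record number, NF the field count, $NF the last field
per_line=$(cd "$tmpdir" && awk '{print FILENAME ": line " NR " has " NF " fields, last=" $NF}' demo.txt)
echo "$per_line"

# Changing RS changes what counts as a record: here each ";" ends one
records=$(printf 'x;y;z' | awk 'BEGIN {RS=";"} {print NR, $0}')
echo "$records"
rm -rf "$tmpdir"
```

Note that when the input ends with a newline and RS is not "\n", that newline stays inside the last record, so a blank line can follow the last entry in the output.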

Two: AWK example

1. Output text by line

# Print all content
[root@localhost ~]# awk '{print}' a
[root@localhost ~]# awk '{print $0}' a

# Print lines 1 to 3
[root@localhost ~]# awk 'NR==1,NR==3{print}' a
[root@localhost ~]# awk '(NR>=1)&&(NR<=3){print}' a

# Print line 1 or line 3
[root@localhost ~]# awk 'NR==1||NR==3{print}' a

# Print lines 4 to 8, or line 10
[root@localhost ~]# awk '(NR>=4&&NR<=8)||NR==10 {print $0}' a
four
five
six
seven
eight
ten

[root@localhost ~]# awk '(NR%2)==1{print}' a    # print all odd-numbered lines
[root@localhost ~]# awk '(NR%2)==0{print}' a    # print all even-numbered lines

# Output combined with regular expressions
[root@localhost ~]# awk '/^root/{print}' /etc/passwd        # print lines that start with root
[root@localhost ~]# awk '/nologin$/{print}' /etc/passwd     # print lines that end with nologin
[root@localhost ~]# awk 'NR!=10 {print $0}' a               # print every line except line 10

# Count the lines that end with nologin
[root@localhost ~]# grep -c "nologin$" /etc/passwd
[root@localhost ~]# awk '/nologin$/ {print $0}' /etc/passwd | wc -l
[root@localhost ~]# awk 'BEGIN {x=0}; /nologin$/ {x++}; END {print x}' /etc/passwd
[root@localhost ~]# awk 'BEGIN {x=0}; /nologin$/ {x++; print x, $0}; END {print x}' /etc/passwd

Note: Actions in a BEGIN block run before awk starts processing the specified text; awk then processes the text, and finally runs the actions in the END block. The END {} block usually holds statements such as printing the final result.
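
The BEGIN / main / END order can be illustrated with a running total. This is a minimal sketch; the three input numbers are invented.

```shell
# Hypothetical input: one number per line
summary=$(printf '10\n20\n30\n' | awk '
    BEGIN { sum = 0 }      # runs once, before any input is read
    { sum += $1 }          # runs for every input line
    END { print "lines=" NR, "sum=" sum, "avg=" sum / NR }   # runs once, after the last line
')
echo "$summary"
```

Inside END, NR still holds the number of the last record read, which is why it can serve as the line count here.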

2. Output text by field

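# Print the first field of lines that start with root
[root@localhost ~]# awk -F: '/^root/ {print $1}' /etc/passwd

# Print the first and third fields of lines that start with root
[root@localhost ~]# awk -F: '/^root/ {print $1,$3}' /etc/passwd

# Print the first, third, and last fields of lines that start with root
[root@localhost ~]# awk -F: '/^root/ {print $1,$3,$NF}' /etc/passwd

# Print lines whose third field is not less than 200
[root@localhost ~]# awk -F ":" '!($3<200){print}' /etc/passwd

# Use a colon as the separator and print lines whose third field is >= 1000. The BEGIN block runs first, then the text is processed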
[root@localhost ~]# awk 'BEGIN {FS=":"};{if($3>=1000){print}}' /etc/passwd

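# ! negates a condition
[root@localhost ~]# awk -F: '!($3>8) {print $3,$1}' /etc/passwd
[root@localhost ~]# awk 'BEGIN {FS=":"}; !($3>8) {print $3,$1}' /etc/passwd

# ($3>=$4)?$3:$4 is the ternary operator: if the third field is greater than or equal to the fourth field, the third field is assigned to max; otherwise the fourth field is assigned to max
[root@localhost ~]# awk -F ":" '{max=($3>=$4)?$3:$4; print max}' /etc/passwd

# Print the first field of colon-separated lines whose seventh field contains /bash; ~ means "matches" (contains)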
[root@localhost ~]# awk -F ":" '$7~"/bash"{print $1}' /etc/passwd

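# Print the first and second fields of lines whose first field contains root and that have 7 fields
[root@localhost ~]# awk -F ":" '($1~"root")&&(NF==7){print $1,$2}' /etc/passwd

# Print all lines whose seventh field is neither /bin/bash nor /sbin/nologin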
[root@localhost ~]# awk -F ":" '($7!="/bin/bash")&&($7!="/sbin/nologin"){print}' /etc/passwd
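
The field examples above depend on the contents of the local /etc/passwd. For experimenting without it, the same techniques can be run against a few passwd-style lines. This is a sketch; the three records below are invented.

```shell
# Hypothetical passwd-style records (name:password:UID:GID:comment:home:shell)
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
alice:x:1000:1001:Alice:/home/alice:/bin/bash'

# ~ tests a field against a regular expression; \/ is a literal slash
bash_users=$(printf '%s\n' "$sample" | awk -F: '$7 ~ /\/bash$/ {print $1}')
echo "$bash_users"

# Ternary operator: pick the larger of UID ($3) and GID ($4)
larger=$(printf '%s\n' "$sample" | awk -F: '{max = ($3 >= $4) ? $3 : $4; print $1, max}')
echo "$larger"
```

Anchoring the pattern with $ restricts the match to shells that end in /bash, which is slightly stricter than the plain "contains" test.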

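3. Invoke Shell commands through pipes and double quotes

# Print the line number and content
[root@localhost ~]# awk '{print NR,$0}' a

# Print the line number and content, then the total number of lines at the end
[root@localhost ~]# awk '{print NR,$0};END{print NR}' a

# Use a colon as the record separator and print each PATH entry with its number
[root@localhost ~]# echo $PATH | awk 'BEGIN{RS=":"}; {print NR,$0}'

# Call the w command and count the logged-in users (w prints two header lines, hence n-2)
[root@localhost ~]# awk 'BEGIN {n=0 ; while ("w" | getline) n++ ; {print n-2}}'

# Query the hostname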
[root@localhost ~]# awk 'BEGIN {"hostname" | getline ; {print $0}}'
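
A slightly more defensive way to write the command-pipe pattern is sketched below. The echo commands stand in for real commands such as w or hostname so that the output is predictable; the variable names cmd and line are arbitrary.

```shell
# "cmd" | getline var stores one line of the command output in var
# and leaves $0 untouched; close() releases the pipe afterwards
greeting=$(awk 'BEGIN {
    cmd = "echo hello from the shell"
    cmd | getline line
    close(cmd)
    print "got: " line
}')
echo "$greeting"

# getline returns 1 for a line, 0 at end of input, and -1 on error,
# so testing "> 0" avoids an endless loop if the command cannot be read
count=$(awk 'BEGIN {
    cmd = "echo a; echo b; echo c"
    n = 0
    while ((cmd | getline line) > 0) n++
    close(cmd)
    print n
}')
echo "$count"
```

Reading into a named variable rather than $0 keeps the current record intact, which matters once the same pattern is used outside a BEGIN block.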

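4. date command output time

[root@localhost ~]# date -d "1 month" +"%Y/%m/%d"          # today's date next month
[root@localhost ~]# date -d "1 month" +"%Y/%m/01"          # first day of next month
[root@localhost ~]# date +"%Y/%m/01"                       # first day of the current month
[root@localhost ~]# date -d "1 month ago" +"%Y/%m/%d"      # today's date one month ago
[root@localhost ~]# date -d "1 day ago" +"%Y/%m/%d"        # 1 day ago
[root@localhost ~]# date -d "-1 day " +"%Y/%m/%d"          # 1 day ago
[root@localhost ~]# date -d "$(date +%Y%m01) -1 day" +%Y/%m/%d    # last day of the previous month
[root@localhost ~]# date -d "$(date -d "1 month" +%Y%m01) -1 day" +%Y/%m/%d  # last day of the current month

[root@localhost ~]# date -d "$(cat /proc/uptime | awk -F. '{print $1}') second ago" +"%Y%m%d %H:%M:%S"
# Shows when the system was last booted (the moment that uptime counts from). "second ago" gives the time that many seconds in the past; +"%F %H:%M:%S" is equivalent to the +"%Y-%m-%d %H:%M:%S" format
# /proc/uptime: the first column is the time since the system started (in seconds); the second column is the time the system has been idle (in seconds)
[root@localhost ~]# date -d "$(date -d "1 month" +"%Y%m01") -3 day" +"%Y%m%d"    # third-to-last day of the current month
[root@localhost ~]# date +"%Y%m01"                         # first day of the current month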
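
One caveat with "1 month" arithmetic: GNU date adds the month first and then normalizes the day, so near the end of a long month the result can spill into the month after. The sketch below assumes GNU date (the -d option) and uses a fixed, hypothetical "today" of 2024-01-31 so that the effect is reproducible.

```shell
export TZ=UTC0                      # pin the time zone so the sketch is reproducible
base=2024-01-31                     # hypothetical "today", chosen to trigger the overflow

# Naive form: January 31 plus one month lands in March, because February 31 does not exist
naive=$(date -d "$base +1 month" +%Y/%m/%d)

# Safer form: move to the first day of the month, then add the month
first_this=$(date -d "$base" +%Y-%m-01)
first_next=$(date -d "$first_this +1 month" +%Y/%m/%d)
last_this=$(date -d "$first_this +1 month -1 day" +%Y/%m/%d)

echo "naive: $naive, first of next month: $first_next, last of this month: $last_this"
```

Anchoring on the first of the month before adding "1 month" sidesteps the overflow, at the cost of one extra date call.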

5. View the memory usage ratio

# Percentage of memory in use
[root@localhost ~]# free -m | awk '/Mem/ {print $3/$2 * 100"%"}'

# Percentage of memory free (total minus used)
[root@localhost ~]# free -m | awk '/Mem/ {print ($2-$3)/$2 * 100"%"}'
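
The raw division above prints as many decimals as awk's default number format produces. A rounded variant using printf is sketched below. To keep it reproducible, it runs on a made-up sample in the column layout of free -m; on a real system the same awk programs would read from free -m directly.

```shell
# Made-up sample in the layout printed by free -m (values are illustrative only)
sample='              total        used        free      shared  buff/cache   available
Mem:           8000        3000        1000         200        4000        4500
Swap:          2047           0        2047'

# printf with %.1f keeps one decimal place; %% prints a literal percent sign
used_pct=$(printf '%s\n' "$sample" | awk '/^Mem:/ {printf "%.1f%%\n", $3 / $2 * 100}')
free_pct=$(printf '%s\n' "$sample" | awk '/^Mem:/ {printf "%.1f%%\n", ($2 - $3) / $2 * 100}')
echo "used: $used_pct, not used: $free_pct"
```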

6. Check the CPU usage ratio

# Percentage of CPU in use
[root@localhost ~]# top -b -n1 | awk -F, '/%Cpu/ {print $4}' | awk '{print 100-$1"%"}'

7. Get odd and even rows

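[root@localhost ~]# seq 10 | awk '{getline; print $0}'    # get the even-numbered lines
[root@localhost ~]# seq 10 | awk '{print $0; getline}'    # get the odd-numbered lines
When getline has no redirection operator "<" or "|" next to it, awk first reads the first line, which is 1; getline then fetches the next line, which is 2. Because getline updates the internal variables NF, NR, FNR, and $0, the value of $0 is no longer 1 but 2, and that is what gets printed.
When getline is used with a redirection operator "<" or "|", it reads from the redirected input file instead. That file has only just been opened and awk has not read any line from it yet, so getline returns its first line rather than every other line.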
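
The difference between the two getline forms can be observed directly. This is a sketch; the side file other.txt and the variable names f and side are invented for the example.

```shell
tmpdir=$(mktemp -d)
printf 'first\nsecond\nthird\n' > "$tmpdir/other.txt"    # hypothetical side file

# Plain getline consumes the next line of the main input, so every second line is skipped
odd=$(printf '1\n2\n3\n4\n5\n6\n' | awk '{print $0; getline}')
echo "$odd"

# getline var < file reads from a separate file, starting at its first line,
# and does not touch $0 or NR of the main input
mixed=$(printf 'x\ny\n' | awk -v f="$tmpdir/other.txt" '{getline side < f; print NR, $0, side}')
echo "$mixed"
rm -rf "$tmpdir"
```

In the second command, the main input and the side file advance independently: each main record is paired with the next unread line of other.txt.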

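8. OFS output separator

# OFS sets the output field separator; $1=$1 makes awk rebuild $0 with it
[root@localhost ~]# echo "A B C D" | awk '{OFS="|";print $0;$1=$1;print $0}'
# Output with | as the separator
[root@localhost ~]# echo "A B C D" | awk '{OFS="/";print $0;$1=$1;print $0}'
# Output with / as the separator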
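
When OFS takes effect is easy to get wrong, so here is a small sketch of the three cases. It sets OFS with -v, which assigns the variable once before the program starts; that is a common alternative to assigning it inside the action.

```shell
# Assigning any field makes awk rebuild $0 using OFS
rebuilt=$(echo "A B C D" | awk -v OFS="|" '{$1 = $1; print}')

# Without such an assignment, $0 keeps its original spacing
untouched=$(echo "A B C D" | awk -v OFS="|" '{print $0}')

# Commas in a print statement always insert OFS between the listed values
picked=$(echo "A B C D" | awk -v OFS="|" '{print $1, $3}')

echo "$rebuilt"
echo "$untouched"
echo "$picked"
```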

9. Array

# Print the value at index 1 of the array
[root@localhost ~]# awk 'BEGIN{a[0]=1; a[1]=2; a[2]=3; print a[1]}'
2
# Print the value at index 0 of the array
[root@localhost ~]# awk 'BEGIN{a[0]=1; a[1]=2; a[2]=3; print a[0]}'
1
# Strings can also be used as array indexes
[root@localhost ~]# awk 'BEGIN{a["abc"]=1; a["def"]=2; a["xyz"]=3; print a["abc"]}'
1
[root@localhost ~]# awk 'BEGIN{a["abc"]=1; a["def"]=2; a["xyz"]=3; print a["xyz"]}'
3
# Print the indexes and values (the order in which for-in visits the indexes is not guaranteed)
[root@localhost ~]# awk 'BEGIN{a[0]=1; a[1]=2; a[2]=3; for(i in a){print i,a[i]}}'
0 1
1 2
2 3
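
A few more array operations that often come up alongside the ones above are sketched here: membership tests, deletion, and getting a stable order. The array contents are arbitrary sample values.

```shell
result=$(awk 'BEGIN {
    a["abc"] = 1; a["def"] = 2; a["xyz"] = 3

    # (key in array) tests for a key without creating it
    if ("abc" in a) print "abc is present"
    if (!("nope" in a)) print "nope is absent"

    # delete removes a single element
    delete a["def"]
    n = 0
    for (k in a) n++
    print "elements left:", n
}')
echo "$result"

# for (k in a) visits keys in an unspecified order; pipe through sort when order matters
sorted=$(awk 'BEGIN {a[2]="c"; a[0]="a"; a[1]="b"; for (i in a) print i, a[i]}' | sort -n)
echo "$sorted"
```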

10. Duplicate check

# Count how many times each value occurs
[root@localhost ~]# cat test.txt | awk '{a[$1]++};END{for(i in a){print i,a[i]}}'
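
Since the contents of test.txt are not shown, the following sketch runs the same counting idea on invented inline data. It then sorts the result by count and shows a common one-liner for listing only the values that repeat.

```shell
# Hypothetical input: one value per line, some of them repeated
counts=$(printf 'b\na\nb\nc\nb\na\n' |
    awk '{seen[$1]++} END {for (v in seen) print seen[v], v}' |
    sort -k1,1nr -k2,2)
echo "$counts"

# Print a value the second time it appears, so each duplicate is listed once
dups=$(printf 'b\na\nb\nc\nb\na\n' | awk 'seen[$1]++ == 1')
echo "$dups"
```

The second program has no action block, so awk falls back to printing the line whenever the pattern is true.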

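11. Filter password failed

# Count failed password attempts per source IP and print the IPs with more than 3 failures
# ($11 is the IP for existing users; lines for invalid users carry two extra words, which shifts the IP to $13)
[root@localhost ~]# awk '/Failed password/{ip[$11]++}END{for(i in ip){print ip[i],i}}' /var/log/secure | awk '$1>3{print $2}'
# Count "Invalid user" attempts per source IP and print the IPs with more than 3 attempts
[root@localhost ~]# awk '/Invalid user/{print $10}' /var/log/secure | awk '{ip[$1]++}END{for(i in ip){print ip[i],i}}' | awk '$1>3{print $2}'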
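
The same filtering can be done in a single awk pass. The sketch below does not hard-code the field number; it takes the word after "from" instead, so that the two "Failed password" line layouts are both handled. The log lines are made up in the usual sshd format, with addresses from documentation ranges; on a real system the input would be /var/log/secure.

```shell
# Made-up log lines; addresses are from ranges reserved for documentation
log='Jan  1 10:00:01 host sshd[100]: Failed password for root from 192.0.2.10 port 4000 ssh2
Jan  1 10:00:02 host sshd[101]: Failed password for invalid user bob from 192.0.2.10 port 4001 ssh2
Jan  1 10:00:03 host sshd[102]: Failed password for root from 192.0.2.10 port 4002 ssh2
Jan  1 10:00:04 host sshd[103]: Failed password for root from 192.0.2.10 port 4003 ssh2
Jan  1 10:00:05 host sshd[104]: Failed password for root from 198.51.100.7 port 5000 ssh2
Jan  1 10:00:06 host sshd[105]: Accepted password for alice from 198.51.100.7 port 5001 ssh2'

# Count the field right after "from", then report addresses with more than 3 failures
offenders=$(printf '%s\n' "$log" | awk '
    /Failed password/ {
        for (i = 1; i < NF; i++)
            if ($i == "from") { fails[$(i + 1)]++; break }
    }
    END { for (addr in fails) if (fails[addr] > 3) print addr }
')
echo "$offenders"
```

Only the first address reaches four failures in this sample, so it is the only one reported. The threshold of 3 mirrors the commands above and can be adjusted as needed.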

Origin blog.csdn.net/A1100886/article/details/130684780