awk: principles and applications
One, awk
1. Working principle:
awk reads text line by line, splits each line into fields (on spaces or tabs by default), stores the fields in built-in variables, and runs the editing commands on every line that matches the pattern or condition.
The sed command usually processes a line as a whole, while awk tends to split a line into multiple "fields" and then process them. awk also reads its input line by line, and results, including field data, can be displayed with the print statement. Conditions can combine the logical operators "&&" ("and"), "||" ("or"), and "!" ("not"). Simple arithmetic is also available: +, -, *, /, %, and ^ stand for addition, subtraction, multiplication, division, remainder, and power respectively.
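As a minimal sketch of the logical and arithmetic operators described above, the following runs awk on made-up inline data. The names and scores are assumptions for illustration only, not taken from any real file.

```shell
# Hypothetical sample data: name, score1, score2 (space-separated).
data='ann 80 90
bob 55 70
cat 40 30'

# Logical AND: print the names where both scores are at least 50.
both_pass=$(printf '%s\n' "$data" | awk '$2>=50 && $3>=50 {print $1}')
echo "$both_pass"

# Arithmetic on the first line only: sum, remainder, and a power.
math=$(printf '%s\n' "$data" | awk 'NR==1 {print $2+$3, $3%$2, 2^3}')
echo "$math"
```

The first command prints only the names whose two scores both pass the threshold; the second prints three numbers computed from the first line.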
2. Command format:
awk [options] 'pattern or condition {action}' file1 file2 …
awk -f script-file file1 file2 …
3. The common built-in variables of awk (can be used directly) are as follows:
- FS: Field separator. Specifies how each line of text is split into fields; by default, fields are separated by spaces or tabs. Setting FS has the same effect as the "-F" option
- NF: The number of fields in the line currently being processed
- NR: The line number (ordinal number) of the line currently being processed
- $0: The entire content of the line currently being processed
- $n: The nth field (nth column) of the line currently being processed
- FILENAME: The name of the file being processed
- RS: Record separator. When awk reads data from a file, it cuts the data into records according to RS and processes one record at a time. The default value is '\n'
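To make the built-in variables concrete, here is a small sketch that prints NR, NF, the first field, and the last field of each record. The two passwd-style records are invented for this example.

```shell
# Hypothetical passwd-style records (colon-separated), not a real file.
records='root:x:0:0
daemon:x:1:1'

# -F: sets FS to ":"; NR is the record number, NF the field count,
# $1 the first field and $NF the last field of the current record.
summary=$(printf '%s\n' "$records" | awk -F: '{print NR, NF, $1, $NF}')
echo "$summary"
```

Because NF holds the number of fields, `$NF` always refers to the last field, whatever the line length.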
Two, common ways to output text with awk
1. Output text by line:
Print all lines
awk '{print}' file1.txt # print every line
awk '{print $0}' file1.txt # print every line
Print the content of lines 1-3
awk 'NR==1,NR==3{print}' file1.txt # print lines 1 to 3
or
awk '(NR>=1)&&(NR<=3){print}' file1.txt
Output the first and third lines
awk 'NR==1||NR==3{print}' file1.txt # print line 1 and line 3
awk 'NR==1;NR==3{print $0}' file1.txt
Print the odd-numbered or even-numbered lines
awk '(NR%2)==1{print}' testfile2 # print all odd-numbered lines
awk '(NR%2)==0{print}' testfile2 # print all even-numbered lines
Print lines that start or end with a given string
awk '/^root/{print}' /etc/passwd # print lines that start with root
awk '/nologin$/{print}' /etc/passwd # print lines that end with nologin
The BEGIN pattern means that its action runs once before awk starts processing the specified text; awk then processes the text, and finally runs the action of the END pattern once. The END{} block is typically where statements that print results are placed.
awk 'BEGIN {FS=":"};{if($3>=200){print}}' /etc/passwd # the BEGIN block runs first, then awk prints the matching lines of the text
awk 'BEGIN {x=0};/\/bin\/bash/{x++};END {print x}' /etc/passwd # count the lines that contain /bin/bash, equivalent to grep -c "/bin/bash" /etc/passwd
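The BEGIN / per-line / END flow can be sketched on inline data so that the result does not depend on the local /etc/passwd. The three account lines and the variable name `n` are hypothetical.

```shell
# Hypothetical passwd-style lines; only the last field (login shell) matters here.
lines='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
amy:x:1000:1000::/home/amy:/bin/bash'

# BEGIN runs once before any input is read, the middle rule runs for each
# matching line, and END runs once after the last line.
count=$(printf '%s\n' "$lines" | awk 'BEGIN {n=0} /\/bin\/bash$/ {n++} END {print n}')
echo "$count"
```

Here the regular expression is anchored with `$`, so only lines that end with /bin/bash are counted.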
2. Output text by field
Output the 3rd field of each line (fields separated by colons)
awk -F ":" '{print $3}' /etc/passwd
Output the first and third fields in each line
awk -F ":" '{print $1,$3}' /etc/passwd
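awk -F ":" '$3<5{print $1,$3}' /etc/passwd # for lines whose 3rd field is less than 5, print the 1st and 3rd fields
awk -F ":" '!($3<200){print}' /etc/passwd # print lines whose 3rd field is not less than 200
awk 'BEGIN {FS=":"};{if($3>=200){print}}' /etc/passwd # the BEGIN block runs first, then the matching lines of the text are printed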
Print the line number and the content of the entire line
awk -F ":" '{print NR,$0}' /etc/passwd # print each line with its line number; NR increases by 1 after each record is processed
($3>$4)?$3:$4 is the ternary operator: if the value of the 3rd field is greater than the value of the 4th field, the 3rd field is assigned to max; otherwise the 4th field is assigned to max. The value of max is then printed.
awk -F ":" '{max=($3>$4)?$3:$4; print max}' /etc/passwd
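A reduced sketch of the ternary operator, using two-field records invented for this example, shows how the larger of two fields is selected on each line:

```shell
# Hypothetical colon-separated number pairs.
pairs='3:7
9:2
5:5'

# condition ? value_if_true : value_if_false
# max receives the larger of the two fields on each line.
maxes=$(printf '%s\n' "$pairs" | awk -F: '{max = ($1 > $2) ? $1 : $2; print max}')
echo "$maxes"
```

When the two fields are equal the condition is false, so the second branch supplies the value.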
Print the 1st field of lines whose 7th field contains /bash
awk -F ":" '$7~"/bash"{print $1}' /etc/passwd # with colon as the separator, print the 1st field of lines whose 7th field contains /bash
Print the 1st and 2nd fields of lines that have 7 fields and contain root in the 1st field
awk -F ":" '($1~"root")&&(NF==7){print $1,$2}' /etc/passwd # print the 1st and 2nd fields of lines whose 1st field contains root and that have 7 fields
Print all lines whose 7th field is neither /bin/bash nor /sbin/nologin
awk -F ":" '($7!="/bin/bash")&&($7!="/sbin/nologin"){print}' /etc/passwd # print all lines whose 7th field is neither /bin/bash nor /sbin/nologin
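The field tests above (regular-expression match with `~` and exact comparison with `!=`) can be tried on a few made-up account lines. The user names and shells below are assumptions chosen only to exercise each condition.

```shell
# Hypothetical passwd-style lines with 7 colon-separated fields each.
users='root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin'

# ~ tests whether a field matches a regular expression.
bash_users=$(printf '%s\n' "$users" | awk -F: '$7 ~ /bash/ {print $1}')
echo "$bash_users"

# != compares the whole field as a string; && requires both tests to hold.
other_shell=$(printf '%s\n' "$users" | awk -F: '$7 != "/bin/bash" && $7 != "/sbin/nologin" {print $1}')
echo "$other_shell"
```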
3. Invoke Shell commands through pipes and double quotes
echo $PATH | awk 'BEGIN{RS=":"};END{print NR}' # count the colon-separated segments of the text; the END{} block is typically where statements that print results are placed
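Since the value of $PATH differs between machines, the effect of RS is easier to check on a fixed string. The sketch below uses a hypothetical three-entry path.

```shell
# Hypothetical PATH-like string with three colon-separated entries.
sample_path='/usr/local/bin:/usr/bin:/bin'

# With RS=":" every colon ends a record, so NR in END is the number of entries.
# printf '%s' avoids a trailing newline inside the last record.
entries=$(printf '%s' "$sample_path" | awk 'BEGIN {RS=":"} END {print NR}')
echo "$entries"

# Print each entry on its own line, prefixed with its record number.
listing=$(printf '%s' "$sample_path" | awk 'BEGIN {RS=":"} {print NR, $0}')
echo "$listing"
```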
Use wc to count the number of lines
awk -F: '/bash$/{print | "wc -l"}' /etc/passwd # call wc -l to count the users whose shell is bash, equivalent to grep -c "bash$" /etc/passwd
View memory usage
free -m | awk '/Mem:/ {print int($3/($3+$4)*100)}' # show current memory usage as a percentage, computed as used/(used+free)
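View CPU idle rate
top -b -n 1 | grep Cpu | awk -F ',' '{print $4}' | awk '{print $1}' # show the current CPU idle rate (-b -n 1 runs top in batch mode for a single iteration)
date -d "$(awk -F "." '{print $1}' /proc/uptime) second ago" +"%F %H:%M:%S" # show the time of the last system boot, based on the same uptime that the uptime command reports; "second ago" asks date for the time that many seconds in the past, and +"%F %H:%M:%S" is the same format as +"%Y-%m-%d %H:%M:%S"
awk 'BEGIN {while ("w" | getline) n++; print n-2}' # call the w command and count the logged-in users (w prints two header lines, hence n-2)
awk 'BEGIN {"hostname" | getline; print $0}' # call hostname and print the current host name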
When getline has no redirection operator ("<" or "|") on either side, it acts on the current input file and reads the next line into $0, or into the variable var that follows it. Note that awk has already read one line before getline runs, so getline returns the line after it, and the output ends up being every other line. When getline does have a redirection operator ("<" or "|") next to it, it acts on the redirected input. That input has only just been opened and awk has not read any line from it, so getline returns its first line rather than every other line.
seq 10 | awk '{print $0; getline}' # print the odd-numbered lines
seq 10 | awk '{getline; print $0}' # print the even-numbered lines
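The two behaviours of getline can be compared side by side in a small sketch. The six input lines and the `echo hello` command are placeholders chosen for this example.

```shell
# Six numbered input lines (hypothetical data).
nums=$(printf '%s\n' 1 2 3 4 5 6)

# Plain getline consumes the NEXT input line, so each pass of the rule
# handles two lines: print first and skip the next, or skip first and print the next.
odd=$(printf '%s\n' "$nums" | awk '{print $0; getline}')
even=$(printf '%s\n' "$nums" | awk '{getline; print $0}')
echo "$odd"
echo "$even"

# "command" | getline var reads from the command output, starting at its
# first line, and stores that line in var without touching $0.
first=$(awk 'BEGIN {"echo hello" | getline line; print line}')
echo "$first"
```

Reading into a named variable, as in the last command, leaves $0 and the fields of the current record unchanged.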
Get the local IP address