1. Get to know the awk command
2. The cut command
3. Where Linux records user activity (operation records)

1. Get to know the awk command

AWK is a language for processing text files and a powerful text analysis tool.
The name AWK comes from the first letters of the family names of its three creators: Alfred Aho, Peter Weinberger, and Brian Kernighan.

Comparison of the "three musketeers" of shell text processing:
grep is best suited to simple searching or matching of text;
sed is best suited to editing matched text;
awk is best suited to formatting text and to processing text with a more complex layout.

1.1 Working principle of awk

Like sed, awk reads and processes its input line by line.
sed operates on a whole line at a time, while awk splits each line into fields and processes those.
The awk workflow can be divided into three parts:
the code segment executed before the input is read (identified by the BEGIN keyword);
the code segment executed by the main loop for each line of input;
the code segment executed after all input has been read (identified by the END keyword).

The brief processing flow of the awk command (the original figure is not reproduced here):

Execute the statements in the BEGIN {commands} block;
Read line 1 from the file or stdin;
If the rule has no pattern, execute the statements in {} for the line;
If the rule has a pattern, check whether the line matches it. If it matches, execute the statements in {};
If it does not match, the statements in {} are skipped, and the next line is read;
Repeat this process until all lines are read;
Execute the statements in the END {commands} block.
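
The flow above can be traced with a small sketch. The three-line sample input is made up for illustration; any text would do.

```shell
# Hypothetical sample input: BEGIN runs once, the pattern rule runs per line, END runs once.
out=$(printf 'apple\nbanana\navocado\n' | awk '
BEGIN { print "start" }          # before any input is read
/^a/  { n++; print NR, $0 }      # only for lines that match the pattern
END   { print "matched", n }     # after the last line has been read
')
printf '%s\n' "$out"
```

The line "banana" is read (NR still advances to 3 for "avocado") but its action is skipped because the pattern does not match.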

1.2 Syntax and parameter description

The complete syntax of the awk command

awk [option parameter] 'BEGIN {commands} pattern {commands} END {commands}' file

Option parameters:

-F Specifies the input field separator, followed by a string or a regular expression, such as -F:.
-v Assigns a user-defined variable.
-f Reads the awk program from a script file.
-mf nnn and -mr nnn Set internal limits to the value nnn: -mf limits the maximum number of fields and -mr limits the maximum record size. These two options come from the Bell Labs version of awk and are not available in standard awk.
-W compat or --compat, -W traditional or --traditional Run awk in compatibility mode, so that gawk behaves like traditional awk and the GNU extensions are not recognized.
-W copyleft or --copyleft, -W copyright or --copyright Print short copyright information.
-W help or --help, -W usage or --usage Print all awk options and a short description of each option.
-W lint or --lint Print warnings about constructs that are dubious or not portable to other awk implementations.
-W lint-old or --lint-old Print warnings about constructs that are not portable to the original version of Unix awk.
-W posix Turn on compatibility mode with the following additional restrictions: \x escape sequences are not recognized; the func synonym for the function keyword is not recognized; when FS is a single space, newline does not act as a field separator; the operators ** and **= cannot be used in place of ^ and ^=; fflush is not available.
-W re-interval or --re-interval Allow interval expressions in regular expressions, such as a{2,3}.
-W source program-text or --source program-text Use program-text as awk source code; it can be mixed with the -f option.
-W version or --version Print version information (useful when reporting bugs).

Simple example:

# cat passwd|awk -F: 'BEGIN{print "############"}$3<100{print $1,$3,$6}END{print "@@@@@@@@@@@@@@"}'
############
root 0 /root
dbus 81 /
rpc 32 /var/cache/rpcbind
rpcuser 29 /var/lib/nfs
haldaemon 68 /
postfix 89 /var/spool/postfix
mysql 27 /var/lib/mysql
@@@@@@@@@@@@@@

1.3 Basic usage of awk

Usage 1:

awk '/pattern/ {action}' file # pattern-matching rule; the awk program goes in single quotes

The awk program should be enclosed in single quotes so that the shell does not expand $1 and similar.
awk actions must be enclosed in curly braces.
The pattern can be a regular expression, a conditional expression, or a combination of both.
If the pattern is a regular expression, delimit it with /.
Use a semicolon to separate multiple actions.
There are 2 types of separators in awk:
1. The input field separator, specified with -F (or the FS variable)
2. The output field separator, set with OFS="..."

Example:
Example 1: When there is only a pattern, awk behaves like grep.

# cat passwd|awk '/liu/'
liu1:x:508:508::/home/liu1:/bin/bash
liu2:x:509:509::/home/liu2:/bin/bash

Example 2: When there is only an action, the action is executed for every line.

# who|awk '{print $2}'
tty1
pts/0
# who
root     tty1         2016-06-24 23:20
root     pts/0        2016-06-24 23:21 (liupeng.lan)

Usage 2: Use -F to specify the delimiter.

awk -F separator    # specify the field separator

Example 1: Output the first and seventh columns of the rows beginning with h.

# cat passwd|awk -F: '/^h/{print $1,$7}'
haldaemon /sbin/nologin

Example 2: Display the first and seventh columns of rows that do not start with h.

awk -F: '/^[^h]/{print $1,$7}' /etc/passwd

Example 3: Use either a colon or a forward slash as the delimiter and display the first and tenth columns.

awk -F'[:/]' '{print $1,$10}' /etc/passwd
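
Why the tenth column? With a bracket-expression separator, every colon and every slash ends a field, so an adjacent pair such as :/ leaves an empty field in between. A minimal sketch, assuming a typical root entry (the sample line is typed in here, not read from the system):

```shell
# Sample line is an assumption that mimics a common root entry in /etc/passwd.
line='root:x:0:0:root:/root:/bin/bash'
out=$(printf '%s\n' "$line" | awk -F'[:/]' '{
    print NF                # number of fields after splitting on : or /
    print $1, $10           # first and tenth field
    print "[" $6 "]"        # the empty field between ":" and "/"
}')
printf '%s\n' "$out"
```

Here the line splits into 10 fields, and $10 is the last path component of the shell.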

Usage 3: Set variables.

awk -v #set variable

Example 1:

# name=haha
# echo|awk -v abc=$name '{print abc,$name}'
haha 
# echo|awk -v abc=$name '{print $name}'

# echo|awk -v abc=$name '{print abc}'
haha    # awk variables are referenced without a leading $

Example 2:

# name=haha;soft=xixi
# echo|awk -v abc=$name -v efg=$soft '{print abc,123}'
haha 123
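
In Example 1 above, $name inside the single-quoted program is not the shell variable. awk sees an unset awk variable called name, treats it as 0, and so $name means $0 (the whole input line, which is empty when the input comes from a bare echo). A small sketch with made-up input makes this visible:

```shell
name=haha                       # shell variable
# abc is passed in with -v; name is never defined inside awk.
out1=$(echo 'some input line' | awk -v abc="$name" '{ print abc }')
out2=$(echo 'some input line' | awk -v abc="$name" '{ print $name }')   # same as $0
out3=$(echo 'a b c' | awk -v n=2 '{ print $n }')                        # field number n
printf '%s\n%s\n%s\n' "$out1" "$out2" "$out3"
```

Quoting "$name" on the command line also keeps values that contain spaces intact.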

Usage 4: Use -f to read the program from a script file.

awk -f script-file input-file
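
A minimal sketch of the -f usage. The script name users.awk, its contents and the sample input are all hypothetical, and the script is written to a temporary directory so nothing on the system is touched:

```shell
tmpdir=$(mktemp -d)
# Hypothetical script file: print name and uid for uids below 100.
cat > "$tmpdir/users.awk" <<'EOF'
BEGIN { FS = ":" }
$3 < 100 { print $1, $3 }
EOF
out=$(printf 'root:x:0:0\nalice:x:1000:1000\nrpc:x:32:32\n' | awk -f "$tmpdir/users.awk")
printf '%s\n' "$out"
rm -rf "$tmpdir"
```

Setting FS in BEGIN inside the script replaces the -F option on the command line.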

1.4 Operators of the awk command

awk regular expressions are POSIX extended regular expressions (the same flavor as grep -E).

Arithmetic operators: +, -, *, /, %, ++, --
Logical operators: &&, ||, !
Comparison operators: >, <, >=, !=, <=, ==, ~ (matches a regular expression), !~ (does not match a regular expression)
Text comparison: == (exact match)
The tilde ~ means "matches the pattern that follows"
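
The operators can be combined freely in a pattern. The sketch below uses three made-up name/age records to show one rule per operator group:

```shell
# Hypothetical records: name and age.
out=$(printf 'alice 30\nbob 17\ncarol 45\n' | awk '
$2 >= 18 && $1 ~ /^a/      { print $1, "adult-a" }   # comparison, logical AND, regex match
$1 !~ /o/                  { print $1, "no-o" }      # negated regex match
$1 == "bob" || $2 % 5 == 0 { print $1, "bob-or-5" }  # exact match, logical OR, modulo
')
printf '%s\n' "$out"
```

Every rule is tested against every line, so a single line can trigger several actions.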

Example 1: If $2 matches /pts/, output $1.

# who | awk '$2 ~ /pts/{print $1}'

Example 2: Output the username and uid of users whose uid has exactly two digits.

\< matches the start of a word (a GNU awk extension).
\> matches the end of a word (a GNU awk extension).
.. in the regular expression matches any two characters.

[root@liupeng ~]# cat passwd|awk -F: '$3 ~/\<..\>/{print $1,$3}'
uucp 10
operator 11
games 12
gopher 13
......
[root@liupeng ~]#

Example 3: Output numbers from 1 to 10 divisible by 5 or starting with 1.

[root@liupeng ~]# seq 10 | awk '$1 % 5 == 0 || $1 ~ /^1/{print $1}'
1
5
10
[root@liupeng ~]#

Example 4: For users whose uid is greater than or equal to 50, whose home directory is under /home and whose shell is bash, display the user name, uid, home directory and shell.

[root@liupeng ~]# cat passwd|awk -F: '$3>=50 && $6 ~/^\/home/ && $7 ~/bash/ {print $1,$3,$6,$7}'
mysql 496 /home/mysql /bin/bash
[root@liupeng ~]#

Example 5: Find the users whose user name contains liu, and output the user name, uid and home directory.

# cat passwd|awk -F':' '$1  ~/liu/{print $1,$3,$6}'
liu1 508 /home/liu1
liu2 509 /home/liu2
liu3 510 /home/liu3
liu4 511 /home/liu4
liu5 512 /home/liu5
liu 539 /home/liu

Example 6: Find the users in the /etc/passwd file whose user name contains liu and who use bash.

# cat passwd|awk -F':' '$1 ~/liu/&&$7 ~/bash/{print $1,$7}'
liu1 /bin/bash
liu2 /bin/bash
liu3 /bin/bash
liu4 /bin/bash
liu5 /bin/bash
liu /bin/bash

1.5 Built-in variables of the awk command

$0: the entire line of text;
$n: the nth field of the line;
NF: number of fields, which indicates how many fields are in the current line;
NR: number of records, which indicates the number of the line currently being processed;
FS: field separator for input. The default is whitespace (spaces and tabs);
OFS: output field separator, placed between the comma-separated items of print (the default is a space);
ARGC: number of command-line arguments;
ARGIND: position in the command line of the file currently being processed (starting from 0);
ARGV: array containing the command-line arguments;
CONVFMT: number conversion format (default value is %.6g);
ENVIRON: associative array of environment variables;
ERRNO: description of the last system error;
FIELDWIDTHS: list of field widths (separated by spaces);
FILENAME: name of the current file;
FNR: line number counted separately for each file;
IGNORECASE: if true, matching ignores case;
OFMT: number output format (default value is %.6g);
ORS: output record separator (default value is a newline);
RLENGTH: length of the string matched by the match function;
RS: record separator (the default is a newline);
RSTART: starting position of the string matched by the match function;
SUBSEP: array subscript separator (default value is \034).
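
The difference between NR and FNR only shows up with more than one input file. A minimal sketch, assuming two small throwaway files (one.txt and two.txt are hypothetical names created in a temporary directory):

```shell
tmpdir=$(mktemp -d)
printf 'a b c\nd e\n' > "$tmpdir/one.txt"
printf 'f\n'          > "$tmpdir/two.txt"
# FILENAME, running line number, per-file line number, field count, last field
out=$(cd "$tmpdir" && awk '{ print FILENAME, NR, FNR, NF, $NF }' one.txt two.txt)
printf '%s\n' "$out"
rm -rf "$tmpdir"
```

NR keeps counting across files, while FNR restarts at 1 for each new file.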

Example 1: Display the number of lines in the /etc/passwd file.

[root@liupeng ~]# cat /etc/passwd|awk 'BEGIN{i=0}{i++} END{print i}'
35
[root@liupeng ~]#

Analysis: i is incremented once for every line read, so the final value of i is exactly the number of lines in the passwd file.

Example 2: Use the default separator to display the number of fields in each row.

[root@liupeng ~]# awk '{print NF}' /etc/passwd
1
1
1
1
......
[root@liupeng ~]#

(Because no separator is specified, the default separator is whitespace; a line that contains no spaces is therefore a single field.)

Use the colon as the delimiter to display the number of fields per line.

[root@liupeng ~]# awk -F: '{print NF}' /etc/passwd
7
7
7
......
[root@liupeng ~]#

(With the colon specified as the separator, each line has 7 fields.)

Example 3: Display the first field and the last field of each line.

awk -F: '{print $1,$NF}' /etc/passwd

Example 4: Display the line number and content of each line.

[root@liupeng ~]# awk -F: '{print NR,$0}' /etc/passwd
1 root:x:0:0:root:/root:/bin/bash
2 bin:x:1:1:bin:/bin:/sbin/nologin
3 daemon:x:2:2:daemon:/sbin:/sbin/nologin
......
[root@liupeng ~]#

Example 5: Display the first and seventh columns, separated by ---

awk -F: 'BEGIN{OFS="---"}{print $1,$7}' /etc/passwd

Shorthand form:

[root@liupeng ~]# awk -F: 'OFS="---"{print $1,$7}' /etc/passwd
root---/bin/bash
bin---/sbin/nologin
daemon---/sbin/nologin
......
[root@liupeng ~]#
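
The shorthand works because the assignment OFS="---" is used as the pattern: its value is a non-empty string, which counts as true, so the action runs for every line. The sketch below (sample line typed in as an assumption) shows that the trick fails as soon as the assigned value is the empty string, while the BEGIN form always works:

```shell
line='root:x:0:0:root:/root:/bin/bash'
# Non-empty string assignment is true, so the action runs.
out1=$(printf '%s\n' "$line" | awk -F: 'OFS="---"{ print $1, $7 }')
# Assigning the empty string gives a false pattern, so nothing is printed.
out2=$(printf '%s\n' "$line" | awk -F: 'OFS=""{ print $1, $7 }')
# The BEGIN form does not depend on the truth value of the assignment.
out3=$(printf '%s\n' "$line" | awk -F: 'BEGIN{ OFS="" }{ print $1, $7 }')
printf '1:%s\n2:%s\n3:%s\n' "$out1" "$out2" "$out3"
```

For that reason the BEGIN form is the safer habit in scripts.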

Example 6: Display the line number and user name of the lines ending with bash, and finally display the total number of lines.

[root@liupeng ~]# awk 'BEGIN{FS=":"} /bash$/{print NR,$1} END{print NR}' /etc/passwd
1 root
35 mysql
35
[root@liupeng ~]#

Example 7: Display lines 3 to 5 of the file (line number, content).

[root@liupeng ~]# awk 'NR==3,NR==5{print NR,$0}' /etc/passwd
3 daemon:x:2:2:daemon:/sbin:/sbin/nologin
4 adm:x:3:4:adm:/var/adm:/sbin/nologin
5 lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
[root@liupeng ~]#

Display lines 4 and 7 (with line number and content).

[root@liupeng ~]# awk 'NR==4||NR==7{print NR,$0}' /etc/passwd
4 adm:x:3:4:adm:/var/adm:/sbin/nologin
7 shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
[root@liupeng ~]#

Example 8: Display the first 3 lines of the file (with line number and content).

[root@liupeng ~]# awk 'NR<=3{print NR,$0}' /etc/passwd
1 root:x:0:0:root:/root:/bin/bash
2 bin:x:1:1:bin:/bin:/sbin/nologin
3 daemon:x:2:2:daemon:/sbin:/sbin/nologin
[root@liupeng ~]#

Example 9: Display the first 10 lines and lines 30 to 40 of the file.

# awk 'NR<=10||NR>=30&&NR<=40{print NR,$0}' /etc/passwd

Example 10: Display the line numbers and user names of lines 5 and 10 of the passwd file.

[root@liupeng ~]# cat passwd|awk -F: 'NR==5||NR==10{print NR,$1}'
5 lp
10 uucp
[root@liupeng ~]#

Example 11:
1. Analyze the difference between the following three commands and explain why.
①

#awk 'BEGIN{print NR}' /etc/passwd  (executed once)
[root@liupeng ~]# awk 'BEGIN{print NR}' /etc/passwd
0
[root@liupeng ~]#

②

#awk '{print NR}' /etc/passwd    (executed N times, where N is the number of lines)
(up to 35, the last line)

[root@liupeng ~]# awk '{print NR}' /etc/passwd 
1
2
3
......
35
[root@liupeng ~]#

③ #awk 'END{print NR}' /etc/passwd (executed once; the result is the number of lines in the file)

[root@liupeng ~]# awk 'END{print NR}' /etc/passwd 
35
[root@liupeng ~]#

2. Analyze the execution results of the following commands.
① #awk -F: '{print $NR}' /etc/passwd
(Analysis: line 1 prints its first field, line 2 prints its second field, line n prints its nth field ...)
② #awk -F: '{print NR, NF, $1, $NF, $(NF-1)}' /etc/passwd
(Outputs, for each line, the line number, number of fields, user name, last field, and penultimate field.)

[root@liupeng ~]# awk -F: '{print NR, NF, $1, $NF, $(NF-1)}' /etc/passwd
1 7 root /bin/bash /root
2 7 bin /sbin/nologin /bin
3 7 daemon /sbin/nologin /sbin
......
35 7 mysql /bin/bash /home/mysql
[root@liupeng ~]#
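
The behaviour of $NR in command ① is easier to see on a tiny grid of made-up data: each line prints the field on the diagonal, and once NR exceeds NF the result is an empty string.

```shell
# Hypothetical 4-line input with 3 fields per line.
out=$(printf 'a:b:c\nd:e:f\ng:h:i\nj:k:l\n' | awk -F: '{ print NR, "[" $NR "]" }')
printf '%s\n' "$out"
```

Line 4 has no fourth field, so $NR is empty there rather than an error.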

1.5.1 Short exercises on awk built-in variables:

1. Use the NF variable to display the content of the penultimate column of the passwd file.

cat passwd|awk -F: '{print $(NF-1)}'

2. Display the user names on lines 5 to 10 of the passwd file.

cat passwd|awk -F: 'NR>=5&&NR<=10{print $1}'

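3. Display the user names whose shell (7th column) in the passwd file is not bash.

cat passwd|awk -F: '$7 ~/[^bash]$/{print $1}'

(Note: [^bash]$ is a bracket expression. It only checks that the last character is not b, a, s or h, so it also skips shells such as /bin/sh. The second form is the reliable one.)

or:

cat passwd|awk -F: '$7 !~/bash/{print $1}'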
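
The two solutions to exercise 3 are not equivalent. The sketch below runs both ideas against three hypothetical passwd-style lines, one of which uses /bin/sh:

```shell
# Hypothetical sample data; guest uses /bin/sh, which is not bash.
data='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
guest:x:1001:1001::/home/guest:/bin/sh'
# Bracket expression: last character must not be one of b, a, s, h.
out1=$(printf '%s\n' "$data" | awk -F: '$7 ~ /[^bash]$/ { print $1 }')
# Negated match: the shell must not end with the string bash.
out2=$(printf '%s\n' "$data" | awk -F: '$7 !~ /bash$/   { print $1 }')
printf 'bracket: %s\n' $out1
printf 'negated: %s\n' $out2
```

The bracket form misses guest because /bin/sh also ends in h; the negated match reports it correctly.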

4. Display the lines of the passwd file whose line number ends with 5, together with the line number.

cat passwd|awk -F: 'NR%10==5{print NR,$0}'

or:

cat passwd|awk -F: 'NR ~/5$/{print NR,$0}'

5. Use ifconfig to display only the IP addresses (the tr and cut commands may not be used).

[root@liupeng ~]# ifconfig|awk -F: '/inet addr/{print $2}'|awk '{print $1}'
192.168.28.129
127.0.0.1
[root@liupeng ~]#

or:

ifconfig |sed -n '/inet addr/p'|awk -F[:" "] '{print $13}'

6. Use awk to display the inbound and outbound traffic (bytes) of eth0.

ifconfig eth0|awk -F'[: ]' '/RX bytes/{print $13,$19}'

ifconfig eth0|tr -s ' '|awk -F'[: ]' '/RX bytes/{print $4,$9 }'

7. Use the awk command to count the users whose names begin with r, and produce the following output.

Search results
root
rpc
rtkit
rpcuser
4

[root@liupeng ~]# cat passwd|awk -F: 'BEGIN{print "Search results";i=0}/^r/{print $1;i++}END{print i}'
Search results
root
rpc
rtkit
rpcuser
4
[root@liupeng ~]#

1.6 Built-in functions of the awk command

The awk programming language has many built-in functions.

1.6.1 length function

Example 1: Use length to count the characters in the password field and check whether any user has an empty password.

awk -F: 'length($2)==0{print $1}' /etc/passwd /etc/shadow

Example 2: Display lines with more than 50 characters in the file.

awk 'length($0)>50{print NR,$0}' /etc/passwd

1.6.2 system function

Example 1: Use the system function to create the users listed in a file and set their passwords.

# cat list.txt
xixi 123
haha 456
hehe 789
# awk '{system("useradd " $1);print $1 " is added"}' list.txt
# awk '{system("echo "$2"|passwd --stdin  "$1)}'  list.txt

Example 2: Use awk's system function to create, under /tmp/lp, a directory named after each user in /etc/passwd.

mkdir /tmp/lp/
awk -F: '{system("mkdir /tmp/lp/" $1)}' /etc/passwd
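
The key point in both examples is that the command passed to system is an ordinary awk string: field references such as $1 must sit outside the double quotes and be concatenated in. A minimal sketch of Example 2 that uses a temporary directory in place of /tmp/lp and two made-up input lines:

```shell
base=$(mktemp -d)      # stands in for /tmp/lp
printf 'root:x:0\nbin:x:1\n' | awk -F: -v dir="$base" '{
    cmd = "mkdir \"" dir "/" $1 "\""     # build the command text by concatenation
    if (system(cmd) != 0)                # system returns the exit status of the command
        print "failed: " cmd
}'
out=$(ls "$base")
printf '%s\n' "$out"
rm -rf "$base"
```

Checking the return value of system is optional, but it makes failures visible instead of silently continuing.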

Example 3: Use one command to copy files larger than 10K in the specified directory to /tmp (find and for are not allowed). GNU du reports sizes in 1K blocks by default, so the threshold is 10.

cp $(du -a  /boot | awk '$1>10{print $2}') /tmp

1.7 Structured statements of the awk command

1.7.1 if statement

1. Single branch

[root@liupeng ~]# awk -F: '{if($1 ~ /\<...\>/)print $0}' /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
rpc:x:32:32:Rpcbind Daemon:/var/cache/rpcbind:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
gdm:x:42:42::/var/lib/gdm:/sbin/nologin
[root@liupeng ~]#

[root@liupeng ~]# awk -F: '{if($3 >= 500)print $1,$7}' /etc/passwd
nfsnobody /sbin/nologin
[root@liupeng ~]#

2. Double branch

awk -F: '{if($3 != 0) print $1 ; else print $3}' /etc/passwd

3. Multi-branch

awk -F: '{if($1=="root") print $1;else if($1=="ftp") print $2;else if($1=="mail") print $3;else print NR}' /etc/passwd

Example 2: Use awk's multi-branch if to classify users: root (uid 0) is shown as administrator, a uid in (0,500) as system user, a uid in [500,60000] as normal user, and anything above 60000 as other user.

[root@liupeng ~]# awk -F: '{if($3==0) print $1,"administrator";else if($3>0 && $3<500) print $1,"system user";else if($3>=500 && $3 <= 60000) print $1,"normal user";else print $1,"other user"}' /etc/passwd
root administrator
bin system user
......
mysql system user
[root@liupeng ~]#

Supplement:
Three equivalent ways to select users with a uid greater than 500 and less than 10000:

# cat /etc/passwd|awk -F: '$3>500&&$3<10000{print $1,$3}'
# cat /etc/passwd|awk -F: '($3>500&&$3<10000){print $1,$3}'
# cat /etc/passwd|awk -F: '{if($3>500&&$3<10000)print $1,$3}'

Exercise: Monitor the disk partitions of multiple hosts. As soon as the usage of any partition on a monitored host exceeds 80%, send an email alert to root.

#!/bin/bash
# Alert threshold in percent
warn=80
ip=(10.10.10.2 10.10.10.3)
for i in "${ip[@]}"
do
   # Squeeze repeated spaces and split on space or %, so that $5 is the usage number
   ssh root@$i df -Ph | tr -s " "|awk -v w="$warn" -F "[ %]" '/^\/dev/{if($5>w) print  $1,$5"% usage is over "w"%"}'> alert
   if [ -s alert ]
   then
       # Put the host address on the first line, then mail the report
       sed -i "1i $i" alert &&  mail -s "$i hd usage" root < alert
   fi
done

Exercise: Check the /var/log/secure log file. If a host has failed to log in to the server over ssh as root more than 10 times (the value 10 must be held in a variable), add its IP address to the /etc/hosts.deny file to refuse further access. If the IP is already listed, do not add it to /etc/hosts.deny again. (The solution must use awk for text filtering, numeric comparison and variable assignment; the echo, sed, grep, cut and tr commands are not allowed.)

#!/bin/bash
# sort orders the addresses in descending order, uniq -c counts the occurrences of each one
awk '/Failed password for root/{print $(NF-3)}' /var/log/secure|sort -nr| uniq -c > test
NUM=10
IP=$(awk '$1>num {print $2}' num=$NUM test)    # $1 is the failure count, $2 is the IP
for i in $IP
 do
   DENY=$(awk '$2==var {print $2}' var=$i /etc/hosts.deny)
   if [[ -z $DENY ]]
     then
      awk '$2==var {print "sshd: "$2}' var=$i test >> /etc/hosts.deny
      awk '$2==var {print "failures: "$1,"denying access from "$2}' var=$i test
   else
      awk '$2==var {print "access from "$2" is already denied"}' var=$i test
   fi
done
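
The counting step can also be done inside awk with an associative array instead of sort and uniq. The sketch below is an alternative to the first line of the script, not the method used above; the log lines are invented for illustration and the threshold max=2 is an assumption chosen so that the small sample triggers it.

```shell
# Hypothetical sshd log lines; the IP address is always the 4th field from the end.
log='Jun 24 23:20:01 srv sshd[1]: Failed password for root from 10.0.0.5 port 4000 ssh2
Jun 24 23:20:03 srv sshd[2]: Failed password for root from 10.0.0.5 port 4001 ssh2
Jun 24 23:20:05 srv sshd[3]: Failed password for root from 10.0.0.9 port 4002 ssh2
Jun 24 23:20:07 srv sshd[4]: Accepted password for root from 10.0.0.9 port 4003 ssh2
Jun 24 23:20:09 srv sshd[5]: Failed password for root from 10.0.0.5 port 4004 ssh2'
out=$(printf '%s\n' "$log" | awk -v max=2 '
/Failed password for root/ { fails[$(NF-3)]++ }      # count failures per source IP
END { for (ip in fails) if (fails[ip] > max) print ip, fails[ip] }
')
printf '%s\n' "$out"
```

Only addresses whose failure count exceeds the threshold are printed, so the result can feed the same for loop as before.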

1.7.2 Awk accumulation of rows and columns

1. Column summation with awk
Count the total size of the files ending in .conf in the /etc directory.

[root@liupeng ~]# find /etc/ -type f -name "*.conf" |xargs ls -l | awk '{sum+=$5} END{print sum}'
522341
[root@liupeng ~]#

If you want to accumulate a separate total for each value of the first column, you need an awk array and a for loop (difficult).

cat xferlog | awk '{print $7,$8}' | sort -n >/lianxi/123.txt
awk '{a[$1]+=$2}END{for(i in a) print i,a[i]}' /lianxi/123.txt | sort -rn -k2

a[$1] = a[$1] + $2
[xferlog is a log file on the ftp server.]
[Arrays in awk are associative arrays, like the associative arrays in the shell.]
[sort -rn -k2 sorts by the second column in descending order, so the IP with the most downloads comes first.]

2. Row summation with awk
Example 1: Find the cumulative sum of 1 ~ 5.

# echo 1 2 3 4 5 | awk '{for(i=1;i<=NF;i++) sum+=$i; print sum}'
15

Example 2: Find the cumulative sum of 1 ~ 100.

# seq -s ' ' 100 | awk '{for(i=1;i<=NF;i++) sum+=$i; print sum}'
5050

1.8 Understanding awk arrays as associative arrays

1.8.1 How to store data in an array in awk

1. Store all users in /etc/passwd in the user array.

cat /etc/passwd|awk -F: '{user[$1]}'

Analysis: At this point user[root] has no value yet, but each user name has become a subscript (index) of the array.

cat /etc/passwd|awk -F: '{user[$1]=$3}'

Analysis: Now the value of $3 (the uid) is assigned to the array element user[$1].
In this way each user name is associated with its uid: the user name is the subscript (key), and the uid is the value of the array element.

1.8.2 How to get data from array in awk?

Take out all the values in the user array:

# cat /etc/passwd|awk -F: '{user[$1]=$3}END{for(i in user)print user[i],i}'
42 gdm
38 ntp
32 rpc
......

1.8.3 Understanding associative arrays in awk (difficult point):

{a[$1]+=$2}

The field $1 is used as the subscript, and the value of $2 is added to the element with that subscript. awk runs this statement for every line it reads: $1 is again used as the subscript and a[$1] = a[$1] + $2 is executed, so the value accumulated in a[$1] from earlier lines is added to $2 of the current line. The subscript stays the same, but the value accumulates.

Applicable scenario: one column repeats the same keys while another column holds differing values to be totalled.

Small exercise:
Count how much money each person spent in total, and sort in descending order of the total amount.

money.txt：
feng 100
feng 200
feng 360
li 100
li 150
zhang 90
zhang 88

# cat money.txt|awk '{username[$1]+=$2}END{for(i in username)print i,username[i]}'|sort -nr -k2
feng 660
li 250
zhang 178

1.8.4 if judgment of associative array in awk

Example: Store the uid of every user in /etc/passwd in an array. If the user name typed in by the user is in the array, output the uid of that user. The user name is the subscript; the uid is the element value.

awk_user.sh:
[root@liupeng lp]# cat awk_user.sh 
#!/bin/bash
read -p "Please input the username:" u_name
cat /etc/passwd|awk -v U_name=$u_name -F: '{user[$1]=$3}END{if (U_name in user)print user[U_name]}'

[root@liupeng lp]#

[root@liupeng lp]# sh awk_user.sh 
Please input the username:root
0
[root@liupeng lp]# sh awk_user.sh 
Please input the username:zhao
[root@liupeng lp]#
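
The in operator matters here because merely referencing an element creates it. The sketch below (two made-up input lines, hypothetical variable names) counts the array entries before and after a plain reference to a missing key:

```shell
out=$(printf 'root:x:0\nbin:x:1\n' | awk -F: -v who=zhao '
{ uid[$1] = $3 }
END {
    if (who in uid) print "found", uid[who]; else print who, "not found"
    n = 0; for (k in uid) n++
    print "entries:", n                 # still 2: the in test created nothing
    if (uid[who] == "") print "empty"   # a plain reference creates the element
    n = 0; for (k in uid) n++
    print "entries:", n                 # now 3
}')
printf '%s\n' "$out"
```

Testing with in keeps the array free of accidental empty entries, which matters when the array is later traversed with for.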

2. The cut command

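The cut command cuts bytes, characters, or fields from each line of a file and writes them to standard output.
If you do not specify the File parameter, the cut command reads standard input. One of the -b, -c, or -f flags must be specified.
Cut parameters:

-b: split by bytes. These byte positions ignore multibyte character boundaries unless the -n flag is also specified.
-c: split by characters.
-d: custom delimiter; the default is the tab character.
-f: used together with -d to specify which fields to display.
-n: do not split multibyte characters. Used only with the -b flag. If the last byte of a character falls within the range indicated by the List parameter of the -b flag, the character is written; otherwise, the character is excluded.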
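
A short sketch of the flags in use, run against a sample passwd-style line (typed in here as an assumption rather than read from the system):

```shell
line='root:x:0:0:root:/root:/bin/bash'
f1=$(printf '%s\n' "$line" | cut -d: -f1)        # first field
f17=$(printf '%s\n' "$line" | cut -d: -f1,7)     # fields 1 and 7, joined by the delimiter
c14=$(printf '%s\n' "$line" | cut -c1-4)         # characters 1 to 4
b68=$(printf '%s\n' "$line" | cut -b6-8)         # bytes 6 to 8
printf '%s\n' "$f1" "$f17" "$c14" "$b68"
```

Unlike awk, cut accepts only a single-character delimiter and keeps that delimiter between the selected fields.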

3. Where Linux records user activity (operation records)

① the history command (history -c clears the in-memory history list)
② ~/.bash_history
③ /var/log/secure
④ /var/log/lastlog
⑤ /var/log/wtmp
