Shell Programming (Seven)

One, Introduction to awk
1. awk overview
2. What can awk do?
Two, Using awk
1. Command-line mode
1) Syntax
2) Common options
3) The command section
2. Script mode
1) Writing a script
2) Running a script
Three, awk built-in variables
1. Common built-in variables: examples
2. Built-in separator variables: examples
Four, How awk works
Five, Advanced awk usage
1. Formatted output: print and printf
2. Defining variables in awk
3. Using BEGIN ... END in awk
1) Example 1
2) Example 2
4. Combining awk with regular expressions
1) Examples
5. Workshop
6. awk script programming
1) Flow control statements
2) Loops
7. awk arithmetic
Six, awk statistics cases
1. Counting the login shells used on the system
2. Counting website connection states
3. Counting visits to the website per IP
4. Counting PV in website logs

One, Introduction to awk

1. awk overview

  • awk is a programming language used mainly for text and data processing on Linux/UNIX, where it is a standard tool. Its input can come from standard input, from one or more files, or from the output of other commands.
  • How awk processes text and data: it scans the input line by line, by default from the first line to the last, looks for lines that match a given pattern, and performs the actions you specify on those lines.
  • The name awk comes from the initials of the surnames of its three authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
  • gawk is the GNU version of awk; it adds a number of GNU extensions to the Bell Labs version.

  • The examples below use GNU gawk. On Linux systems awk is usually a link to gawk, so everything below is simply referred to as awk.

2. What can awk do?

  1. awk is a tool for processing files and data on UNIX-like systems, and it is also a programming language.
  2. It can be used for statistics, such as counting site visits or the number of requests per IP.
  3. It supports conditionals as well as for and while loops.

Two, Using awk

1. Command-line mode

1) Syntax

awk options 'commands' filename

Note:
To reference a shell variable inside the awk command, enclose it in double quotes.

2) Common options

  • -F  defines the field separator; the default separator is whitespace
  • -v  defines a variable and assigns it a value

3) The command section

  • Regular expressions and address ranges
'/root/{awk statements}'              in sed: '/root/p'
'NR==1,NR==5{awk statements}'         in sed: '1,5p'
'/^root/,/^ftp/{awk statements}'      in sed: '/^root/,/^ftp/p'
  • {awk statement 1; awk statement 2; ...}
'{print $0;print $1}'       in sed: 'p'
'NR==5{print $0}'           in sed: '5p'
Note: awk statements are separated by semicolons
  • BEGIN...END...
'BEGIN{awk statements};{main processing};END{awk statements}'
'BEGIN{awk statements};{main processing}'
'{main processing};END{awk statements}'
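
As a small illustration of the pattern{action} and BEGIN...END forms listed above, here is a sketch that runs on inline sample data rather than a real file. The three sample lines and the function name show_nologin are made up for this example.

```shell
# Hypothetical sample data in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin'

# BEGIN runs once before any input is read, the middle rule runs for every
# line matching /nologin$/, and END runs once after the last line.
show_nologin() {
    awk -F: 'BEGIN{print "start"}; /nologin$/{print NR, $1}; END{print "end"}'
}

printf '%s\n' "$sample" | show_nologin
# start
# 2 bin
# 3 ftp
# end
```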

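2. Script mode

1) Writing a script

#!/bin/awk -f       the shebang line
Below the shebang, list the commands that would otherwise go inside the quotes on the command line. Do not wrap them in quotes; separate multiple commands with semicolons or newlines.
BEGIN{FS=":"}
NR==1,NR==3{print $1"\t"$NF}
...

2) Running a script

Method 1:
awk options -f awk_script_file  text_file_to_process
awk -f awk.sh filename

sed -f sed.sh -i filename

Method 2 (the script must be executable, e.g. chmod +x awk.sh):
./awk_script_file (or its absolute path)  text_file_to_process
./awk.sh filename

./sed.sh filename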
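
To make Method 1 concrete, the following sketch writes a tiny awk program to a file and runs it with -f. The file names users.txt and first3.awk and the temporary directory are assumptions chosen for the example, not names used elsewhere in this article.

```shell
# Work in a throwaway directory so nothing on the system is touched.
workdir=$(mktemp -d)

# Hypothetical input file in /etc/passwd format.
cat > "$workdir/users.txt" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
EOF

# The awk program lives in its own file; no shell quoting is needed inside it.
cat > "$workdir/first3.awk" <<'EOF'
BEGIN { FS = ":" }
NR == 1, NR == 3 { print $1 "\t" $NF }
EOF

# Method 1: pass the program file to awk with -f.
script_output=$(awk -f "$workdir/first3.awk" "$workdir/users.txt")
printf '%s\n' "$script_output"

rm -rf "$workdir"
```

Only the first three lines are printed, each as the user name, a tab, and the login shell.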

Three, awk built-in variables

Variable         Description                                                      Example
$0               The entire current record (line)
$1,$2,$3...$n    The fields of the current line, split on the field separator    awk -F: '{print $1,$3}'
NF               Number of fields (columns) in the current record                awk -F: '{print NF}'
$NF              The last field; $(NF-1) is the second-to-last field
FNR/NR           Record (line) number; NR is cumulative, FNR restarts per file
FS               Input field separator                                            'BEGIN{FS=":"};{print $1,$3}'
OFS              Output field separator, a space by default                      'BEGIN{OFS="\t"};{print $1,$3}'
RS               Input record separator, a newline by default                    'BEGIN{RS="\t"};{print $0}'
ORS              Output record separator, a newline by default                   'BEGIN{ORS="\n\n"};{print $1,$3}'
FILENAME         Name of the current input file
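
The difference between NR and FNR only shows up with more than one input file. The sketch below uses two small throwaway files (one.txt and two.txt, both invented for this example) and prints the file name, NR, FNR, NF and $NF for every line.

```shell
workdir=$(mktemp -d)
printf 'a:b:c\nd:e\n' > "$workdir/one.txt"
printf 'x:y:z:w\n'    > "$workdir/two.txt"

# For every line print: file name (directory part stripped), NR, FNR, NF, last field.
show_vars() {
    awk -F: '{ n = split(FILENAME, path, "/"); print path[n], NR, FNR, NF, $NF }' "$@"
}

vars_output=$(show_vars "$workdir/one.txt" "$workdir/two.txt")
printf '%s\n' "$vars_output"
# one.txt 1 1 3 c
# one.txt 2 2 2 e
# two.txt 3 1 4 w

rm -rf "$workdir"
```

NR keeps counting across both files (1, 2, 3) while FNR starts again at 1 for two.txt.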

1. Common built-in variables: examples

# awk -F: '{print $1,$(NF-1)}' 1.txt
# awk -F: '{print $1,$(NF-1),$NF,NF}' 1.txt
# awk '/root/{print $0}' 1.txt
# awk '/root/' 1.txt
# awk -F: '/root/{print $1,$NF}' 1.txt 
root /bin/bash
# awk -F: '/root/{print $0}' 1.txt      
root:x:0:0:root:/root:/bin/bash
# awk 'NR==1,NR==5' 1.txt 
# awk 'NR==1,NR==5{print $0}' 1.txt
# awk 'NR==1,NR==5;/^root/{print $0}' 1.txt 
root:x:0:0:root:/root:/bin/bash
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin

2. Built-in separator variables: examples

FS and OFS:
# awk 'BEGIN{FS=":"};/^root/,/^lp/{print $1,$NF}' 1.txt
# awk -F: 'BEGIN{OFS="\t\t"};/^root/,/^lp/{print $1,$NF}' 1.txt        
root            /bin/bash
bin             /sbin/nologin
daemon          /sbin/nologin
adm             /sbin/nologin
lp              /sbin/nologin
# awk -F: 'BEGIN{OFS="@@@"};/^root/,/^lp/{print $1,$NF}' 1.txt     
root@@@/bin/bash
bin@@@/sbin/nologin
daemon@@@/sbin/nologin
adm@@@/sbin/nologin
lp@@@/sbin/nologin

RS and ORS:
Modify the first 2 lines of the source file, adding tabs and extra content:
vim 1.txt
root:x:0:0:root:/root:/bin/bash hello   world
bin:x:1:1:bin:/bin:/sbin/nologin        test1   test2

# awk 'BEGIN{RS="\t"};{print $0}' 1.txt
# awk 'BEGIN{ORS="\t"};{print $0}' 1.txt
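
The two commands above are shown without output, so here is a sketch of what RS and ORS do on short inline input. The helper names split_on_tabs and join_lines and the sample strings are assumptions for illustration only.

```shell
# RS="\t": records are separated by tabs instead of newlines.
split_on_tabs() { awk 'BEGIN{RS="\t"};{print NR": "$0}'; }

# ORS=" | ": every printed record is followed by " | " instead of a newline.
join_lines() { awk 'BEGIN{ORS=" | "};{print $0}'; }

printf 'hello\tworld\ttest' | split_on_tabs
# 1: hello
# 2: world
# 3: test

printf 'a\nb\nc\n' | join_lines; echo
# a | b | c | 
```

With RS set to a tab, one physical line becomes three records; with ORS changed, three input lines end up on a single output line.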

Four, How awk works

awk -F: '{print $1,$3}' /etc/passwd

  1. awk reads the input one line at a time and assigns the line to the internal variable $0. Each line is also called a record and is terminated by a newline (the default value of RS).

  2. Each line is split on the separator : (whitespace, meaning spaces or tabs, by default) into fields, and each field is stored in a numbered variable, starting from $1.

    Q: How does awk know which character separates the fields?

    A: The internal variable FS determines the field separator. Initially FS is set to a single space, which awk treats as "any run of spaces or tabs".

  3. awk prints the fields with print. The printed fields are separated by a space because there is a comma between $1 and $3. The comma is special: it maps to another internal variable, the output field separator OFS, which defaults to a space.

  4. After awk has processed a line, it reads the next line from the file into $0, overwriting the previous contents, splits the new string into fields, and processes it. This repeats until all lines have been processed.
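
Steps 2 and 3 above can be sketched on a single line of input. The function names are invented for this example; split() is used only to imitate by hand what -F: does automatically.

```shell
# Step 2 done by hand: split() cuts $0 on ":" just as awk -F: does automatically.
manual_split() { awk '{ n = split($0, f, ":"); print f[1], f[3] }'; }

# Step 3: the comma inserts OFS (a space by default);
# without the comma the two fields are simply concatenated.
with_comma()    { awk -F: '{ print $1, $3 }'; }
without_comma() { awk -F: '{ print $1 $3 }'; }

line='root:x:0:0:root:/root:/bin/bash'
echo "$line" | manual_split     # root 0
echo "$line" | with_comma       # root 0
echo "$line" | without_comma    # root0
```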

Five, Advanced awk usage

1. Formatted output: print and printf

The print function      similar to echo "hello world"
# date |awk '{print "Month: "$2 "\nYear: "$NF}'
# awk -F: '{print "username is: " $1 "\t uid is: "$3}' /etc/passwd

The printf function     similar to echo -n
# awk -F: '{printf "%-15s %-10s %-15s\n", $1,$2,$3}'  /etc/passwd
# awk -F: '{printf "|%15s| %10s| %15s|\n", $1,$2,$3}' /etc/passwd
# awk -F: '{printf "|%-15s| %-10s| %-15s|\n", $1,$2,$3}' /etc/passwd

awk 'BEGIN{FS=":"};{printf "%-15s %-15s %-15s\n",$1,$6,$NF}' a.txt

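%s  string type             e.g. %-20s
%d  numeric (integer) type
15  the field is 15 characters wide
-   left-aligns the field; the default is right alignment
printf does not add a newline at the end of the line automatically; add \n yourself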
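
The width and alignment rules are easier to see with visible column borders. This is a minimal sketch on two made-up passwd-style lines; format_users is a hypothetical helper name.

```shell
# %-8s  : string, left-aligned in 8 characters
# %5d   : integer, right-aligned in 5 characters
# %10s  : string, right-aligned in 10 characters
format_users() {
    awk -F: '{ printf "|%-8s|%5d|%10s|\n", $1, $3, $6 }'
}

printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\n' | format_users
# |root    |    0|     /root|
# |bin     |    1|      /bin|
```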

2. Defining variables in awk

# awk -v NUM=3 -F: '{ print $NUM }' /etc/passwd
# awk -v NUM=3 -F: '{ print NUM }' /etc/passwd
# awk -v num=1 'BEGIN{print num}' 
1
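# awk -v num=1 'BEGIN{print $num}' 
(prints an empty line: $num means field number 1, and BEGIN has no input line yet)
Note:
Variables defined in awk are referenced without a leading $; writing $NUM means "the field whose number is stored in NUM"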
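
A common use of -v is handing a shell variable to awk. The sketch below compares it with the double-quote approach mentioned in the syntax note of section Two; uid_limit and both function names are hypothetical.

```shell
uid_limit=2   # hypothetical shell variable

# Preferred: pass the shell value into awk with -v.
names_v() { awk -F: -v limit="$uid_limit" '$3 < limit { print $1 }'; }

# Alternative: close the single quotes and let the shell expand the
# variable inside double quotes, so awk sees the literal program: $3 < 2 { ... }
names_quoted() { awk -F: '$3 < '"$uid_limit"' { print $1 }'; }

printf 'root:x:0:0\nbin:x:1:1\ndaemon:x:2:2\n' | names_v        # root, bin
printf 'root:x:0:0\nbin:x:1:1\ndaemon:x:2:2\n' | names_quoted   # root, bin
```

The -v form is usually easier to read and keeps the awk program independent of shell quoting.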

3. Using BEGIN ... END in awk

① BEGIN: runs before the program starts reading input

② END: runs after all input has been processed

③ Usage: 'BEGIN{before processing};{main processing};END{after processing}'

1) Example 1

Print the last column and the second-to-last column (login shell and home directory)

awk -F: 'BEGIN{ print "Login_shell\t\tLogin_home\n*******************"};{print $NF"\t\t"$(NF-1)};END{print "************************"}' 1.txt

awk 'BEGIN{ FS=":";print "Login_shell\tLogin_home\n*******************"};{print $NF"\t"$(NF-1)};END{print "************************"}' 1.txt

Login_shell     Login_home
************************
/bin/bash       /root
/sbin/nologin       /bin
/sbin/nologin       /sbin
/sbin/nologin       /var/adm
/sbin/nologin       /var/spool/lpd
/bin/bash       /home/redhat
/bin/bash       /home/user01
/sbin/nologin       /var/named
/bin/bash       /home/u01
/bin/bash       /home/YUNWEI
************************************

2) Example 2

Print the username, home directory and login shell from /etc/passwd

u_name      h_dir       shell
***************************

***************************

awk -F: 'BEGIN{OFS="\t\t";print"u_name\t\th_dir\t\tshell\n***************************"};{printf "%-20s %-20s %-20s\n",$1,$(NF-1),$NF};END{print "****************************"}' /etc/passwd

# awk -F: 'BEGIN{print "u_name\t\th_dir\t\tshell" RS "*****************"}  {printf "%-15s %-20s %-20s\n",$1,$(NF-1),$NF}END{print "***************************"}'  /etc/passwd

Formatted output:
echo        corresponds to print
echo -n     corresponds to printf

{printf "%-15s %-20s %-20s\n",$1,$(NF-1),$NF}

4. Combining awk with regular expressions

Operator   Meaning
==         equal to
!=         not equal to
>          greater than
<          less than
>=         greater than or equal to
<=         less than or equal to
~          matches (regular expression)
!~         does not match
!          logical NOT
&&         logical AND
||         logical OR
1) Examples

From the first line through the line starting with lp
awk -F: 'NR==1,/^lp/{print $0 }' passwd  
From line 1 through line 5
awk -F: 'NR==1,NR==5{print $0 }' passwd
From the line starting with lp through line 10
awk -F: '/^lp/,NR==10{print $0 }' passwd 
From the line starting with root through the line starting with lp
awk -F: '/^root/,/^lp/{print $0}' passwd
Print lines starting with root or starting with lp
awk -F: '/^root/ || /^lp/{print $0}' passwd
awk -F: '/^root/;/^lp/{print $0}' passwd
Show lines 5 to 10
awk -F':' 'NR>=5 && NR<=10 {print $0}' /etc/passwd     
awk -F: 'NR<=10 && NR>=5 {print $0}' passwd 

Print the lines among lines 30 to 39 that end with bash:
[root@MissHou shell06]# awk 'NR>=30 && NR<=39 && $0 ~ /bash$/{print $0}' passwd 
stu1:x:500:500::/home/stu1:/bin/bash
yunwei:x:501:501::/home/yunwei:/bin/bash
user01:x:502:502::/home/user01:/bin/bash
user02:x:503:503::/home/user02:/bin/bash
user03:x:504:504::/home/user03:/bin/bash

[root@MissHou shell06]# awk 'NR>=3 && NR<=8 && /bash$/' 1.txt  
stu7:x:1007:1007::/rhome/stu7:/bin/bash
stu8:x:1008:1008::/rhome/stu8:/bin/bash
stu9:x:1009:1009::/rhome/stu9:/bin/bash

Print the lines among lines 1 to 5 that start with root (and, with !~, those that do not)
[root@MissHou shell06]# awk 'NR>=1 && NR<=5 && $0 ~ /^root/{print $0}' 1.txt
root:x:0:0:root:/root:/bin/bash
[root@MissHou shell06]# awk 'NR>=1 && NR<=5 && $0 !~ /^root/{print $0}' 1.txt
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin

Understanding the difference between ; and ||:
[root@MissHou shell06]# awk 'NR>=3 && NR<=8 || /bash$/' 1.txt
[root@MissHou shell06]# awk 'NR>=3 && NR<=8;/bash$/' 1.txt
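
The difference is easiest to see on a tiny input where one line satisfies both conditions. The four sample lines and the function names below are made up for this sketch.

```shell
# One rule with ||: a line is printed once if either condition holds.
or_rule() { awk 'NR>=2 && NR<=3 || /bash$/'; }

# Two rules separated by ;: each rule prints on its own, so a line that
# matches both rules is printed twice.
two_rules() { awk 'NR>=2 && NR<=3; /bash$/'; }

printf '1 sh\n2 bash\n3 sh\n4 bash\n' | or_rule
# 2 bash
# 3 sh
# 4 bash

printf '1 sh\n2 bash\n3 sh\n4 bash\n' | two_rules
# 2 bash
# 2 bash
# 3 sh
# 4 bash
```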

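Print the IP address (these commands assume the older net-tools ifconfig output, where the second line contains "inet addr:" and "Bcast:")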
# ifconfig eth0|awk 'NR>1 {print $2}'|awk -F':' 'NR<2 {print $2}'    
# ifconfig eth0|grep Bcast|awk -F':' '{print $2}'|awk '{print $1}'
# ifconfig eth0|grep Bcast|awk '{print $2}'|awk -F: '{print $2}'

# ifconfig eth0|awk NR==2|awk -F '[ :]+' '{print $4RS$6RS$8}'
# ifconfig eth0|awk -F"[ :]+" '/inet addr:/{print $4}'

5. Workshop

  1. Show the login information of users who can log in to the operating system: match the lines whose 7th column ends with bash and print the whole line (all columns of the current line)
[root@MissHou ~]# awk '/bash$/{print $0}' /etc/passwd
[root@MissHou ~]# awk '/bash$/' /etc/passwd
[root@MissHou ~]# awk -F: '$7 ~ /bash/' /etc/passwd
[root@MissHou ~]# awk -F: '$NF ~ /bash/' /etc/passwd
[root@MissHou ~]# awk -F: '$0 ~ /bash/' /etc/passwd
[root@MissHou ~]# awk -F: '$0 ~ /\/bin\/bash/' /etc/passwd
  2. Show the names of the users who can log in to the system
# awk -F: '$0 ~ /\/bin\/bash/{print $1}' /etc/passwd
  3. Print the UID and username of the regular users on the system
# awk -F: 'BEGIN{print "UID\tUSERNAME"} {if($3>=500 && $3 !=65534 ) {print $3"\t"$1} }' /etc/passwd
UID     USERNAME
500     stu1
501     yunwei
502     user01
503     user02
504     user03

# awk -F: '{if($3 >= 500 && $3 != 65534) print $1,$3}' a.txt 
redhat 508
user01 509
u01 510
YUNWEI 511

6. awk script programming

1) Flow control statements

① The if structure

The shell if statement, for comparison:

if [ xxx ];then
xxx
fi

Format:
awk options 'pattern or address range{awk statements}'  filename

{ if(expression){statement 1;statement 2;...}}

awk -F: '{if($3>=500 && $3<=60000) {print $1,$3} }' passwd

# awk -F: '{if($3==0) {print $1" is an administrator"} }' passwd 
root is an administrator

# awk 'BEGIN{if('$(id -u)'==0) {print "admin"} }'
admin

② The if ... else structure

The shell if...else statement, for comparison:
if [ xxx ];then
    xxxxx

else
    xxx
fi

Format:
{if(expression){statement;statement;...}else{statement;statement;...}}

awk -F: '{ if($3>=500 && $3 != 65534) {print $1" is a regular user"} else {print $1,"is not a regular user"}}' passwd 

awk 'BEGIN{if( '$(id -u)'>=500 && '$(id -u)' !=65534 ) {print "regular user"} else {print "not a regular user"}}'

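③ The if ... else if ... else structure

The shell if...elif...else statement, for comparison:

if [ xxxx ];then
    xxxx
elif [ xxx ];then
    xxx
....
else
...
fi

The awk if...else if...else statement:

Format:
{ if(expression 1){statement;statement;...}else if(expression 2){statement;statement;...}else if(expression 3){statement;statement;...}else{statement;statement;...}}

awk -F: '{ if($3==0) {print $1,": administrator"} else if($3>=1 && $3<=499 || $3==65534 ) {print $1,": system user"} else {print $1,": regular user"}}' /etc/passwd

awk -F: '{ if($3==0) {i++} else if($3>=1 && $3<=499 || $3==65534 ) {j++} else {k++}};END{print "administrators: "i "\nsystem users: "j"\nregular users: "k }' /etc/passwd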

# awk -F: '{if($3==0) {print $1,"is admin"} else if($3>=1 && $3<=499 || $3==65534) {print $1,"is sys users"} else {print $1,"is general user"} }' a.txt 

root is admin
bin is sys users
daemon is sys users
adm is sys users
lp is sys users
redhat is general user
user01 is general user
named is sys users
u01 is general user
YUNWEI is general user

awk -F: '{  if($3==0) {print $1": administrator"} else if($3>=1 && $3<500 || $3==65534 ) {print $1": system user"} else {print $1": regular user"}}'   /etc/passwd

awk -F: '{if($3==0) {i++} else if($3>=1 && $3<500 || $3==65534){j++} else {k++}};END{print "administrators: " i RS "system users: "j RS "regular users: "k }' /etc/passwd
administrators: 1
system users: 28
regular users: 27

# awk -F: '{ if($3==0) {print $1": administrator"} else if($3>=500 && $3!=65534) {print $1": regular user"} else {print $1": system user"}}' passwd 

awk -F: '{if($3==0){i++} else if($3>=500){k++} else{j++}} END{print i; print k; print j}' /etc/passwd

awk -F: '{if($3==0){i++} else if($3>999){k++} else{j++}} END{print "administrators: "i; print "regular users: "k; print "system users: "j}' /etc/passwd 

For regular users print the default shell; for system users print the username
# awk -F: '{if($3>=1 && $3<500 || $3 == 65534) {print $1} else if($3>=500 && $3<=60000 ) {print $NF} }' /etc/passwd
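
One detail the counting one-liners above leave open: a counter that is never incremented prints as an empty string. The sketch below uses the same UID thresholds as the article (500 and 65534) on made-up sample lines and adds 0 to each counter so that an unused one prints as 0. The function and variable names are hypothetical.

```shell
count_user_types() {
    awk -F: '{
        if ($3 == 0) { admin++ }
        else if ($3 < 500 || $3 == 65534) { sys++ }
        else { regular++ }
    }
    END {
        # Adding 0 forces a numeric value, so an unused counter prints 0.
        print "administrators: " (admin + 0)
        print "system users: " (sys + 0)
        print "regular users: " (regular + 0)
    }'
}

printf 'root:x:0:0\nbin:x:1:1\nnobody:x:65534:65534\nalice:x:1000:1000\n' | count_user_types
# administrators: 1
# system users: 2
# regular users: 1
```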

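2) Loops

① The for loop

Print 1 to 5
for ((i=1;i<=5;i++));do echo $i;done

# awk 'BEGIN { for(i=1;i<=5;i++) {print i} }'
Print the odd numbers from 1 to 10
# awk 'BEGIN{ for(i=1;i<=10;i+=2) {print i} }'
# awk 'BEGIN{ for(i=1;i<=10;i+=2) print i }'
The same loop written in the shell and piped into awk, which sums the odd numbers (prints 25)
# for ((i=1;i<=10;i+=2));do echo $i;done|awk '{sum+=$0};END{print sum}'

Compute the sum of 1 to 5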
# awk 'BEGIN{sum=0;for(i=1;i<=5;i++) sum+=i;print sum}'
# awk 'BEGIN{for(i=1;i<=5;i++) (sum+=i);{print sum}}'
# awk 'BEGIN{for(i=1;i<=5;i++) (sum+=i);print sum}'

② The while loop

Print 1 to 5
# i=1;while (($i<=5));do echo $i;let i++;done

# awk 'BEGIN { i=1;while(i<=5) {print i;i++} }'
Print the odd numbers from 1 to 10
# awk 'BEGIN{i=1;while(i<=10) {print i;i+=2} }'
Compute the sum of 1 to 5
# awk 'BEGIN{i=1;sum=0;while(i<=5) {sum+=i;i++}; print sum }'
# awk 'BEGIN {i=1;while(i<=5) {sum+=i;i++};print sum }'

③ Nested loops

Nested loops in the shell, followed by the awk equivalents:
#!/bin/bash
for ((y=1;y<=5;y++))
do
    for ((x=1;x<=$y;x++))
    do
        echo -n $x  
    done
echo
done

awk 'BEGIN{ for(y=1;y<=5;y++) {for(x=1;x<=y;x++) {printf x} ;print } }'

# awk 'BEGIN { for(y=1;y<=5;y++) { for(x=1;x<=y;x++) {printf x};print} }'
1
12
123
1234
12345

# awk 'BEGIN{ y=1;while(y<=5) { for(x=1;x<=y;x++) {printf x};y++;print}}'
1
12
123
1234
12345

Try printing the 9x9 multiplication table in several ways:
# awk 'BEGIN{for(y=1;y<=9;y++) { for(x=1;x<=y;x++) {printf x"*"y"="x*y"\t"};print} }'

# awk 'BEGIN{for(y=1;y<=9;y++) { for(x=1;x<=y;x++) printf x"*"y"="x*y"\t";print} }'
# awk 'BEGIN{i=1;while(i<=9){for(j=1;j<=i;j++) {printf j"*"i"="j*i"\t"};print;i++ }}'

# awk 'BEGIN{for(i=1;i<=9;i++){j=1;while(j<=i) {printf j"*"i"="i*j"\t";j++};print}}'
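
The commands above pass the data itself to printf as the format string, which works here because the text contains no % characters. A variant with an explicit format string might look like the sketch below; the function name times_table and its size argument are assumptions made for this example.

```shell
# Print an n x n multiplication triangle; $1 is the size.
times_table() {
    awk -v n="$1" 'BEGIN {
        for (y = 1; y <= n; y++) {
            for (x = 1; x <= y; x++)
                # A tab goes before every entry except the first on a line.
                printf "%s%d*%d=%d", (x > 1 ? "\t" : ""), x, y, x * y
            print ""
        }
    }'
}

times_table 3
```

times_table 9 prints the full table; a smaller size is used here only to keep the output short.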

Loop control:
break       leaves the loop when the condition is met
continue    skips the rest of the current iteration when the condition is met
# awk 'BEGIN{for(i=1;i<=5;i++) {if(i==3) break;print i} }'
1
2
# awk 'BEGIN{for(i=1;i<=5;i++){if(i==3) continue;print i}}'
1
2
4
5

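7. awk arithmetic

+ - * / %(modulo) ^(exponentiation, e.g. 2^3)
Calculations can also be performed in patterns; awk carries out all arithmetic in floating point. gawk also accepts ** for exponentiation; ^ is the portable form.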
# awk 'BEGIN{print 1+1}'
# awk 'BEGIN{print 1**1}'
# awk 'BEGIN{print 2**3}'
# awk 'BEGIN{print 2/3}'
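
A few more operations in one place, as a sketch of what floating-point arithmetic means in practice. The variable name arith_output is arbitrary.

```shell
arith_output=$(awk 'BEGIN {
    print 2 / 3             # floating-point division, printed with 6 significant digits
    print int(7 / 2)        # int() truncates toward zero
    print 7 % 3             # remainder
    print 2 ^ 3             # exponentiation
    printf "%.2f\n", 2 / 3  # printf controls the number of decimals
}')
printf '%s\n' "$arith_output"
# 0.666667
# 3
# 1
# 8
# 0.67
```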

Six, awk statistics cases

1. Counting the login shells used on the system

# awk -F: '{ shells[$NF]++ };END{for (i in shells) {print i,shells[i]} }' /etc/passwd

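How it works: shells is an associative array indexed by the last field ($NF). An element that does not exist yet counts as 0, so books["linux"]++ turns books["linux"] into 1, and a second key such as books["php"] gets its own independent counter. In the same way, each input line increments the element for its shell:
shells["/bin/bash"]++
shells["/sbin/nologin"]++
shells["/sbin/shutdown"]++

In END, for (i in shells) walks over the keys, so i is the shell path and shells[i] is its count. Sample output:

/bin/bash 5
/sbin/nologin 6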
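
The counting idiom can be tried on a few inline lines. The sample data and the name count_shells are made up; the output is piped through sort because awk does not guarantee any particular order for for (key in array).

```shell
count_shells() {
    awk -F: '{ shells[$NF]++ } END { for (s in shells) print s, shells[s] }' | sort
}

printf 'root:x:0:0::/root:/bin/bash\nbin:x:1:1::/bin:/sbin/nologin\nalice:x:1000:1000::/home/alice:/bin/bash\n' | count_shells
# /bin/bash 2
# /sbin/nologin 1
```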

2. Counting website connection states

# ss -antp|grep 80|awk '{states[$1]++};END{for(i in states){print i,states[i]}}'
TIME_WAIT 578
ESTABLISHED 1
LISTEN 1

# ss -an |grep :80 |awk '{states[$2]++};END{for(i in states){print i,states[i]}}'
LISTEN 1
ESTAB 5
TIME-WAIT 25

# ss -an |grep :80 |awk '{states[$2]++};END{for(i in states){print i,states[i]}}' |sort -k2 -rn
TIME-WAIT 18
ESTAB 8
LISTEN 1

3. Counting visits to the website per IP

# netstat -ant |grep :80 |awk -F: '{ip_count[$8]++};END{for(i in ip_count){print i,ip_count[i]} }' |sort

# ss -an |grep :80 |awk -F":" '!/LISTEN/{ip_count[$(NF-1)]++};END{for(i in ip_count){print i,ip_count[i]}}' |sort -k2 -rn |head

4. Counting PV in website logs

Count the PV for one day in the Apache/Nginx log
# grep '27/Jul/2017' mysqladmin.cc-access_log |wc -l
14519

Count the visits per IP for one day in the Apache/Nginx log
# grep '27/Jul/2017' mysqladmin.cc-access_log |awk '{ips[$1]++};END{for(i in ips){print i,ips[i]} }' |sort -k2 -rn |head

# grep '07/Aug/2017' access.log |awk '{ips[$1]++};END{for(i in ips){print i,ips[i]} }' |awk '$2>100' |sort -k2 -rn
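
The two pipelines above can be folded into one small function and tried on sample data. This is only a sketch: the four log lines are invented (in the common Apache/Nginx access-log layout, with the client IP in the first field), and top_ips and its two arguments are hypothetical names.

```shell
# Hypothetical access-log lines; the client IP is the first field.
log_sample='10.0.0.1 - - [27/Jul/2017:10:00:01 +0800] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [27/Jul/2017:10:00:02 +0800] "GET /a HTTP/1.1" 200 128
10.0.0.1 - - [27/Jul/2017:10:00:03 +0800] "GET /b HTTP/1.1" 200 64
10.0.0.1 - - [28/Jul/2017:09:00:00 +0800] "GET / HTTP/1.1" 200 512'

# $1 = date string to select, $2 = minimum number of hits; the log is read from stdin.
top_ips() {
    grep -F -- "$1" |
        awk -v min="$2" '{ ips[$1]++ } END { for (ip in ips) if (ips[ip] >= min) print ip, ips[ip] }' |
        sort -k2 -rn
}

printf '%s\n' "$log_sample" | top_ips '27/Jul/2017' 1
# 10.0.0.1 2
# 10.0.0.2 1
```

Raising the second argument to 2 keeps only 10.0.0.1, which mirrors the awk '$2>100' filter in the command above.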

Glossary:

Page views (PV)
Term: PV = Page View (page traffic)
Description: the number of pages viewed, used to measure how many pages of the site users visit. Repeated views of the same page are counted cumulatively: every time a user opens a page, 1 PV is recorded.

Visits (VV)
Term: VV = Visit View (number of visits)
Description: the period from a visitor arriving at your site until they close all of its pages and leave counts as 1 visit. If a visitor does not open or refresh a page for 30 minutes, or closes the browser, the visit is considered finished.

Unique visitors (UV)
Term: UV = Unique Visitor
Description: within 1 day, the same visitor is counted as only 1 UV no matter how many times they visit the site.

Independent IPs (IP)
Term: IP = number of independent IP addresses
Description: the number of distinct IP addresses used to access the site within one day. No matter how many pages the same IP visits, it counts as 1 independent IP.
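
Two of these metrics, PV and independent IPs, can be read directly from an access log. The sketch below assumes the input has already been filtered to a single day and that the client IP is the first field; the sample lines and the name pv_and_ips are made up. UV and VV need visitor or session information that a plain IP column does not provide, so they are not computed here.

```shell
pv_and_ips() {
    awk '{
        pv++                                           # every log line is one page view
        if (!($1 in seen)) { seen[$1] = 1; distinct++ }   # first time this IP appears
    }
    END {
        print "PV:", pv + 0
        print "IP:", distinct + 0
    }'
}

printf '10.0.0.1 GET /\n10.0.0.2 GET /a\n10.0.0.1 GET /b\n10.0.0.1 GET /\n' | pv_and_ips
# PV: 4
# IP: 2
```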

------------ End of article, thanks for reading --------------

Origin blog.51cto.com/14157628/2472574