linux-awk-3

awk

Basic syntax
awk -Fs '/pattern/ {action}' input-file
(or)
awk -Fs '{action}' input-file

-F sets the field separator (s above). If it is not specified, fields are split on runs of whitespace (spaces and tabs).
/pattern/ and {action} must be enclosed in single quotes.
/pattern/ is optional. If it is omitted, awk processes every record in the input file. If a pattern is given, awk only processes the records that match it.
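
A minimal sketch of the two forms above. The passwd-style sample data and the shell variable names are made up for illustration; they are not from the original notes.

```shell
# Hypothetical passwd-style sample data
data='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# With a pattern: only records matching /^root/ reach the action
with_pattern=$(printf '%s\n' "$data" | awk -F: '/^root/ {print $1, $7}')

# Without a pattern: the action runs for every record
no_pattern=$(printf '%s\n' "$data" | awk -F: '{print $1}')

printf '%s\n' "$with_pattern" "$no_pattern"
```

The first command prints only the root record; the second prints the first field of every record.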

Awk program structure (BEGIN, body, END regions)

BEGIN region
Syntax of the BEGIN region:
BEGIN {awk-commands}
Commands in the BEGIN region are executed only once, at the start, before awk runs the body region.
The BEGIN region is well suited to printing report headers and to initializing variables.
The BEGIN region may contain one or more awk commands.
The keyword BEGIN must be uppercase.
The BEGIN region is optional.

BODY region
/pattern/ {action}
The body is executed once for every line (record) read from the input.

END region
END {awk-commands}
Commands in the END region are executed only once, after all input has been read.

    awk  -F ":"  '/^root/{print }' passwd 
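
The three regions can be seen together in one small program. This is only a sketch: the name:score input and the variable name total are assumptions chosen for the example.

```shell
# Hypothetical input: three colon-separated records
report=$(printf 'alice:10\nbob:20\ncarol:30\n' | awk -F: '
    BEGIN { print "name score"; total = 0 }   # runs once, before any input
    { print $1, $2; total += $2 }             # runs for every record
    END { print "total", total }              # runs once, after all input
')
printf '%s\n' "$report"
```

The header comes from BEGIN, one line per record comes from the body, and the sum is printed by END.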
    

Built-in variables
awk 'BEGIN {FS=","} {print $2, $3}' employee.txt
awk 'BEGIN {print "test1", "test2"}'
Without the comma, awk does not insert OFS, so the values are printed with nothing between them.
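
The effect of the comma is easier to see when OFS is set to something visible. A small sketch, with OFS set to "-" purely for illustration:

```shell
# The comma makes print join its arguments with OFS (set to "-" here so it shows)
with_comma=$(awk 'BEGIN {OFS="-"; print "test1", "test2"}')

# Without the comma the two strings are simply concatenated and OFS is not used
no_comma=$(awk 'BEGIN {OFS="-"; print "test1" "test2"}')

printf '%s\n' "$with_comma" "$no_comma"
```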

$ gawk 'BEGIN {print "Hello World!"} {print $0} END {print "byebye"}' data1
Built-in variables
$0 the entire record
$1 the first data field of the record
$2 the second data field of the record
$n the n-th data field of the record
FIELDWIDTHS a space-separated list of numbers giving the exact width of each field (gawk)
FS input field separator
RS input record separator
OFS output field separator
ORS output record separator

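    ARGC number of command-line arguments
    ARGIND index in ARGV of the file currently being processed (gawk)
    ARGV array containing the command-line arguments
    CONVFMT conversion format for numbers (see the printf statement); default %.6g
    ENVIRON associative array of the current shell environment variables and their values
    ERRNO a string describing the system error when reading or closing an input file fails (gawk)
    FILENAME name of the data file currently being read as input
    FNR record number within the current data file
    IGNORECASE when set to a non-zero value, string and regex matching ignores case (gawk)
    NF number of fields in the current record
    NR number of input records processed so far
    OFMT output format for numbers; default %.6g
    RLENGTH length of the substring matched by the match function
    RSTART start position of the substring matched by the match function
    
   Examples:
   Number of command-line arguments
   awk '{print ARGC}' /etc/fstab /etc/inittab
   Each command-line argument
   awk 'BEGIN {print ARGV[0]}' /etc/fstab /etc/inittab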
   

awk '{print FILENAME, "record number is",NR,"FNR IS" ,FNR }' awk passwd
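
The difference between NR and FNR only shows up with more than one input file. A minimal sketch, assuming two throwaway files created with mktemp (the file names and contents are hypothetical):

```shell
# Two small temporary files stand in for real input
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\nd\n' > "$tmpdir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file
counts=$(awk '{print NR, FNR, $0}' "$tmpdir/one.txt" "$tmpdir/two.txt")
printf '%s\n' "$counts"

rm -rf "$tmpdir"
```

In the second file NR continues with 3 and 4 while FNR goes back to 1 and 2.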

Variables

Awk variable names begin with a letter or underscore; subsequent characters can be letters, digits, or underscores. Awk keywords cannot be used as variable names.
Awk variables can be used directly without being declared first. If a variable needs an initial value, set it in the BEGIN region, which is executed only once.

Custom variables
Defined with the -v option, or assigned directly in the program.
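
Both ways of defining a variable can be sketched as follows. The names threshold, n and sum, and the input numbers, are assumptions made for the example.

```shell
# threshold is passed in from the shell with -v
big=$(printf '5\n15\n25\n' | awk -v threshold=10 '$1 > threshold {n++} END {print n}')

# sum is assigned directly inside the program, in the BEGIN region
sum=$(printf '5\n15\n25\n' | awk 'BEGIN {sum = 0} {sum += $1} END {print sum}')

printf '%s\n' "$big" "$sum"
```

Note that n is never initialized: it starts out empty and behaves as 0 the first time it is incremented.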

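  printf: formatted output
  Formatted output: printf "FORMAT", item1, item2, ...
   (1) FORMAT is required
   (2) No newline is added automatically; write the newline explicitly with \n
   (3) FORMAT needs one format specifier for each item that follows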
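
The three rules above can be checked with a short sketch. The name:uid sample records are hypothetical.

```shell
# Rule (2): without \n in the format, output from successive records runs together
joined=$(printf 'a\nb\n' | awk '{printf "%s", $1}')

# Rule (3): one specifier per item; %-6s left-justifies in 6 columns, %03d zero-pads to 3 digits
formatted=$(printf 'root:0\nbin:1\n' | awk -F: '{printf "%-6s|%03d\n", $1, $2}')

printf '%s\n' "$joined" "$formatted"
```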
    
    
        

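Unary operators
Operator       Description
+              unary plus; returns the number itself
-              negation
++             increment
--             decrement
 
 Arithmetic operators
 
 Operator      Description
 +             addition
 -             subtraction
 *             multiplication
 /             division
 %             modulo (remainder)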
 
 awk  'NR%2 == 0 {print  NR,$0}' passwd
 
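 String operators
 
    Assignment operators
    Operator          Description
    =                 assign
    +=                add and assign
    -=                subtract and assign
    *=                multiply and assign
    /=                divide and assign
    %=                modulo and assign
          
    Comparison and logical operators
    
    >                 greater than
    >=                greater than or equal to
    <                 less than
    <=                less than or equal to
    ==                equal to
    !=                not equal to
    &&                logical AND
    ||                logical OR
    
    
    Regular expression operators
    
    Operator          Description
    ~                 matches
    !~                does not match
    
   awk  -F:  '$1~"ro"'  passwd        # the first field contains "ro"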

    $ awk  'BEGIN   { FS=":";print  "begin  test"   }   {print  $1}   END  {print   "itis  end "}  '   passwd

Match operator
$1 ~ /^data/
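
A short sketch of ~ and !~ applied to a single field. The three sample records are made up for illustration.

```shell
# Hypothetical colon-separated records
data='root:x:0
proxy:x:13
daemon:x:1'

# $1 ~ /ro/   : the first field contains "ro" anywhere
contains=$(printf '%s\n' "$data" | awk -F: '$1 ~ /ro/ {print $1}')

# $1 ~ /^ro/  : the first field starts with "ro"
starts=$(printf '%s\n' "$data" | awk -F: '$1 ~ /^ro/ {print $1}')

# $1 !~ /ro/  : the first field does not contain "ro"
lacks=$(printf '%s\n' "$data" | awk -F: '$1 !~ /ro/ {print $1}')

printf '%s\n' "$contains" "$starts" "$lacks"
```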

gawk -F: '$4 == 0{print $1}' /etc/passwd

Line ranges
awk -F: '/^root\>/,/^nobody\>/ {print $1}'  /etc/passwd
awk -F: '(NR>=10&&NR<=20){print NR,$1}' /etc/passwd  (the parentheses are optional)
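
The two kinds of range can be tried without touching /etc/passwd. In this sketch the seq output and the start/end marker lines are stand-ins chosen for the example.

```shell
# Range by record number: both comparisons must name NR explicitly
by_number=$(seq 10 | awk 'NR>=3 && NR<=5 {print NR, $0}')

# Range by pattern: printing starts at the line matching /^start/ and stops after the line matching /^end/
by_pattern=$(printf 'a\nstart\nb\nend\nc\n' | awk '/^start/,/^end/ {print}')

printf '%s\n' "$by_number" "$by_pattern"
```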
   


awk control structures
   if
   Single-line form
   if (conditional-expression) {statements; ...}
  
   Multiple statements
    if (conditional-expression)
    {
        action1;  # executed in order
        action2;
    }
       
    if else
    if (conditional-expression)
        action1
    else
        action2
        
        if (condition) {statements; ...} else {statements; ...}
        
        
     Ternary operator
     conditional-expression ? action1 : action2 ;
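
As a sketch, the same classification written once with if/else and once with the ternary operator. The input numbers and the big/small labels are assumptions for the example.

```shell
# if/else form
with_if=$(printf '3\n12\n' | awk '{ if ($1 > 10) print $1, "big"; else print $1, "small" }')

# ternary form: the expression yields one of the two strings
with_ternary=$(printf '3\n12\n' | awk '{ print $1, ($1 > 10 ? "big" : "small") }')

printf '%s\n' "$with_if" "$with_ternary"
```

Both commands produce the same two lines; the ternary form is handy when the choice is only between two values.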
     
     while
     while (condition)
     {
         actions
     }


            while (condition) {statements; ...}
            
      do-while

      do
      {
          action
      }
      while (condition)
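
The practical difference between while and do-while is when the condition is tested. A minimal sketch with a condition that is false from the start (the variable i and the printed labels are illustrative):

```shell
# while tests the condition first, so with i = 5 the body never runs
while_out=$(awk 'BEGIN {
    i = 5
    while (i < 5) { print "while", i; i++ }
    print "done"
}')

# do-while tests the condition after the body, so the body runs once
do_out=$(awk 'BEGIN {
    i = 5
    do { print "do", i; i++ } while (i < 5)
    print "done"
}')

printf '%s\n' "$while_out" "$do_out"
```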

        for 
        
        for(initialization;condition;increment/decrement)
            for(expr1;expr2;expr3) {statements;…}

            if-then-else statement:
         if (condition) statement1; else statement2
        while statement:
         while (condition)
         {
         statements
         }
        do-while statement:
         do {
         statements
         } while (condition)
        for statement:
         for (variable assignment; condition; iteration process) 
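   Examples
  seq  10 | awk  'i=0{print  $0}'     # i=0 evaluates to 0 (false), so nothing is printed
  seq  10 | awk  'i=1{print  $0}'     # i=1 is true, so every line is printed; this has nothing to do with the braces
  seq  10 | awk  'i=!i{print i, $0}'   # i starts out unset, so !i is true (1) and the line is printed; on the next line it is false (0) and nothing is printed, so only odd lines are printed
  seq  10 | awk  '!(i=!i){print i, $0}'  # the inverse of the above: only even lines are printed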
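
The toggle trick works because an assignment used as a pattern is judged by the value it assigns. A reduced sketch with no action block, so awk falls back to printing the whole line:

```shell
# i=!i flips i between 1 and 0 on every record; the line is printed when the result is 1
odd=$(seq 6 | awk 'i=!i')

# Negating the result selects the other half of the lines
even=$(seq 6 | awk '!(i=!i)')

printf '%s\n' "$odd" "$even"
```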
  
  Show disk usage for /dev/sd* devices that are more than 10% full
   df  -h  | awk  -F "[[:space:]]+|%"  '/^\/dev\/sd/{  if ($5>10)  print  $1, $5}'
  
   awk  '/^[[:space:]]*linux16/  {i=1;while (i<= NF) {print $i,length($i);i++} }'  /boot/grub2/grub.cfg
   
   for
   
   for(variable assignment;condition;iteration process)
      {for-body}
   
    awk  'BEGIN{wkd["mo"]="monday";wkd["fr"]="friday";wkd["sat"]="saturday" ; for( i  in wkd ){ print i,wkd[i]}}'       
   
   
   awk  'BEGIN{sum=0; for (i=1;i<=100;i++){ sum+=i}  print sum  }'
   
    
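    next:
    Stops processing the current line early and moves straight on to the next line (the next iteration of awk's own implicit loop over the input)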
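
A typical use of next is to throw away lines that should not reach later rules. A sketch, assuming config-like input in which lines starting with # are comments (the data is hypothetical):

```shell
kept=$(printf '# comment\nkey=1\n# another\nkey=2\n' | awk '
    /^#/ { next }          # stop here for this record and read the next one
    { print NR ": " $0 }   # only reached by records that were not skipped
')
printf '%s\n' "$kept"
```

NR still counts the skipped lines, which is why the surviving records are numbered 2 and 4.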
    
    
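    Arrays
    
    array[index-expression]
     index-expression:
(1) Any string can be used; string literals must be enclosed in double quotes
(2) If an array element does not exist yet, awk creates it automatically when it is referenced and initializes its value to the empty string
(3) To check whether an element exists in the array, use the "index in array" form; the same form in a for loop iterates over the indices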
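
Points (2) and (3) interact: testing with in does not create an element, but referencing it does. A small sketch, reusing the array name wkd from these notes; the other variable names are illustrative.

```shell
result=$(awk 'BEGIN {
    wkd["mo"] = "monday"
    # (index in array) tests membership without creating the element
    before = ("fr" in wkd) ? "yes" : "no"
    # Merely referencing wkd["fr"] creates it with an empty value
    x = wkd["fr"]
    after = ("fr" in wkd) ? "yes" : "no"
    print before, after, length(wkd["fr"])
}')
printf '%s\n' "$result"
```

This is why an existence check should use in rather than comparing array[index] with the empty string.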

Count connections by TCP state
netstat -tan | awk '/^tcp/ {state[$NF]++} END {for (i in state) {print i, state[i]}}'

Take the top ten IPs from access.log and add them to the firewall

awk  '{ip[$1]++} END{for (i in  ip ){print  i, "connections " ip[i]}} '  access_log   | sort  -nr  -k 3   | head 

 Add to the iptables firewall
 
 iptables -A INPUT -s IP -j REJECT
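
One possible way to connect the two steps is to let awk print the iptables commands instead of running them, so they can be reviewed first. This is only a sketch: the four-line log is invented, and it assumes the client IP is the first field, as in the command above.

```shell
# Hypothetical access log: the first field is the client IP
log='10.0.0.1 GET /
10.0.0.2 GET /
10.0.0.1 GET /a
10.0.0.1 GET /b'

# Count requests per IP, sort by count, keep the top ten,
# and print (not execute) one iptables command per IP
cmds=$(printf '%s\n' "$log" |
    awk '{ip[$1]++} END {for (i in ip) print ip[i], i}' |
    sort -nr | head -n 10 |
    awk '{print "iptables -A INPUT -s " $2 " -j REJECT"}')

printf '%s\n' "$cmds"
```

Putting the count in the first column lets sort -nr work without a -k option. After checking the output, the commands could be piped to sh by someone with root privileges.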
    


Top ten IPs connected to this host

 awk '{split($5,ip,":");count[ip[1]]++;print ip[1],"connections" , count[ip[1]]}'  ss.log |  sort  -nr  -k 3 | head 

 awk   -F "[[:space:]]+|:"  '{ ip[$6]++}END{for(i in ip) { print "summary", i,"links ",  ip[i] } }   '  ss.log  | sort  -nr  -k4     


Count the IPs in the log, using only lines that start with a digit
 awk   '/^[0-9]/ {ip[$1]++ }  END{for (i  in ip )  print  i,ip[i]  } '  access_log
 
 Add IPs with more than 100 connections to the firewall
    while true; do
        awk '/^[0-9]/ {ip[$1]++} END {for (i in ip) {if (ip[i]>100) print i}}' access_log |
        while read -r line; do
            iptables -A INPUT -s "$line" -j REJECT
        done
        sleep 10
    done
 
 

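Generating random numbers
awk 'BEGIN {srand(); for (i = 1; i <= 10; i++) {print rand()}}'

String operations
• length([s]): returns the length of the given string
• sub(r,s,[t]): searches string t for the pattern r and replaces the first match with s
• gsub(r,s,[t]): like sub, but replaces every match
echo "2008:08:08 08:08:08" | awk 'gsub(/:/,"-",$0)'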
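
To compare the functions side by side, here is a sketch that runs sub, gsub and length on the same timestamp. The shell variable names are illustrative.

```shell
stamp='2008:08:08 08:08:08'

# sub replaces only the first match in the record
first_only=$(echo "$stamp" | awk '{sub(/:/, "-"); print}')

# gsub replaces every match
all=$(echo "$stamp" | awk '{gsub(/:/, "-"); print}')

# length returns the number of characters in the string
len=$(echo "$stamp" | awk '{print length($0)}')

printf '%s\n' "$first_only" "$all" "$len"
```

When the third argument is left out, sub and gsub work on $0.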

Exercises:
           
        1 blog.magedu.com
        2 www.magedu.com
        3 hhhh.magedu.com
        4 dddd.magedu.com
        5 b333.magedu.com
        6 bkkk.magedu.com
        7 ssss.magedu.com
        8 wog.magedu.com
        9 ulog.magedu.com
        Extract the host name
        awk   -F "[ .]"   '{print $2}'  soho.txt   # choose the separator, then pick the field
        
        
        Count how often each filesystem type appears in fstab
        awk  '/^UUID/{fs[$3]++} END{for (i in fs) {print i,fs[i] }  } '  fstab
        
        Count word occurrences in fstab
        grep -wEo  "[[:alpha:]]+"   fstab   |  awk  '{word[$1]++}  END{for (i  in  word)   {print  i,word[i]  }   } '
        
        Extract the digits (single quotes keep the shell from expanding $C)
        echo  'Yd$C@M05MB%9&Bdh7dq+YVixp3vpw'   |  awk  'gsub(/[^[:digit:]]/," ",$0 ) '
         
         Generate random numbers
         awk  'BEGIN{srand(); for (i=1;i<=200;i++)  { if (i==200 ) {printf "%d", int(rand()*100) ;}else   {printf "%d,", int(rand()*100) }} }'
         
         Find the maximum and minimum of the random numbers above
           awk  -F ","    ' { MAX=$1;MIN=$1;  for (i=1;i<=NF;i++) {if  ( $i>= MAX  ) { MAX=$i } ;  if ( $i <= MIN) {  MIN=$i }  }  }  END{ print  "MAX=",MAX, "MIN=" ,MIN } '  soho.txt
      
      
      
        http://mail.magedu.com/index.html
        http://www.magedu.com/test.html
        http://study.magedu.com/index.html
        http://blog.magedu.com/index.html
        http://www.magedu.com/images/logo.jpg

       Extract the fully qualified domain names
       Choose the separator, select the field, then count and print
      awk   -F"/"  '{FQ[$3]++} END{ for(i  in  FQ )   print  i,FQ[i] }'  soho.txt   | sort  -rn  -k 2
     
     
 Worked example:
     
     inode|beginnumber|endnumber|counts|
    106|3363120000|3363129999|10000|
    106|3368560000|3368579999|20000|
    310|3337000000|3337000100|101|
    310|3342950000|3342959999|10000|
    310|3362120960|3362120961|2|
    311|3313460102|3313469999|9898|
    311|3313470000|3313499999|30000|
    311|3362120962|3362120963|2|

    Expected output
    310|3337000000|3362120961|10103|
    311|3313460102|3362120963|39900|
    106|3363120000|3368579999|30000|
    
     
     awk -F'|' -v OFS='|' '/^[0-9]/{inode[$1]++; if(!bn[$1]){bn[$1]=$2} else  if(bn[$1]>$2){bn[$1]=$2}; if(en[$1]<$3)en[$1]=$3;cnt[$1]+=$(NF-1)} END{for(i in inode)print i,bn[i],en[i],cnt[i]}'   soho.txt


Use awk to compute the total size of the files in a directory
find  .     -maxdepth 1   -type  f -ls   | awk  '{sum+=$7}  END {print  sum} '


   Find the 10 IPs with the most connections to this host
   
     netstat  -an  | head   | awk  -F "[[:space:]]+|:"   ' NR> 2 {print $6}'
    
     netstat  -an  | head   | awk  -F "[[:space:]]+|:"   ' NR> 2 {ip[$6]++}  END{for (i in ip ) print i,ip[i] }' | sort -nr -k 2|head


Origin www.cnblogs.com/g2thend/p/11621029.html