Shell programming series 14: the text-processing "Three Musketeers": awk overview and common usage

awk is a text-processing tool, generally used to process data and generate reports from the results.

awk is named after the first letters of its creators' surnames: Alfred Aho, Peter Weinberger, and Brian Kernighan.

How awk works

Syntax

The first form:

    awk 'BEGIN{} pattern{commands} END{}' file_name

BEGIN{} runs before any line is matched, pattern{commands} is applied to each line, and END{} runs after all matching is finished.

The second form:

    standard output | awk 'BEGIN{} pattern{commands} END{}'

Syntax description

    Syntax        Description
    BEGIN{}       Executed before data processing starts
    pattern       The matching pattern
    {commands}    The processing commands; may span multiple lines
    END{}         Executed after all matching and processing has finished

awk built-in variables

Built-in variables table (part 1)

    Variable    Meaning
    $0          The entire line
    $1-$n       The 1st through n-th fields of the current line
    NF          Number of fields in the current line, that is, how many columns it has
    NR          Current line number, counting from 1
    FNR         When processing multiple files, the line number counted separately for each file, starting from 1
    FS          Input field separator. Splits on spaces or tabs when not specified
    RS          Input line (record) separator. Defaults to a newline
    OFS         Output field separator. Defaults to a space
    ORS         Output line (record) separator. Defaults to a newline

Built-in variables table (part 2)

    Variable    Meaning
    FILENAME    Name of the file currently being processed
    ARGC        Number of command-line arguments
    ARGV        Array of command-line arguments

To sum up, the built-in variables and what their names stand for:

    $0          Prints all the information in the line
    $1~$n       Prints the 1st through n-th fields of the line
    NF          Number of Fields: the number of fields in the line being processed
    NR          Number of Records: the number of the row being processed
    FNR         File Number of Records: when processing multiple files, the row number within each individual file
    FS          Field Separator: splits on spaces or tabs when not specified
    RS          Record Separator: splits on newlines when not specified
    OFS         Output Field Separator
    ORS         Output Record Separator
    FILENAME    Name of the file being processed
    ARGC        Number of command-line arguments
    ARGV        Array of command-line arguments

    # Output the entire line
    [root@localhost shell]# awk '{print $0}' passwd
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/sbin/nologin
    daemon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:x:3:4:adm:/var/adm:/sbin/nologin
    lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
    operator:x:11:0:operator:/root:/sbin/nologin
    games:x:12:100:games:/usr/games:/sbin/nologin
    ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
    nobody:x:99:99:Nobody:/:/sbin/nologin
    systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
    dbus:x:81:81:System message bus:/:/sbin/nologin
    polkitd:x:999:998:User for polkitd:/:/sbin/nologin
    sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
    postfix:x:89:89::/var/spool/postfix:/sbin/nologin
    ajie:x:1000:1000:ajie:/home/ajie:/bin/bash
    chrony:x:998:996::/var/lib/chrony:/sbin/nologin
    deploy:x:1001:1001::/home/deploy:/bin/bash
    nginx:x:997:995:Nginx web server:/var/lib/nginx:/sbin/nologin

    # FS: specify the field separator
    [root@localhost shell]# awk 'BEGIN{FS=":"}{print $1}' passwd
    root
    bin
    daemon
    adm
    lp
    sync
    shutdown
    halt
    mail
    operator
    games
    ftp
    nobody
    systemd-network
    dbus
    polkitd
    sshd
    postfix
    ajie
    chrony
    deploy
    nginx

    # By default, fields are split on spaces or tabs
    [root@localhost shell]# cat list
    Hadoop Spark Flume
    Java Python Scala
    Allen Mike Meggie
    [root@localhost shell]# awk '{print $1}' list
    Hadoop
    Java
    Allen
    [root@localhost shell]# awk 'BEGIN{FS=" "}{print $1}' list
    Hadoop
    Java
    Allen

    # NF: print the number of fields in each line
    [root@localhost shell]# cat list
    Hadoop Spark Flume
    Java Python Scala Golang
    Allen Mike Meggie
    [root@localhost shell]# awk '{print NF}' list
    3
    4
    3

    # NR: line number; with multiple files (list, passwd, /etc/fstab) the count keeps accumulating
    [root@localhost shell]# awk '{print NR}' list passwd /etc/fstab
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37

    # FNR: when processing two or more files, lines are counted separately for each file
    [root@localhost shell]# awk '{print FNR}' list passwd /etc/fstab
    1
    2
    3
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12

    [root@localhost shell]# cat list
    Hadoop|Spark:Flume
    Java|Python:Scala:Golang
    Allen|Mike:Meggie

    # Split columns on the | character
    [root@localhost shell]# awk 'BEGIN{FS="|"}{print $2}' list
    Spark:Flume
    Python:Scala:Golang
    Mike:Meggie

    # Split columns on the : character
    [root@localhost shell]# awk 'BEGIN{FS=":"}{print $2}' list
    Flume
    Scala
    Meggie

    # RS: set the line (record) separator to --
    [root@localhost shell]# cat list
    Hadoop|Spark|Flume--Java|Python|Scala|Golang--Allen|Mike|Meggie
    [root@localhost shell]# awk 'BEGIN{RS="--"}{print $0}' list
    Hadoop|Spark|Flume
    Java|Python|Scala|Golang
    Allen|Mike|Meggie
    [root@localhost shell]# awk 'BEGIN{RS="--";FS="|"}{print $3}' list
    Flume
    Scala
    Meggie

    # ORS: join the output lines with &
    [root@localhost shell]# awk 'BEGIN{RS="--";FS="|";ORS="&"}{print $3}' list
    Flume&Scala&Meggie

    # The default output field separator is a space
    [root@localhost shell]# awk 'BEGIN{RS="--";FS="|";ORS="&"}{print $1,$3}' list
    Hadoop Flume&Java Scala&Allen Meggie
    &

    # OFS: set the output field separator to :
    [root@localhost shell]# awk 'BEGIN{RS="--";FS="|";ORS="&";OFS=":"}{print $1,$3}' list
    Hadoop:Flume&Java:Scala&Allen:Meggie
    &

    # FILENAME: the file name
    [root@localhost shell]# awk '{print FILENAME}' list
    list

    # The file name "list" is printed 3 times. With no pattern given, awk processes
    # the input line by line; the file has three lines of text, so the action runs
    # three times and prints three times
    [root@localhost shell]# cat list
    Hadoop|Spark|Flume--Java|Python|Scala|Golang--Allen|Mike|Meggie
    Test File
    Line
    [root@localhost shell]# awk '{print FILENAME}' list
    list
    list
    list

    # ARGC: number of command-line arguments (awk itself is counted as the first one)
    [root@localhost shell]# awk '{print ARGC}' list
    2
    2
    2
    [root@localhost shell]# awk '{print ARGC}' list /etc/fstab
    3
    3
    3
    3
    3
    3
    3
    3
    3
    3
    3
    3
    3
    3
    3

    # NF is the number of fields; here NF=7, so $NF is $7, the last field
    [root@localhost shell]# awk 'BEGIN{FS=":"}{print $NF}' /etc/passwd
    /bin/bash
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /bin/sync
    /sbin/shutdown
    /sbin/halt
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /sbin/nologin
    /bin/bash
    /sbin/nologin
    /bin/bash
    /sbin/nologin
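The examples so far only use a BEGIN block and a bare action. As a rough sketch of how all three parts of the syntax (BEGIN, pattern{commands}, END) can work together with FS, OFS, NR and $NF, the script below reports which accounts use /bin/bash. The temporary file, the variable names sample and report, and the counter n are assumptions made for this illustration; the four input lines are copied from the passwd listing shown earlier.

```shell
# Hypothetical sample data: four lines in passwd format, written to a temp file
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
ajie:x:1000:1000:ajie:/home/ajie:/bin/bash
nginx:x:997:995:Nginx web server:/var/lib/nginx:/sbin/nologin
EOF

report=$(awk '
# BEGIN runs once, before any input line is read: set separators, print a header
BEGIN { FS = ":"; OFS = "\t"; print "LINE", "USER", "SHELL" }
# The pattern selects lines whose last field is /bin/bash; the action runs for each match
$NF == "/bin/bash" { print NR, $1, $NF; n++ }
# END runs once, after the last line: NR still holds the total number of lines read
END { print "bash users: " n " of " NR " lines" }
' "$sample")

printf '%s\n' "$report"
rm -f "$sample"
```

With this sample input the script prints a tab-separated header, one row each for root (line 1) and ajie (line 3), and a final summary line reading "bash users: 2 of 4 lines". Setting FS and OFS in BEGIN keeps the per-line action short, which is the same pattern the RS/FS/ORS/OFS examples above follow.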
Reproduced from: https://www.cnblogs.com/reblue520/p/10984717.html