AWK Programming Language Notes Chapter 2: Syntax Structure

1 Prepare data
$ vi countries
$ cat countries
ussr 8689 275 asia
canada 3852 25 north america
china 3705 1032 asia
usa 3615 237 north america
brazil 3286 134 sourth america
india 1267 746 asia
mexico 762 78 north america
france 211 55 europe
japan 144 120 asia
germany 96 61 europe
england 94 56 europe
2 Patterns

The action of a BEGIN pattern runs once, before awk reads any input; the action of an END pattern runs once, after all input has been read.
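
As a minimal sketch of that ordering, the script below prints a header from BEGIN and a record count from END. The three-row input is an assumed excerpt in the same layout as the countries file, and the header text is made up for illustration.

```shell
# Assumed sample input: three rows in the same layout as the countries file.
data='ussr 8689 275 asia
china 3705 1032 asia
france 211 55 europe'

# BEGIN runs before the first record is read, END after the last one.
# NR still holds the number of records read when END runs.
result=$(printf '%s\n' "$data" | awk '
BEGIN { print "COUNTRY AREA" }
      { print $1, $2 }
END   { print NR, "records" }')
echo "$result"
```

The header comes out first and the count last, regardless of how many records sit in between.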

2.1 Comparison operators

2.2 String matching

2.3 Regular expressions

2.4 Escape characters

2.5 Compound patterns

$ awk '$4 ~ /^(asia|europe)$/ {print $0}' countries
ussr 8689 275 asia
china 3705 1032 asia
...

$ awk '$4 ~ /asia/ || /europe/ {print $0}' countries
ussr 8689 275 asia
...
england 94 56 europe

$ awk '$4 ~ /asia|europe/ {print $0}' countries
ussr 8689 275 asia
...
england 94 56 europe

$ awk '$4 !~ /asia|europe/ {print $0}' countries
canada 3852 25 north america
...
mexico 762 78 north america
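
One detail in the examples above is easy to miss: a bare /europe/ is shorthand for $0 ~ /europe/, so in `$4 ~ /asia/ || /europe/` the second test applies to the whole record, not to $4. On the countries file the two forms select the same lines. The sketch below uses hypothetical rows (the "europe-tours" line is invented) to show where they differ.

```shell
# Hypothetical rows; the last one has "europe" in $1 rather than in $4.
data='china 3705 1032 asia
france 211 55 europe
europe-tours 10 1 africa'

# The bare /europe/ tests the whole record, so the last row is selected too.
whole_record=$(printf '%s\n' "$data" | awk '$4 ~ /asia/ || /europe/ {print $1}')

# Keeping the alternation inside one regex restricts the test to $4.
field_only=$(printf '%s\n' "$data" | awk '$4 ~ /asia|europe/ {print $1}')

echo "$whole_record"
echo "$field_only"
```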
2.6 Range Patterns

A range pattern pat1, pat2 matches every line from a line that matches pat1 through the next line that matches pat2, inclusive. pat2 may match the same line as pat1, in which case the range is that single line.

Matching starts whenever pat1 matches. If no later line matches pat2, the range extends to the end of the input.

/canada/, /usa/
$ awk '$4 ~ /europe/,/africa/ {print $0}' countries
france 211 55 europe
japan 144 120 asia
germany 96 61 europe
england 94 56 europe
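
Both edge cases described above can be checked on a small input. This is a sketch on an assumed five-row excerpt of the countries file: the first range closes on the line that opens it, the second never finds its closing pattern.

```shell
# Assumed excerpt of the countries file.
data='ussr 8689 275 asia
canada 3852 25 north america
china 3705 1032 asia
france 211 55 europe
japan 144 120 asia'

# Both patterns match the canada line, so the range is that single line.
single=$(printf '%s\n' "$data" | awk '/canada/, /america/')

# The range opens at the first europe line and never sees "africa",
# so it runs to the end of the input.
open_ended=$(printf '%s\n' "$data" | awk '$4 ~ /europe/, /africa/')

echo "$single"
echo "$open_ended"
```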

# Lines 1 through 2
$ awk 'NR==1,NR==2 {print $0}' countries
ussr 8689 275 asia
canada 3852 25 north america

# Lines 1 through 5
$ awk 'FNR==1,FNR==5 {print FILENAME ":" $0}' countries
countries:ussr 8689 275 asia
countries:canada 3852 25 north america
countries:china 3705 1032 asia
countries:usa 3615 237 north america
countries:brazil 3286 134 sourth america
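
With a single input file NR and FNR are the same, so the two ranges above behave alike. They differ once awk is given several files: NR keeps counting across files, while FNR restarts at 1 for each one. A small sketch, using two throwaway files with hypothetical contents:

```shell
# Two throwaway files stand in for multiple inputs (hypothetical contents).
dir=$(mktemp -d)
printf 'a1\na2\na3\n' > "$dir/a"
printf 'b1\nb2\nb3\n' > "$dir/b"

# NR counts records across all files, so this range fires only once.
by_nr=$(awk 'NR==1, NR==2 { print $0 }' "$dir/a" "$dir/b")

# FNR restarts at 1 for each file, so this range fires once per file.
by_fnr=$(awk 'FNR==1, FNR==2 { print $0 }' "$dir/a" "$dir/b")

echo "$by_nr"
echo "$by_fnr"
rm -r "$dir"
```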
3 Actions

3.1 Built-in variables

3.2 Mathematical operators

$ awk '{print ($2 >200 ? $2:"$2 less than 200" NR)}' countries
8689
3852
3705
3615
3286
1267
762
211
$2 less than 2009
$2 less than 20010
$2 less than 20011
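
The trailing 9, 10 and 11 in the last three lines are the record number NR, concatenated directly onto the string because there is no separator between them. As a sketch on an assumed three-row excerpt, the same conditional expression with an explicit ", line " separator (wording chosen here for illustration) reads more clearly:

```shell
# Assumed excerpt of the countries file.
data='france 211 55 europe
japan 144 120 asia
germany 96 61 europe'

# Concatenation binds tighter than ?:, so the string and NR are joined
# first and the joined text becomes the "else" branch.
result=$(printf '%s\n' "$data" |
  awk '{ print ($2 > 200 ? $2 : "$2 less than 200, line " NR) }')
echo "$result"
```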

$ awk '$4=="asia" {pop=pop+$3;n=n+1}
END {print "total population of the",n,
    "asian countries is " ,pop,"million."}' countries
total population of the 4 asian countries is  2173 million.

$ awk '$3>maxpop {maxpop=$3;country=$1}
> END {print "country with largest population:", country,maxpop}' countries
country with largest population: china 1032
3.3 Built-in mathematical functions

# A string used as a regular expression
$ awk 'BEGIN {digits="^[0-9]+$"}
$2 ~ digits {print $0}' countries
ussr 8689 275 asia
canada 3852 25 north america
...

# Build a regular expression for numbers from smaller pieces
$ awk 'BEGIN {
    sign     = "[+-]?"
    decimal  = "[0-9]+[.]?[0-9]*"
    fraction = "[.][0-9]+"
    exponent = "([eE]" sign "[0-9]+)?"
    number   = "^" sign "(" decimal "|" fraction ")" exponent "$"
}
$0 ~ number' countries
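
No line of the countries file consists solely of a number, so that command prints nothing when run against it. To see the pattern at work, the sketch below applies the same component-wise construction to a few hypothetical inputs, one candidate per line, and labels each one.

```shell
# Hypothetical test inputs, one candidate per line.
inputs='42
-3.14
.5e+10
1e
abc'

# The regex is assembled by string concatenation in BEGIN, then used
# with ~ as a dynamic regular expression against each whole line.
result=$(printf '%s\n' "$inputs" | awk '
BEGIN {
    sign     = "[+-]?"
    decimal  = "[0-9]+[.]?[0-9]*"
    fraction = "[.][0-9]+"
    exponent = "([eE]" sign "[0-9]+)?"
    number   = "^" sign "(" decimal "|" fraction ")" exponent "$"
}
{ print $0, ($0 ~ number ? "number" : "not a number") }')
echo "$result"
```

"1e" is rejected because the exponent group needs at least one digit after the e, and the trailing "$" leaves no room for a stray character.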

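In a matching expression, a regular expression enclosed in slashes and one given as a quoted string are interchangeable, except that the string form needs one extra backslash to protect a literal metacharacter. The following two expressions are equivalent:

$0 ~ /(\+|-)[0-9]+/
$0 ~ "(\\+|-)[0-9]+"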
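
A quick way to confirm the equivalence is to run both forms over the same input and compare what they select. This is a sketch with three hypothetical lines; only the signed ones should pass.

```shell
# Hypothetical inputs: two signed integers and one unsigned.
inputs='+12
-7
12'

# Regex literal: a single backslash makes + literal.
with_slashes=$(printf '%s\n' "$inputs" | awk '$0 ~ /(\+|-)[0-9]+/')

# String form: awk's string parser consumes one level of backslashes,
# so the backslash is doubled to leave \+ for the regex engine.
with_string=$(printf '%s\n' "$inputs" | awk '$0 ~ "(\\+|-)[0-9]+"')

echo "$with_slashes"
echo "$with_string"
```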
3.4 Built-in string functions

3.5 Control flow

3.6 Output function

Origin blog.csdn.net/mengjizhiyou/article/details/127436328