[2] Learning awk in practice

These awk examples all come from my day-to-day work, where awk often gets twice the result with half the effort. They may be useful as a reference.

1. Strip the leading and trailing whitespace from each line and number the lines

The file format is as follows:

   o     ne
  two
 three
four
five
six
seven
eight
nine
  ten

Use awk '{print "[",$0,"]"}' one.txt to view it; some lines have leading or trailing spaces:

[    o     ne    ]
[   two ]
[  three   ]
[ four ]
[ five ]
[ six ]
[ seven ]
[ eight ]
[ nine ]
[   ten ]

Strip the leading and trailing spaces and tabs from each line, then prefix the line with its number and a comma, so that "  two" becomes "2,two".

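# First strip the whitespace:
# ^[ \t]+           matches one or more spaces or tabs at the start of the line
# [ \t]+$           matches one or more spaces or tabs at the end of the line
# ^[ \t]+|[ \t]+$   matches whitespace at either the start or the end of the line
awk '{gsub(/^[ \t]+|[ \t]+$/,"");print $0}' one.txt
# Use the built-in variables OFS and NR to print the comma and the line number
awk -v OFS=',' '{gsub(/^[ \t]+|[ \t]+$/,"");print NR,$0}' one.txt > one_re.txt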

The final output, written to one_re.txt:

1,o     ne
2,two
3,three
4,four
5,five
6,six
7,seven
8,eight
9,nine
10,ten
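
To try the trimming step without the real file, here is a minimal sketch. The file name trim_demo.txt and its contents are made up for illustration; the awk program is the same one used above. It also shows that only the ends of the line are trimmed: spaces inside the line are left alone.

```shell
# Sample input (made up for this sketch): leading/trailing spaces and one tab.
printf '   o     ne  \n  two\t\n three   \nfour\n' > trim_demo.txt

# gsub() without a third argument edits $0 in place.
# OFS=',' puts a comma between NR and the trimmed line in the print statement.
result=$(awk -v OFS=',' '{ gsub(/^[ \t]+|[ \t]+$/, ""); print NR, $0 }' trim_demo.txt)

printf '%s\n' "$result"
# 1,o     ne
# 2,two
# 3,three
# 4,four
```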

2. Join the numeric ids, one per line in the file, into a single comma-separated line

The file content format is as follows:

124
432
3252
4634
654
76
3453
57546
3453
45645
34535
3463
3453
456345

# printf works like the printf function in C
awk '{printf "%s,",$1}' ids.txt

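The result is as follows (the trailing % is not printed by awk; it is most likely the shell, for example zsh, marking output that does not end with a newline):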

124,432,3252,4634,654,76,3453,57546,3453,45645,34535,3463,3453,456345,%
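
The one-liner above leaves a comma after the last id and no final newline. If that matters, one possible variant (a sketch, not the only way) prints the comma before every id except the first and adds the newline in an END block. The file ids_demo.txt is a shortened, made-up copy of ids.txt.

```shell
# Shortened sample of the id file (made up for this sketch).
printf '124\n432\n3252\n' > ids_demo.txt

# Print "," before every id except the first, then a single newline at the end.
joined=$(awk '{ printf "%s%s", (NR > 1 ? "," : ""), $1 } END { print "" }' ids_demo.txt)

printf '%s\n' "$joined"   # 124,432,3252
```

For a plain one-column file like this, paste -sd, ids.txt should give the same result without awk.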

3. Compare two files and find the entries that appear in only one of them

File 1: region_prod.txt is as follows:

"北京",
"上海",
"重庆",
"天津",
"三沙",
"长沙",
"四川"

File 2: region_test.txt is as follows:

"上海",
"重庆",
"四川",
"北京",
"天津"

Run the following awk command. It prints every entry of region_prod.txt that does not exist in region_test.txt.

 awk -F, 'NR==FNR{a[$1];next}!($1 in a)' region_test.txt region_prod.txt

Output result:

"三沙",
"长沙",

Swap the order of the two files to check the other direction, that is, to print the entries of region_test.txt that do not exist in region_prod.txt:

 awk -F, 'NR==FNR{a[$1];next}!($1 in a)' region_prod.txt region_test.txt 

For the details, refer to the articles on awk usage examples and on how awk's 'next' command works.
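
The NR==FNR idiom is compact, so here is the same program spread out with comments. This is a sketch on shortened, made-up copies of the two region files; the array name seen is my own choice and replaces a above.

```shell
# Shortened copies of the two region files (made up for this sketch).
printf '"北京",\n"上海",\n"三沙",\n"四川"\n' > prod_demo.txt
printf '"上海",\n"四川",\n"北京"\n' > test_demo.txt

only_in_prod=$(awk -F, '
    # NR counts records across all files, FNR restarts at 1 for each file,
    # so NR == FNR is true only while the first file is being read.
    NR == FNR { seen[$1]; next }   # remember the key, then skip the rule below
    # Reached only for the second file: print lines whose key was not remembered.
    !($1 in seen)
' test_demo.txt prod_demo.txt)

printf '%s\n' "$only_in_prod"   # "三沙",
```

Splitting on the comma makes $1 the quoted name alone, which is why "四川" matches even though its line has no trailing comma in one of the files. One caveat: if the first file is empty, NR == FNR also holds for the second file, so nothing is printed.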

4. Find the lines that contain only a number and print the line numbers of the first 10 matches

The file format is as follows:

    dfdddgds,
    1231,
   -2352,
fgd,
dfgfd,

 awk -F, '{if($1 ~ /^[ ]*-?[0-9]+$/) print NR}' main.log | head -n 10

-F,: sets the field separator to a comma
if($1 ~ /^[ ]*-?[0-9]+$/): tests whether the first field, $1, matches the regular expression
~: the match operator; tests whether a record or field matches a regular expression
/^[ ]*-?[0-9]+$/: a regular expression that matches a positive or negative integer, optionally preceded by spaces
NR: the number of records read so far, i.e. the current line number
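
The command above prints the line numbers of the first 10 matching lines, wherever they are in the file. "The first 10 lines" could also mean matches found within the first 10 lines of the file, so here is a sketch of both readings done inside awk, without head. The file main_demo.log, the variable max, and the small limit of 2 are assumptions chosen to keep the demo short.

```shell
# Sample log in the same shape as main.log (made up for this sketch).
printf '    dfdddgds,\n    1231,\n   -2352,\nfgd,\n 77,\n' > main_demo.log

# Reading A: line numbers of the first "max" matching lines, anywhere in the file.
# exit stops awk as soon as enough matches have been printed.
first_matches=$(awk -F, -v max=2 \
    '$1 ~ /^[ ]*-?[0-9]+$/ { print NR; if (++n == max) exit }' main_demo.log)

# Reading B: line numbers of matches found within the first "max" lines only.
in_first_lines=$(awk -F, -v max=2 \
    'NR > max { exit } $1 ~ /^[ ]*-?[0-9]+$/ { print NR }' main_demo.log)

printf 'A: %s\n' "$first_matches"    # A: 2 and 3 (one per line)
printf 'B: %s\n' "$in_first_lines"   # B: 2
```

With max=10, reading A gives the same result as piping into head -n 10, but awk stops reading the file once it has its 10 matches.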

Origin blog.csdn.net/jiaobuchong/article/details/105339062