Awk study notes 1: commonly used awk operations

(1) Introduction to awk

awk is one of the "three musketeers" of Linux text processing, alongside grep and sed. AWK is a language for processing text files and a powerful text analysis tool. grep, sed and awk all work line by line: they read one line, process it, and move on to the next until the input is exhausted.

grep: filter text
sed: modify text
awk: process text (mainly used for formatting, output text in the specified format)

(2) awk syntax

Three common forms:
awk [options] '[pattern] { action }' file

command | awk [options] '[pattern] { action }'

awk [options] 'BEGIN { commands } /pattern/ { commands } END { commands }' file

        1. Commonly used options

-F fs or --field-separator fs
Specifies the input field separator; fs is a string or a regular expression, such as -F:.
-v var=value or --assign var=value
Assigns a value to a user-defined variable before the program starts.
-f scriptfile or --file scriptfile
Reads the awk program from a script file.
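
To see the three options side by side, here is a small sketch. The sample data, the variable name prefix and the temporary script file are made up for illustration; they are not from the original notes.

```shell
# Hypothetical sample data in /etc/passwd format (not from a real system).
sample='root:x:0:0:root:/root:/bin/bash
test:x:1001:1001::/home/test:/bin/bash'

# -F: splits each line on colons; -v sets the variable "prefix" before the program runs.
out_v=$(printf '%s\n' "$sample" | awk -F: -v prefix="user=" '{ print prefix $1 }')
printf '%s\n' "$out_v"

# -f reads the program from a file instead of the command line.
script=$(mktemp)
printf '%s\n' '{ print $1, $3 }' > "$script"
out_f=$(printf '%s\n' "$sample" | awk -F: -f "$script")
rm -f "$script"
printf '%s\n' "$out_f"
```

The first command prints the user names with the prefix attached, the second prints the name and uid of each line.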

        2. Commonly used patterns

A pattern is a regular expression, written between slashes (/regex/), that selects which lines of the file are processed.
    ^     matches the beginning of the line
    $     matches the end of the line
    [^ ]  matches any character not listed in the square brackets
    [ ]   matches any character listed in the square brackets
    ?     matches the previous character 0 or 1 times
    +     matches the previous character 1 or more times
    *     matches the previous character 0 or more times
    .     matches any single character
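
A quick way to get a feel for these metacharacters is to run them against a handful of short words. The word list below is invented for this sketch.

```shell
# Hypothetical sample lines, chosen to exercise a few metacharacters.
lines='cat
cot
coat
ct
dog'

# ^c anchors at the start, o? allows zero or one "o", t$ anchors at the end.
out_optional=$(printf '%s\n' "$lines" | awk '/^co?t$/')
# [ao] matches exactly one character from the set.
out_set=$(printf '%s\n' "$lines" | awk '/^c[ao]t$/')
# .+ requires at least one character (any character) between c and t.
out_plus=$(printf '%s\n' "$lines" | awk '/^c.+t$/')

printf 'optional: %s\n' $out_optional
printf 'set:      %s\n' $out_set
printf 'plus:     %s\n' $out_plus
```

With no action given, awk prints every line that the pattern matches.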





        3. Common actions

An action is a sequence of commands run on each line that the pattern selects. Common ones:
    print      prints the matching line or the specified fields
    printf     formatted output; note that it does not add a line break
                    %s: string
                    %d: integer
                    a width between % and the letter, such as %15s, makes the value occupy 15 characters
    if-else    conditional judgment
    for        loop
    while      loop
    getline    reads the next line of input
    next       skips the rest of the current line's processing and moves on to the next line
    exit       stops reading input and ends the awk program
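
As a rough illustration of printf, the sketch below pads a name to 15 characters and shows that no line break is added automatically. The input values and the choice of %-15s (left-aligned) are assumptions made for the example; the notes above only say that the field occupies 15 characters.

```shell
# Hypothetical two-column input: a name and a number.
out=$(printf 'alice 30\nbob 4\n' | awk '{ printf "%-15s|%3d\n", $1, $2 }')
printf '%s\n' "$out"

# printf adds no newline of its own, so these two calls share one output line.
joined=$(awk 'BEGIN { printf "a"; printf "b\n" }')
printf '%s\n' "$joined"
```

Each output line of the first command is 19 characters wide: 15 for the name, 1 for the bar and 3 for the number.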









        4. Processing objects


        This is the file (or files) that awk processes. If no file is given, awk reads standard input.

        5. BEGIN and END in an awk statement

        An awk program can contain several pattern { action } pairs, written one after another. BEGIN and END are special patterns: the action attached to BEGIN runs before any text is read, and the action attached to END runs after all the text has been processed.
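
A minimal sketch of BEGIN and END around an ordinary block, summing a column of made-up numbers:

```shell
# Hypothetical input: one number per line.
out=$(printf '3\n4\n5\n' | awk '
BEGIN { print "start" }          # runs once, before any input is read
      { sum += $1 }              # runs for every input line
END   { print "total", sum }     # runs once, after the last line
')
printf '%s\n' "$out"
```

The variable sum does not need to be initialized; awk treats an unset variable as 0 in arithmetic.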

(3) awk built-in variables

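NF     number of fields in the current line
NR     number of the current line, counting from 1
FNR    line number counted separately for each file; with several input files NR keeps growing, while FNR restarts at 1 when awk moves to the next file
$0     the complete input record, that is, the whole current line
$n     the nth field of the current record; fields are separated by FS
FS     input field separator (default: a single space, which awk treats as any run of spaces or tabs)
RS     input record separator (default: a newline)
OFS    output field separator (default: a single space, regardless of FS)
ORS    output record separator (default: a newline)
FILENAME      name of the current file
IGNORECASE    if true, matching ignores case (gawk only)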
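
The difference between NR and FNR is easiest to see with two input files. The sketch below creates two throwaway files (the names one.txt and two.txt are made up) and prints FILENAME, NR, FNR and NF for every line.

```shell
# Two hypothetical input files in a temporary directory.
dir=$(mktemp -d)
printf 'a b\nc d e\n' > "$dir/one.txt"
printf 'f\n' > "$dir/two.txt"

# NR keeps counting across files, FNR restarts at 1, NF is the field count of each line.
out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, NF }' one.txt two.txt)
rm -rf "$dir"
printf '%s\n' "$out"
```

The last line of the output belongs to two.txt: NR is 3 there, while FNR is back to 1.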


(4) awk execution process

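BEGIN{}: executed first, before any input is read
//: the pattern, tested against each line
{}: the loop body, executed for each matching line
END{}: executed last, after all input has been read

An awk program contains at least one of these four parts and may contain all of them.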
        1. awk first executes the BEGIN block, that is, the contents of the curly braces {} after the keyword BEGIN.
        2. Once the BEGIN block is finished, awk starts on the body blocks.
        3. It reads one record at a time; records are separated by newlines (\n) by default.
        4. It splits the record into fields using the field separator and fills in the field variables: $0 is the whole record (one line of content), $1 is the first field, and $n is the nth field.
        5. It runs each body block in order. The commands in a block are executed only if its pattern matches the current line; a block without a pattern runs for every line.
        6. Steps 3 to 5 repeat for every line until the end of the file, which completes the body blocks.
        7. Finally awk executes the END block, which can output the final result.
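
The order described in the steps above can be traced with a small program that has a BEGIN block, two body blocks and an END block. The input lines are made up for the sketch.

```shell
# Hypothetical two-line input.
out=$(printf 'x 1\ny 2\n' | awk '
BEGIN    { print "begin" }
/x/      { print "pattern block:", $0 }
         { print "every line:", $1 }
END      { print "end, lines read:", NR }
')
printf '%s\n' "$out"
```

For the first line both body blocks run, in the order they are written; for the second line only the block without a pattern runs.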

(5) Common operations


        1. Extract specific columns from a file

-- -F specifies the field separator; print the first and second columns
cat /etc/passwd |awk -F ':' '{print $1,$2}'


        2. Match characters or strings

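-- print can be omitted; the default action is to print the whole line
cat /etc/passwd |awk '/bash/'
-- print only the users whose line contains /bin/bash; nologin users are not shown
cat /etc/passwd |awk -F ':' '/\/bin\/bash/ {print $1}'
-- match lines that contain ftp and end with n; when printing, any text you like can be added in double quotes
cat /etc/passwd |awk -F ':' '/.*ftp.*n$/ {print "User: " $1}'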


        3. Conditional operators

        Operator        Description

                ~               matches the regular expression
                !~              does not match the regular expression
                ||              logical OR
                &&              logical AND
                < <= > >= ==    relational operators
                (space)         string concatenation
                + -             addition, subtraction
                * / %           multiplication, division, remainder

The arguments of print can be variables, numbers or strings. String literals must be enclosed in double quotes, and arguments are separated by commas. Arguments separated by commas are printed with the output field separator (OFS) between them, which is a space by default. Without the comma, the arguments are concatenated and run together in the output.
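
A short sketch of the difference, using a made-up two-field input:

```shell
# With a comma, the fields are joined by OFS (a space by default).
out_comma=$(echo 'a b' | awk '{ print $1, $2 }')
# Without a comma, the fields are concatenated directly.
out_concat=$(echo 'a b' | awk '{ print $1 $2 }')
# Changing OFS changes what the comma inserts.
out_ofs=$(echo 'a b' | awk 'BEGIN { OFS=":" } { print $1, $2 }')

printf '%s\n' "$out_comma" "$out_concat" "$out_ofs"
```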

The printf function works much like printf in C: it formats its arguments according to a format string. When the output is complex, printf is easier to use and the code is easier to read.

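-- print the full entry of every user whose uid is >= 1000
cat /etc/passwd |awk -F ':' '$3>=1000'
-- match lines starting with test; print the current file name, the line number, and uid+1000
awk -F ':' '/^test/ {print FILENAME,NR,$3+1000}' /etc/passwd
-- use a regular expression match: print the users whose name contains test
-- ~ is the match operator; the pattern goes between the slashes; !~ negates the match
awk -F ':' '$1 ~ /test/ {print $1}' /etc/passwd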
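
Since /etc/passwd differs from machine to machine, the same operators can be tried on a fixed sample. The three entries below are invented for the sketch and do not describe a real system.

```shell
# Hypothetical sample data in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
test:x:1001:1001::/home/test:/bin/bash'

# Relational operator: uid (field 3) of at least 1000.
out_num=$(printf '%s\n' "$sample" | awk -F: '$3 >= 1000 { print $1 }')
# Negated match: login shell (field 7) does not end in bash.
out_not=$(printf '%s\n' "$sample" | awk -F: '$7 !~ /bash$/ { print $1 }')
# Logical AND: system account that still has a bash shell.
out_and=$(printf '%s\n' "$sample" | awk -F: '$3 < 1000 && $7 ~ /bash$/ { print $1 }')

printf '%s\n' "$out_num" "$out_not" "$out_and"
```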


        4. Set variables

-- print the name of the test user and its uid+2000
awk -F ':' -v a=2000 '/^test/ {print $1,$3+a}' /etc/passwd


        5. Ignore case

-- IGNORECASE is a gawk extension; other awk implementations treat it as an ordinary variable
cat /etc/passwd | awk 'BEGIN{IGNORECASE=1} /ROOT/ {print $1}'
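
Where gawk is not available, one possible alternative (not part of the original notes) is to lowercase the line with the standard tolower() function before matching. The sample input is made up.

```shell
# Hypothetical input with mixed-case names.
out=$(printf 'Root:x:0\nadmin:x:1\n' | awk -F: 'tolower($0) ~ /root/ { print $1 }')
printf '%s\n' "$out"
```

The match is done on the lowercased copy, while $1 still prints the field as it appears in the input.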

        6. Remove blank lines and print the rest

cat 1.txt | awk '!/^$/'
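
Note that /^$/ only matches lines that are completely empty. If lines containing only spaces should be dropped as well, one option is to use NF as the pattern, since such lines have no fields. A small sketch with made-up text:

```shell
# Hypothetical text: an empty line and a line with only spaces between two words.
text=$(printf 'one\n\n   \ntwo\n')

# Drops the empty line but keeps the spaces-only line.
out_empty=$(printf '%s\n' "$text" | awk '!/^$/')
# NF is 0 for both kinds of blank line, so only lines with content are printed.
out_nf=$(printf '%s\n' "$text" | awk 'NF')

printf '%s\n' "$out_nf"
```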


 


Origin blog.csdn.net/qq_63693805/article/details/134054780