Shell programming series 14 -- Overview of awk, one of the Three Musketeers of text processing, and common usage



awk is a text-processing tool, generally used to process data and generate report results.
awk is named after the initials of its creators' last names: Alfred Aho, Peter Weinberger, and Brian Kernighan.

awk operating modes

Syntax

The first form: awk 'BEGIN{} pattern{commands} END{}' file_name

BEGIN{} runs before any input is read, pattern{commands} runs for each matching line, and END{} runs after all input has been processed.

The second form: standard output of a command | awk 'BEGIN{} pattern{commands} END{}'
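As a sketch of the first form (files.txt here is a hypothetical scratch file, not from the original article): BEGIN prints a header, the /log/ pattern selects lines containing "log" and sums their third field, and END prints the total after all lines are read.

```shell
# Create a hypothetical input file with three records.
printf 'app.log 1 100\nnotes.txt 1 20\nsys.log 1 50\n' > files.txt
# BEGIN runs first, /log/{...} runs per matching line, END runs last.
awk 'BEGIN{print "sizes:"} /log/{sum += $3} END{print "total:", sum}' files.txt
```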


Syntax description

BEGIN{}      executed before the input data is processed
pattern      the matching pattern
{commands}   the command block to execute; may span multiple lines
END{}        executed after all input has been processed
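The second form above can be sketched with any command's standard output piped into awk; a minimal example using echo:

```shell
# Second form: awk reads the standard output of another command.
echo "root 0 running" | awk '{print $1, $3}'
# → root running
```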

awk built-in variable

Built-in variables table (part 1)

Built-in Variable   Meaning
$0                  the entire current line
$1 - $n             the 1st through nth field of the current line
NF                  number of fields in the current line, i.e. how many columns
NR                  current line number, counting from 1
FNR                 when processing multiple files, the line number counted separately per file, starting from 1
FS                  input field separator; defaults to space or tab if not specified
RS                  input record (line) separator; defaults to a newline
OFS                 output field separator; defaults to a space
ORS                 output record (line) separator; defaults to a newline

Built-in variables table (part 2)

Built-in Variable   Meaning
FILENAME            name of the current input file
ARGC                number of command-line arguments
ARGV                array of command-line arguments

To sum up:
    Built-in variables:
        $0                              prints all the information on the line
        $1 ~ $n                         prints field 1 through field n of the line
        NF (Number of Fields)           number of fields in the line being processed
        NR (Number of Rows)             line number of the line being processed
        FNR (File Number of Rows)       when processing multiple files, the line number counted separately per file
        FS (Field Separator)            input field separator; defaults to space or tab if not specified
        RS (Row Separator)              input record separator; defaults to newline if not specified
        OFS (Output Field Separator)    output field separator
        ORS (Output Row Separator)      output record separator
        FILENAME                        name of the file being processed
        ARGC                            number of command-line arguments
        ARGV                            array of command-line arguments
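As a quick sketch combining several of these variables, the following one-liner prints the line number, field count, and whole line for each record of a small piped input:

```shell
# NR is the record number, NF the field count, $0 the whole line.
printf 'a b c\nd e\n' | awk '{print NR, NF, $0}'
# → 1 3 a b c
# → 2 2 d e
```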


# Print the entire line ($0)

[root@localhost shell]# awk '{print $0}' passwd 
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
polkitd:x:999:998:User for polkitd:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
ajie:x:1000:1000:ajie:/home/ajie:/bin/bash
chrony:x:998:996::/var/lib/chrony:/sbin/nologin
deploy:x:1001:1001::/home/deploy:/bin/bash
nginx:x:997:995:Nginx web server:/var/lib/nginx:/sbin/nologin

# FS specifies the input field separator
[root@localhost shell]# awk 'BEGIN{FS=":"}{print $1}' passwd 
root
bin
daemon
adm
lp
sync
shutdown
halt
mail
operator
games
ftp
nobody
systemd-network
dbus
polkitd
sshd
postfix
ajie
chrony
deploy
nginx
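Equivalently, the -F command-line option sets FS without a BEGIN block; a minimal sketch on a single piped-in line:

```shell
# -F: is shorthand for BEGIN{FS=":"}.
echo "root:x:0:0:root:/root:/bin/bash" | awk -F: '{print $1}'
# → root
```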

# The default field separator is space or tab
[root@localhost shell]# cat list
Hadoop Spark Flume
Java Python Scala
Allen Mike Meggie
[root@localhost shell]# awk '{print $1}' list
Hadoop
Java
Allen
[root@localhost shell]# awk 'BEGIN{FS=" "}{print $1}' list
Hadoop
Java
Allen

# NF: print the number of fields in each line
[root@localhost shell]# cat list
Hadoop Spark Flume
Java Python Scala Golang
Allen Mike Meggie
[root@localhost shell]# awk '{print NF}' list
3
4
3

# NR: line numbers accumulate across multiple files (list, passwd, /etc/fstab)
[root@localhost shell]# awk '{print NR}' list passwd /etc/fstab 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
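NR can also be used as a pattern to select specific lines; a minimal sketch that prints only the second line of a piped input:

```shell
# When the pattern NR == 2 is true, the default action prints the line.
printf 'a\nb\nc\n' | awk 'NR == 2'
# → b
```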

# FNR counts lines separately for each file when processing two or more files
[root@localhost shell]# awk '{print FNR}' list passwd /etc/fstab 
1
2
3
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
1
2
3
4
5
6
7
8
9
10
11
12
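Printing FILENAME, NR, and FNR together makes the difference easy to see; this sketch uses two scratch files (f1 and f2 are hypothetical names, not from the article):

```shell
# Create two small scratch files.
printf 'x\ny\n' > f1
printf 'z\n' > f2
# FNR restarts at 1 for each file, while NR keeps counting.
awk '{print FILENAME, NR, FNR}' f1 f2
# → f1 1 1
# → f1 2 2
# → f2 3 1
```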


[root@localhost shell]# cat list
Hadoop|Spark:Flume
Java|Python:Scala:Golang
Allen|Mike:Meggie

# Split columns on the | symbol
[root@localhost shell]# awk 'BEGIN{FS="|"}{print $2}' list
Spark:Flume
Python:Scala:Golang
Mike:Meggie
# Split columns on the : symbol
[root@localhost shell]# awk 'BEGIN{FS=":"}{print $2}' list
Flume
Scala
Meggie
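Since the file above mixes | and : separators, it is worth noting that FS can also be a regular expression that splits on either character; a minimal sketch:

```shell
# FS as a regex: split on | or : alike.
echo 'Hadoop|Spark:Flume' | awk 'BEGIN{FS="[|:]"}{print $2}'
# → Spark
```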

# RS specifies the record (line) separator, here --
[root@localhost shell]# cat list
Hadoop|Spark|Flume--Java|Python|Scala|Golang--Allen|Mike|Meggie
[root@localhost shell]# awk 'BEGIN{RS="--"}{print $0}' list
Hadoop|Spark|Flume
Java|Python|Scala|Golang
Allen|Mike|Meggie

[root@localhost shell]# awk 'BEGIN{RS="--";FS="|"}{print $3}' list 
Flume
Scala
Meggie

# ORS: use & as the output record separator, joining the output records
[root@localhost shell]# awk 'BEGIN{RS="--";FS="|";ORS="&"}{print $3}' list
Flume&Scala&Meggie

# The default output field separator (OFS) is a space
[root@localhost shell]# awk 'BEGIN{RS="--";FS="|";ORS="&"}{print $1,$3}' list
Hadoop Flume&Java Scala&Allen Meggie
&

# OFS specifies the output field separator, here :
[root@localhost shell]# awk 'BEGIN{RS="--";FS="|";ORS="&";OFS=":"}{print $1,$3}' list
Hadoop:Flume&Java:Scala&Allen:Meggie
&

# FILENAME: the name of the current input file
[root@localhost shell]# awk '{print FILENAME}' list
list

# The file name list is printed 3 times: with no pattern given, awk runs the action on every line by default, and the file has three lines, so the action runs (and prints) three times.
[root@localhost shell]# cat list
Hadoop|Spark|Flume--Java|Python|Scala|Golang--Allen|Mike|Meggie
Test File
Line
[root@localhost shell]# awk '{print FILENAME}' list
list
list
list

# ARGC number of command line parameters
[root@localhost shell]# awk '{print ARGC}' list
2
2
2
[root@localhost shell]# awk '{print ARGC}' list /etc/fstab 
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
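ARGV, which the tables above mention but the examples do not show, can be inspected in a BEGIN block; since the program has only a BEGIN rule, awk never opens the files, so this is a safe sketch:

```shell
# ARGV[0] is the program name; ARGV[1..ARGC-1] are the arguments.
awk 'BEGIN{for (i = 1; i < ARGC; i++) print i, ARGV[i]}' list /etc/fstab
# → 1 list
# → 2 /etc/fstab
```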

# NF is the number of fields; here NF=7, so $NF is $7, the last field
[root@localhost shell]# awk 'BEGIN{FS=":"}{print $NF}' /etc/passwd 
/bin/bash
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
/bin/sync
/sbin/shutdown
/sbin/halt
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
/bin/bash
/sbin/nologin
/bin/bash
/sbin/nologin
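Field numbers can also be computed from NF; for instance, $(NF-1) is the second-to-last field, as this minimal sketch shows:

```shell
# $(NF-1) evaluates NF-1 and uses it as a field number.
echo "a:b:c:d" | awk -F: '{print $(NF-1)}'
# → c
```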

 

Reproduced from: https://www.cnblogs.com/reblue520/p/10984717.html

Origin blog.csdn.net/weixin_34290096/article/details/93297368