How to use awk

awk

Usage
awk 'pattern {action}' {filenames}
Although awk operations can be complex, the syntax is always the same: pattern is what awk looks for in the data, and action is the series of commands executed when a match is found. Curly braces ({}) do not always have to appear in a program, but they are used to group a series of instructions that belong to a particular pattern. The pattern is the regular expression to match, enclosed in slashes.
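As a rough illustration of the syntax described above, the following sketch shows the three shapes a rule can take: a pattern alone, an action alone, and a pattern with an action. The sample input lines and the variable names are made up for this example.

```shell
# Hypothetical sample input: three lines of text.
input='apple 1
banana 2
cherry 3'

# Pattern only: matching lines are printed unchanged.
pattern_only=$(printf '%s\n' "$input" | awk '/an/')

# Action only: the action runs for every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $2 }')

# Pattern and action: the action runs only for matching lines.
both=$(printf '%s\n' "$input" | awk '/^c/ { print $1 }')

printf '%s\n' "$pattern_only" "$action_only" "$both"
```

Only the line containing "an" passes the first rule, every line contributes its second field to the second, and only the line starting with "c" reaches the third.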

The most basic function of the awk language is to scan files or strings according to specified rules and extract information; once awk has extracted the information, other text operations can be carried out on it. Complete awk scripts are typically used to format the information in text files.

Usually, awk processes a file one line at a time. awk receives each line of the file and then executes the corresponding commands to process the text.

1. Command line
awk [-F field-separator] 'commands' input-file(s)
Here, commands are the actual awk commands, and [-F field-separator] is optional. input-file(s) is the file or files to be processed.
In awk, each item on a line that is delimited by the field separator is called a field. When no -F field separator is given, the default field separator is whitespace.
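A small sketch of what the field separator changes, assuming a single made-up record in /etc/passwd style. NF is the built-in variable that holds the number of fields in the current record.

```shell
# Hypothetical sample record in /etc/passwd style.
line='root:x:0:0:root:/root:/bin/bash'

# Without -F the line contains no whitespace, so it is a single field.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# With -F ':' the same line splits into seven fields.
colon_nf=$(printf '%s\n' "$line" | awk -F ':' '{ print NF }')
shell_field=$(printf '%s\n' "$line" | awk -F ':' '{ print $7 }')

printf '%s\n' "$default_nf" "$colon_nf" "$shell_field"
```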

2. Shell script
Put all the awk commands into a file, make the file executable, and name the awk interpreter on the first line of the script; the script can then be invoked by typing its name.
In other words, the first line of a shell script: #!/bin/sh
can be replaced with: #!/bin/awk -f (the -f option makes awk read its program from the script file)

3. Put all the awk commands into a separate file, and then call:
awk -f awk-script-file input-file(s)
Here, the -f option loads the awk script from awk-script-file, and input-file(s) is the same as above.
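The third way of calling awk might look like the sketch below. The script body, the temporary file created by mktemp, and the sample input are all assumptions made for illustration.

```shell
# Write a small awk program to a temporary file (the name is chosen by mktemp).
script=$(mktemp)
cat > "$script" <<'EOF'
# Print the first colon-separated field of every record.
BEGIN { FS = ":" }
{ print $1 }
EOF

# Run the program from the file with -f; the input is made-up sample data.
out=$(printf 'root:x:0\ndaemon:x:1\n' | awk -f "$script")
rm -f "$script"

printf '%s\n' "$out"
```

Keeping the program in a file avoids shell quoting problems and lets the script carry its own comments.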

Assume the output of last -n 5 is as follows:
last -n 5    <== take only the first five lines
root     pts/1   192.168.1.100  Tue Feb 10 11:21   still logged in
root     pts/1   192.168.1.100  Tue Feb 10 00:46 - 02:28  (01:41)
root     pts/1   192.168.1.100  Mon Feb  9 11:41 - 18:30  (06:48)
dmtsai   pts/1   192.168.1.100  Mon Feb  9 11:41 - 11:41  (00:00)
root     tty1                   Fri Sep  5 14:09 - 14:10  (00:01)

To display only the accounts of the last five logins:
last -n 5 | awk '{print $1}'
root
root
root
dmtsai
root
The awk workflow is as follows: awk reads one record, delimited by the newline character '\n', then splits the record into fields at the specified field separator and fills in the field variables. $0 represents the whole record (all fields), $1 represents the first field, and $n represents the n-th field. The default field separator is a space or a tab, so $1 is the logged-in user, $3 is the user's login IP, and so on.
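To make the field variables concrete, here is a minimal sketch that runs awk over a single made-up record in the format of the last output and prints $0, $1, and $3.

```shell
# One made-up record in the format printed by "last".
record='root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in'

whole=$(printf '%s\n' "$record" | awk '{ print $0 }')   # entire record
user=$(printf '%s\n' "$record" | awk '{ print $1 }')    # first field
ip=$(printf '%s\n' "$record" | awk '{ print $3 }')      # third field

printf '%s\n' "$whole" "$user" "$ip"
```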
To display only the accounts in /etc/passwd:
cat /etc/passwd | awk -F ':' '{print $1}'
root
daemon
bin
sys

cat /etc/passwd | awk -F ':' '{print $1"\t"$7}'
root /bin/bash
daemon /bin/sh
bin /bin/sh
sys /bin/sh

To display only the accounts in /etc/passwd and their corresponding shells, separated by commas, with the column header "name,shell" on the first line and "blue,/bin/nosh" added as the last line:
cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'
name,shell
root,/bin/bash
daemon,/bin/sh
bin,/bin/sh
sys,/bin/sh
...
blue,/bin/nosh

The awk workflow here is as follows: first the BEGIN block is executed. Then the file is read: awk reads one record, delimited by the newline character \n, splits it into fields at the specified field separator, and fills in the field variables ($0 represents the whole record, $1 the first field, $n the n-th field); it then executes the action that corresponds to the pattern. Next it reads the second record, and so on until all records have been read. Finally the END block is executed.
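The order of execution can be observed with a sketch like the following, which uses two made-up input lines. NR is the built-in count of records read so far.

```shell
out=$(printf 'a\nb\n' | awk '
BEGIN { print "begin: before any input is read" }
      { print "record " NR ": " $0 }
END   { print "end: after " NR " records" }')

printf '%s\n' "$out"
```

The BEGIN line appears first, then one line per record, then the END line.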

Search /etc/passwd for all lines that contain the keyword root
awk -F: '/root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash

This is an example of using a pattern: the action is executed only for lines that match the pattern (here, root). No action is specified, so by default the whole line is printed.

The search supports regular expressions, for example to find lines that begin with root: awk -F: '/^root/' /etc/passwd
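The difference between an unanchored and an anchored pattern can be sketched as follows. The three sample records are invented, and the last command uses the ~ match operator to test a single field instead of the whole line.

```shell
# Made-up sample data in /etc/passwd format.
data='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# /root/ matches anywhere in the line, so the operator line (home /root) matches too.
anywhere=$(printf '%s\n' "$data" | awk -F: '/root/ { print $1 }')

# /^root/ matches only lines that start with root.
anchored=$(printf '%s\n' "$data" | awk -F: '/^root/ { print $1 }')

# $1 ~ /.../ restricts the match to the first field.
by_field=$(printf '%s\n' "$data" | awk -F: '$1 ~ /^root$/ { print $7 }')

printf '%s\n' "$anywhere" "$anchored" "$by_field"
```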

Search /etc/passwd for all lines that contain the keyword root, and display the corresponding shell
awk -F: '/root/{print $7}' /etc/passwd
/bin/bash

Here the action {print $7} is specified.

awk built-in variables
awk has many built-in variables that hold environment information, and these variables can be changed. The most commonly used ones are listed below.

ARGC      number of command-line arguments
ARGV      array of command-line arguments
ENVIRON   array of system environment variables
FILENAME  name of the file awk is currently reading
FNR       record number within the current file
FS        input field separator, equivalent to the command-line option -F
NF        number of fields in the current record
NR        number of records read so far
OFS       output field separator
ORS       output record separator
RS        input record separator

In addition, the variable $0 refers to the entire record. $1 represents the first field of the current line, $2 represents the second field of the current line, and so on.
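A short sketch of how some of these variables behave, assuming two small temporary files created only for this example: NR keeps counting across files while FNR restarts for each file, and OFS controls what print puts between comma-separated arguments.

```shell
# Two throwaway input files (names are arbitrary).
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\n'    > "$dir/two.txt"

# Columns printed: NR, FNR, NF, then the record itself.
counts=$(awk '{ print NR, FNR, NF, $0 }' "$dir/one.txt" "$dir/two.txt")
rm -rf "$dir"

# Setting OFS in BEGIN changes the separator that print inserts for each comma.
joined=$(printf 'x y\n' | awk 'BEGIN { OFS = "-" } { print $1, $2 }')

printf '%s\n' "$counts" "$joined"
```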

Print statistics for /etc/passwd: the file name, the line number of each line, the number of columns in each line, and the full content of the line:
awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:" $0}' /etc/passwd
filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash
filename:/etc/passwd,linenumber:2,columns:7,linecontent:daemon:x:1:1:daemon:/usr/sbin:/bin/sh
filename:/etc/passwd,linenumber:3,columns:7,linecontent:bin:x:2:2:bin:/bin:/bin/sh
filename:/etc/passwd,linenumber:4,columns:7,linecontent:sys:x:3:3:sys:/dev:/bin/sh

Using printf instead of print makes the code more concise and easier to read
awk -F ':' '{printf("filename:%10s,linenumber:%s,columns:%s,linecontent:%s\n",FILENAME,NR,NF,$0)}' /etc/passwd

print and printf
awk provides two functions for printing output: print and printf.

The arguments of print can be variables, numbers, or strings. Strings must be enclosed in double quotes, and arguments are separated by commas. Without the commas, the arguments are concatenated and cannot be told apart. The comma here stands for the output field separator, which is a space by default.

The printf function is used in much the same way as printf in C: it takes a format string. For complex output, printf is easier to use and makes the code easier to understand.
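The following sketch contrasts the two forms of print with printf on one made-up record. The format string and variable names are chosen for illustration only.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# Comma: arguments are joined with the output field separator (a space by default).
with_comma=$(printf '%s\n' "$line" | awk -F: '{ print $1, $7 }')

# No comma: the strings are concatenated with nothing between them.
no_comma=$(printf '%s\n' "$line" | awk -F: '{ print $1 $7 }')

# printf: %-8s left-justifies in 8 columns, and the newline must be written explicitly.
formatted=$(printf '%s\n' "$line" | awk -F: '{ printf("%-8s|%s\n", $1, $7) }')

printf '%s\n' "$with_comma" "$no_comma" "$formatted"
```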

awk programming
Variables and assignment

In addition to the built-in variables, awk also lets you define your own variables.

The following counts the number of accounts in /etc/passwd

awk '{count++;print $0;} END{print "user count is", count}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
...
user count is 40
count is a user-defined variable. The earlier action {} blocks contained only a single print, but print is just one statement; an action {} can contain multiple statements, separated by semicolons (;).

count is not initialized here. Although it defaults to 0, the proper approach is to initialize it to 0 explicitly:

awk 'BEGIN {count=0;print "[start]user count is", count} {count=count+1;print $0;} END{print "[end]user count is", count}' /etc/passwd
[start]user count is 0
root:x:0:0:root:/root:/bin/bash
...
[end]user count is 40
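The default value of an uninitialized variable can be checked with a sketch like this one; unset_var is a hypothetical name used only here.

```shell
# An unset awk variable acts as "" in string context and as 0 in arithmetic.
out=$(awk 'BEGIN {
    print "[" unset_var "]", unset_var + 0
    unset_var++
    print unset_var
}')

printf '%s\n' "$out"
```

Explicit initialization in BEGIN is still worthwhile because it documents the intent and avoids surprises when a variable name is reused.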

Count the number of bytes occupied by the files in a directory

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is", size}'
[end]size is 8657198

To display the size in M:

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is", size/1024/1024, "M"}'
[end]size is 8.25889 M
Note: the total does not include the contents of subdirectories.

Conditional statements

Conditional statements in awk are borrowed from the C language. They take the following forms:

if (expression) {
    statement;
    statement;
    ... ...
}

if (expression) {
    statement;
} else {
    statement2;
}

if (expression) {
    statement1;
} else if (expression1) {
    statement2;
} else {
    statement3;
}

Count the number of bytes occupied by the files in a directory, filtering out entries whose size is 4096 (usually directories):

ls -l | awk 'BEGIN {size=0;print "[start]size is", size} {if($5!=4096){size=size+$5;}} END{print "[end]size is", size/1024/1024, "M"}'
[end]size is 8.22339 M
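The if / else if / else form can be tried on its own with a sketch like the following, which classifies three made-up sizes instead of reading ls output. The labels are invented for the example.

```shell
# Made-up sizes in bytes, one per line.
out=$(printf '512\n4096\n2097152\n' | awk '{
    if ($1 == 4096) {
        label = "skipped (typical directory size)"
    } else if ($1 >= 1024 * 1024) {
        label = ($1 / 1024 / 1024) " M"
    } else {
        label = $1 " bytes"
    }
    print label
}')

printf '%s\n' "$out"
```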

Loop statements

Loops in awk are likewise borrowed from the C language: while, do/while, for, break, and continue are supported, and the semantics of these keywords are identical to those in C.
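A compact sketch of these constructs, run entirely inside a BEGIN block so that no input file is needed. The computations are arbitrary and exist only to exercise each keyword.

```shell
out=$(awk 'BEGIN {
    # for with continue: sum the odd numbers from 1 to 9.
    sum = 0
    for (i = 1; i <= 9; i++) {
        if (i % 2 == 0) continue
        sum += i
    }
    print "sum of odd numbers:", sum

    # while with break: find the first power of 2 above 100.
    n = 1
    while (1) {
        if (n > 100) break
        n *= 2
    }
    print "first power of 2 above 100:", n

    # do/while: the body always runs at least once.
    k = 0
    do { k++ } while (k < 0)
    print "do/while ran", k, "time"
}')

printf '%s\n' "$out"
```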

Arrays

Because array subscripts in awk can be numbers or strings, a subscript is usually called a key. Keys and values are stored internally in a hash table. Because a hash is not stored in order, you will find when displaying the contents of an array that they do not appear in the order you expect. Like variables, arrays are created automatically when they are used, and awk automatically determines whether they store numbers or strings. In general, arrays in awk are used to collect information from records: to compute sums, to count words, to track how many times a pattern has been matched, and so on.
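As a sketch of an array keyed by strings, the following counts how many accounts use each shell in three made-up /etc/passwd records. Because for (key in array) visits the keys in no particular order, the output is piped through sort to make it stable.

```shell
# Made-up sample data in /etc/passwd format.
data='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh'

# count[$7]++ creates the element on first use and increments it afterwards.
out=$(printf '%s\n' "$data" | awk -F: '
    { count[$7]++ }
END { for (shell in count) print shell, count[shell] }' | sort)

printf '%s\n' "$out"
```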

Display the accounts in /etc/passwd

awk -F ':' 'BEGIN {count=0;} {name[count] = $1;count++;}; END{for (i = 0; i < NR; i++) print i, name[i]}' /etc/passwd
0 root
1 daemon
2 bin
3 sys
4 sync
5 games
...
Here a for loop is used to iterate over the array.

There is a great deal more to awk programming; only simple, common usage is listed here. For more, see http://www.gnu.org/software/gawk/manual/gawk.html


Origin blog.csdn.net/weixin_40083227/article/details/103987945